Commit Graph

4119 Commits

Author SHA1 Message Date
Linus Torvalds
aafcd5d757 Merge branch 'x86-kaslr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 relocation changes from Ingo Molnar:
 "This tree contains a single change, ELF relocation handling in C - one
  of the kernel randomization patches that makes sense even without
  randomization present upstream"

* 'x86-kaslr-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, relocs: Move ELF relocation handling to C
2013-09-04 09:38:10 -07:00
Linus Torvalds
228abe73ad Merge branch 'x86-fb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fb changes from Ingo Molnar:
 "This tree includes preparatory patches for SimpleDRM driver support,
  by David Herrmann.  They clean up x86 framebuffer support by creating
  simplefb devices wherever possible.  More background can be found at

     http://lwn.net/Articles/558104/"

* 'x86-fb-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  fbdev: fbcon: select VT_HW_CONSOLE_BINDING
  fbdev: efifb: bind to efi-framebuffer
  fbdev: vesafb: bind to platform-framebuffer device
  fbdev: simplefb: add common x86 RGB formats
  x86: sysfb: move EFI quirks from efifb to sysfb
  x86: provide platform-devices for boot-framebuffers
  fbdev: simplefb: mark as fw and allocate apertures
  fbdev: simplefb: add init through platform_data
2013-09-04 09:12:17 -07:00
Linus Torvalds
1f9c52e16b Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu feature fixes from Ingo Molnar:
 "Two small cpufeature support updates"

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Fix override new_cpu_data.x86 with 486
  x86, cpufeature: Use new CC_HAVE_ASM_GOTO
2013-09-04 09:11:16 -07:00
Linus Torvalds
2a475501b8 Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/asmlinkage changes from Ingo Molnar:
 "As a preparation for Andi Kleen's LTO patchset (link time
  optimizations using GCC's -flto which build time optimization has
  steadily increased in quality over the past few years and might
  eventually be usable for the kernel too) this tree includes a handful
  of preparatory patches that make function calling convention
  annotations consistent again:

   - Mark every function without arguments (or 64bit only) that is used
     by assembly code with asmlinkage()

   - Mark every function with parameters or variables that is used by
     assembly code as __visible.

  For the vanilla kernel this has documentation, consistency and
  debuggability advantages, for the time being"

* 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asmlinkage: Fix warning in xen asmlinkage change
  x86, asmlinkage, vdso: Mark vdso variables __visible
  x86, asmlinkage, power: Make various symbols used by the suspend asm code visible
  x86, asmlinkage: Make dump_stack visible
  x86, asmlinkage: Make 64bit checksum functions visible
  x86, asmlinkage, paravirt: Add __visible/asmlinkage to xen paravirt ops
  x86, asmlinkage, apm: Make APM data structure used from assembler visible
  x86, asmlinkage: Make syscall tables visible
  x86, asmlinkage: Make several variables used from assembler/linker script visible
  x86, asmlinkage: Make kprobes code visible and fix assembler code
  x86, asmlinkage: Make various syscalls asmlinkage
  x86, asmlinkage: Make 32bit/64bit __switch_to visible
  x86, asmlinkage: Make _*_start_kernel visible
  x86, asmlinkage: Make all interrupt handlers asmlinkage / __visible
  x86, asmlinkage: Change dotraplinkage into __visible on 32bit
  x86: Fix sys_call_table type in asm/syscall.h
2013-09-04 08:42:44 -07:00
Linus Torvalds
3d7e5fc37f Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/asm changes from Ingo Molnar:
 "Main changes:

   - Apply low level mutex optimization on x86-64, by Wedson Almeida
     Filho.

   - Change bitops to be naturally 'long', by H Peter Anvin.

   - Add TSX-NI opcodes support to the x86 (instrumentation) decoder, by
     Masami Hiramatsu.

   - Add clang compatibility adjustments/workarounds, by Jan-Simon
     Möller"

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, doc: Update uaccess.h comment to reflect clang changes
  x86, asm: Fix a compilation issue with clang
  x86, asm: Extend definitions of _ASM_* with a raw format
  x86, insn: Add new opcodes as of June, 2013
  x86/ia32/asm: Remove unused argument in macro
  x86, bitops: Change bitops to be native operand size
  x86: Use asm-goto to implement mutex fast path on x86-64
2013-09-04 08:39:38 -07:00
Linus Torvalds
6924a4672d Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/apic changes from Ingo Molnar:
 "Smaller fixes"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ioapic: Check attr against the previous setting when programmed more than once
  x86/ioapic/kcrash: Prevent crash_kexec() from deadlocking on ioapic_lock
  x86/acpi: Fix incorrect sanity check in acpi_register_lapic()
2013-09-04 08:39:05 -07:00
Linus Torvalds
5e0b3a4e88 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes from Ingo Molnar:
 "Various optimizations, cleanups and smaller fixes - no major changes
  in scheduler behavior"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix the sd_parent_degenerate() code
  sched/fair: Rework and comment the group_imb code
  sched/fair: Optimize find_busiest_queue()
  sched/fair: Make group power more consistent
  sched/fair: Remove duplicate load_per_task computations
  sched/fair: Shrink sg_lb_stats and play memset games
  sched: Clean-up struct sd_lb_stat
  sched: Factor out code to should_we_balance()
  sched: Remove one division operation in find_busiest_queue()
  sched/cputime: Use this_cpu_add() in task_group_account_field()
  cpumask: Fix cpumask leak in partition_sched_domains()
  sched/x86: Optimize switch_mm() for multi-threaded workloads
  generic-ipi: Kill unnecessary variable - csd_flags
  numa: Mark __node_set() as __always_inline
  sched/fair: Cleanup: remove duplicate variable declaration
  sched/__wake_up_sync_key(): Fix nr_exclusive tasks which lead to WF_SYNC clearing
2013-09-04 08:36:35 -07:00
Linus Torvalds
0d99b70873 Merge branches 'perf-urgent-for-linus' and 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf changes from Ingo Molnar:
 "As a first remark I'd like to point out that the obsolete '-f'
  (--force) option, which has not done anything for several releases,
  has been removed from 'perf record' and related utilities.  Everyone
  please update muscle memory accordingly! :-)

  Main changes on the perf kernel side:

   - Performance optimizations:
        . for trace events, by Steve Rostedt.
        . for time values, by Peter Zijlstra

   - New hardware support:
        . for Intel Silvermont (22nm Atom) CPUs, by Zheng Yan
        . for Intel SNB-EP uncore PMUs, by Zheng Yan

   - Enhanced hardware support:
        . for Intel uncore PMUs: add filter support for QPI boxes, by Zheng Yan

   - Core perf events code enhancements and fixes:
        . for full-nohz feature handling, by Frederic Weisbecker
        . for group events, by Jiri Olsa
        . for call chains, by Frederic Weisbecker
        . for event stream parsing, by Adrian Hunter

   - New ABI details:
        . Add attr->mmap2 attribute, by Stephane Eranian
        . Add PERF_EVENT_IOC_ID ioctl to return event ID, by Jiri Olsa
        . Export u64 time_zero on the mmap header page to allow TSC
          calculation, by Adrian Hunter
        . Add dummy software event, by Adrian Hunter.
        . Add a new PERF_SAMPLE_IDENTIFIER to make samples always
          parseable, by Adrian Hunter.
        . Make Power7 events available via sysfs, by Runzhen Wang.

   - Code cleanups and refactorings:
        . for nohz-full, by Frederic Weisbecker
        . for group events, by Jiri Olsa

   - Documentation updates:
        . for perf_event_type, by Peter Zijlstra

  Main changes on the perf tooling side (some of these tooling changes
  utilize the above kernel side changes):

   - Lots of 'perf trace' enhancements:

        . Make 'perf trace' command line arguments consistent with
          'perf record', by David Ahern.

        . Allow specifying syscalls a la strace, by Arnaldo Carvalho de Melo.

        . Add --verbose and -o/--output options, by Arnaldo Carvalho de Melo.

        . Support ! in -e expressions, to filter a list of syscalls,
          by Arnaldo Carvalho de Melo.

        . Arg formatting improvements to allow masking arguments in
          syscalls such as futex and open, where the some arguments are
          ignored and thus should not be printed depending on other args,
          by Arnaldo Carvalho de Melo.

        . Beautify futex open, openat, open_by_handle_at, lseek and futex
          syscalls, by Arnaldo Carvalho de Melo.

        . Add option to analyze events in a file versus live, so that
          one can do:

           [root@zoo ~]# perf record -a -e raw_syscalls:* sleep 1
           [ perf record: Woken up 0 times to write data ]
           [ perf record: Captured and wrote 25.150 MB perf.data (~1098836 samples) ]
           [root@zoo ~]# perf trace -i perf.data -e futex --duration 1
              17.799 ( 1.020 ms): 7127 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, ua
             113.344 (95.429 ms): 7127 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, uaddr2: 0x7fff3f6c6648, val3: 4294967
             133.778 ( 1.042 ms): 18004 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, uaddr2: 0x7fff3f6c6648, val3: 429496
           [root@zoo ~]#

          By David Ahern.

        . Honor target pid / tid options when analyzing a file, by David Ahern.

        . Introduce better formatting of syscall arguments, including so
          far beautifiers for mmap, madvise, syscall return values,
          by Arnaldo Carvalho de Melo.

        . Handle HUGEPAGE defines in the mmap beautifier, by David Ahern.

   - 'perf report/top' enhancements:

        . Do annotation using /proc/kcore and /proc/kallsyms when
          available, removing the forced need for a vmlinux file kernel
          assembly annotation. This also improves this use case because
          vmlinux has just the initial kernel image, not what is actually
          in use after various code patchings by things like alternatives.
          By Adrian Hunter.

        . Add --ignore-callees=<regex> option to collapse undesired parts
          of call graphs, by Greg Price.

        . Simplify symbol filtering by doing it at machine class level,
          by Adrian Hunter.

        . Add support for callchains in the gtk UI, by Namhyung Kim.

        . Add --objdump option to 'perf top', by Sukadev Bhattiprolu.

   - 'perf kvm' enhancements:

        . Add option to print only events that exceed a specified time
          duration, by David Ahern.

        . Improve stack trace printing, by David Ahern.

        . Update documentation of the live command, by David Ahern

        . Add perf kvm stat live mode that combines aspects of 'perf kvm
          stat' record and report, by David Ahern.

        . Add option to analyze specific VM in perf kvm stat report, by
          David Ahern.

        . Do not require /lib/modules/* on a guest, by Jason Wessel.

   - 'perf script' enhancements:

        . Fix symbol offset computation for some dsos, by David Ahern.

        . Fix named threads support, by David Ahern.

        . Don't install scripting files files when perl/python support
          is disabled, by Arnaldo Carvalho de Melo.

   - 'perf test' enhancements:

        . Add various improvements and fixes to the "vmlinux matches
          kallsyms" 'perf test' entry, related to the /proc/kcore
          annotation feature. By Adrian Hunter.

        . Add sample parsing test, by Adrian Hunter.

        . Add test for reading object code, by Adrian Hunter.

        . Add attr record group sampling test, by Jiri Olsa.

        . Misc testing infrastructure improvements and other details,
          by Jiri Olsa.

   - 'perf list' enhancements:

        . Skip unsupported hardware events, by Namhyung Kim.

        . List pmu events, by Andi Kleen.

   - 'perf diff' enhancements:

        . Add support for more than two files comparison, by Jiri Olsa.

   - 'perf sched' enhancements:

        . Various improvements, including removing reliance on some
          scheduler tracepoints that provide the same information as the
          PERF_RECORD_{FORK,EXIT} events. By David Ahern.

        . Remove odd build stall by moving a large struct initialization
          from a local variable to a global one, by Namhyung Kim.

   - 'perf stat' enhancements:

        . Add --initial-delay option to skip measuring for a defined
          startup phase, by Andi Kleen.

   - Generic perf tooling infrastructure/plumbing changes:

        . Tidy up sample parsing validation, by Adrian Hunter.

        . Fix up jobserver setup in libtraceevent Makefile.
          by Arnaldo Carvalho de Melo.

        . Debug improvements, by Adrian Hunter.

        . Fix correlation of samples coming after PERF_RECORD_EXIT event,
          by David Ahern.

        . Improve robustness of the topology parsing code,
          by Stephane Eranian.

        . Add group leader sampling, that allows just one event in a group
          to sample while the other events have just its values read,
          by Jiri Olsa.

        . Add support for a new modifier "D", which requests that the
          event, or group of events, be pinned to the PMU.
          By Michael Ellerman.

        . Support callchain sorting based on addresses, by Andi Kleen

        . Prep work for multi perf data file storage, by Jiri Olsa.

        . libtraceevent cleanups, by Namhyung Kim.

  And lots and lots of other fixes and code reorganizations that did not
  make it into the list, see the shortlog, diffstat and the Git log for
  details!"

[ Also merge a leftover from the 3.11 cycle ]

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Prevent race in unthrottling code

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (237 commits)
  perf trace: Tell arg formatters the arg index
  perf trace: Add beautifier for open's flags arg
  perf trace: Add beautifier for lseek's whence arg
  perf tools: Fix symbol offset computation for some dsos
  perf list: Skip unsupported events
  perf tests: Add 'keep tracking' test
  perf tools: Add support for PERF_COUNT_SW_DUMMY
  perf: Add a dummy software event to keep tracking
  perf trace: Add beautifier for futex 'operation' parm
  perf trace: Allow syscall arg formatters to mask args
  perf: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node()
  perf: Export struct perf_branch_entry to userspace
  perf: Add attr->mmap2 attribute to an event
  perf/x86: Add Silvermont (22nm Atom) support
  perf/x86: use INTEL_UEVENT_EXTRA_REG to define MSR_OFFCORE_RSP_X
  perf trace: Handle missing HUGEPAGE defines
  perf trace: Honor target pid / tid options when analyzing a file
  perf trace: Add option to analyze events in a file versus live
  perf evlist: Add tracepoint lookup by name
  perf tests: Add a sample parsing test
  ...
2013-09-04 08:25:35 -07:00
Linus Torvalds
40031da445 ACPI and power management updates for 3.12-rc1
1) ACPI-based PCI hotplug (ACPIPHP) subsystem rework and introduction
     of Intel Thunderbolt support on systems that use ACPI for signalling
     Thunderbolt hotplug events.  This also should make ACPIPHP work in
     some cases in which it was known to have problems.  From
     Rafael J Wysocki, Mika Westerberg and Kirill A Shutemov.
 
  2) ACPI core code cleanups and dock station support cleanups from
     Jiang Liu and Rafael J Wysocki.
 
  3) Fixes for locking problems related to ACPI device hotplug from
     Rafael J Wysocki.
 
  4) ACPICA update to version 20130725 includig fixes, cleanups, support
     for more than 256 GPEs per GPE block and a change to make the ACPI
     PM Timer optional (we've seen systems without the PM Timer in the
     field already).  One of the fixes, related to the DeRefOf operator,
     is necessary to prevent some Windows 8 oriented AML from causing
     problems to happen.  From Bob Moore, Lv Zheng, and Jung-uk Kim.
 
  5) Removal of the old and long deprecated /proc/acpi/event interface
     and related driver changes from Thomas Renninger.
 
  6) ACPI and Xen changes to make the reduced hardware sleep work with
     the latter from Ben Guthro.
 
  7) ACPI video driver cleanups and a blacklist of systems that should
     not tell the BIOS that they are compatible with Windows 8 (or ACPI
     backlight and possibly other things will not work on them).  From
     Felipe Contreras.
 
  8) Assorted ACPI fixes and cleanups from Aaron Lu, Hanjun Guo,
     Kuppuswamy Sathyanarayanan, Lan Tianyu, Sachin Kamat, Tang Chen,
     Toshi Kani, and Wei Yongjun.
 
  9) cpufreq ondemand governor target frequency selection change to
     reduce oscillations between min and max frequencies (essentially,
     it causes the governor to choose target frequencies proportional
     to load) from Stratos Karafotis.
 
 10) cpufreq fixes allowing sysfs attributes file permissions to be
     preserved over suspend/resume cycles Srivatsa S Bhat.
 
 11) Removal of Device Tree parsing for CPU device nodes from multiple
     cpufreq drivers that required some changes related to
     of_get_cpu_node() to be made in a few architectures and in the
     driver core.  From Sudeep KarkadaNagesha.
 
 12) cpufreq core fixes and cleanups related to mutual exclusion and
     driver module references from Viresh Kumar, Lukasz Majewski and
     Rafael J Wysocki.
 
 13) Assorted cpufreq fixes and cleanups from Amit Daniel Kachhap,
     Bartlomiej Zolnierkiewicz, Hanjun Guo, Jingoo Han, Joseph Lo,
     Julia Lawall, Li Zhong, Mark Brown, Sascha Hauer, Stephen Boyd,
     Stratos Karafotis, and Viresh Kumar.
 
 14) Fixes to prevent race conditions in coupled cpuidle from happening
     from Colin Cross.
 
 15) cpuidle core fixes and cleanups from Daniel Lezcano and
     Tuukka Tikkanen.
 
 16) Assorted cpuidle fixes and cleanups from Daniel Lezcano,
     Geert Uytterhoeven, Jingoo Han, Julia Lawall, Linus Walleij,
     and Sahara.
 
 17) System sleep tracing changes from Todd E Brandt and Shuah Khan.
 
 18) PNP subsystem conversion to using struct dev_pm_ops for power
     management from Shuah Khan.
 
 /
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABCAAGBQJSJcKhAAoJEKhOf7ml8uNsplIQAJSOshxhkkemvFOuHZ+0YIbh
 R9aufjXeDkMDBi8YtU+tB7ERth1j+0LUSM0NTnP51U7e+7eSGobA9s5jSZQj2l7r
 HFtnSOegLuKAfqwgfSLK91xa1rTFdfW0Kych9G2nuHtBIt6P0Oc59Cb5M0oy6QXs
 nVtaDEuU//tmO71+EF5HnMJHabRTrpvtn/7NbDUpU7LZYpWJrHJFT9xt1rXNab7H
 YRCATPm3kXGRg58Doc3EZE4G3D7DLvq74jWMaI089X/m5Pg1G6upqArypOy6oxdP
 p2FEzYVrb2bi8fakXp7BBeO1gCJTAqIgAkbSSZHLpGhFaeEMmb9/DWPXdm2TjzMV
 c1EEucvsqZWoprXgy12i5Hk814xN8d8nBBLg/UYiRJ44nc/hevXfyE9ZYj6bkseJ
 +GNHmZIa1QYC05nnGli4+W4kHns8EZf/gmvIxnPuco1RN2yMWagrud5/G6Dr9M2B
 hzJV6qauLVzgZso4oe79zv9aVxe/dPHKANLD/sg23WBiJJbJF1ocBlnj2Xlbpqze
 pmMUWGiO/gUiS0fmpW/lAJauza5jFmSCjE4E8R0Gyn0j4YXjmMhdEanaU6J3VuCi
 yVgEzYEth4sowq4AflMMLKYN+WmozDnK7taZRGmT0t+EKRFKLT6EgnNrkQgs1vKl
 oawD9LM4fZ8E0yroOEme
 =CgqW
 -----END PGP SIGNATURE-----

Merge tag 'pm+acpi-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI and power management updates from Rafael Wysocki:

 1) ACPI-based PCI hotplug (ACPIPHP) subsystem rework and introduction
    of Intel Thunderbolt support on systems that use ACPI for signalling
    Thunderbolt hotplug events.  This also should make ACPIPHP work in
    some cases in which it was known to have problems.  From
    Rafael J Wysocki, Mika Westerberg and Kirill A Shutemov.

 2) ACPI core code cleanups and dock station support cleanups from
    Jiang Liu and Rafael J Wysocki.

 3) Fixes for locking problems related to ACPI device hotplug from
    Rafael J Wysocki.

 4) ACPICA update to version 20130725 includig fixes, cleanups, support
    for more than 256 GPEs per GPE block and a change to make the ACPI
    PM Timer optional (we've seen systems without the PM Timer in the
    field already).  One of the fixes, related to the DeRefOf operator,
    is necessary to prevent some Windows 8 oriented AML from causing
    problems to happen.  From Bob Moore, Lv Zheng, and Jung-uk Kim.

 5) Removal of the old and long deprecated /proc/acpi/event interface
    and related driver changes from Thomas Renninger.

 6) ACPI and Xen changes to make the reduced hardware sleep work with
    the latter from Ben Guthro.

 7) ACPI video driver cleanups and a blacklist of systems that should
    not tell the BIOS that they are compatible with Windows 8 (or ACPI
    backlight and possibly other things will not work on them).  From
    Felipe Contreras.

 8) Assorted ACPI fixes and cleanups from Aaron Lu, Hanjun Guo,
    Kuppuswamy Sathyanarayanan, Lan Tianyu, Sachin Kamat, Tang Chen,
    Toshi Kani, and Wei Yongjun.

 9) cpufreq ondemand governor target frequency selection change to
    reduce oscillations between min and max frequencies (essentially,
    it causes the governor to choose target frequencies proportional
    to load) from Stratos Karafotis.

10) cpufreq fixes allowing sysfs attributes file permissions to be
    preserved over suspend/resume cycles Srivatsa S Bhat.

11) Removal of Device Tree parsing for CPU device nodes from multiple
    cpufreq drivers that required some changes related to
    of_get_cpu_node() to be made in a few architectures and in the
    driver core.  From Sudeep KarkadaNagesha.

12) cpufreq core fixes and cleanups related to mutual exclusion and
    driver module references from Viresh Kumar, Lukasz Majewski and
    Rafael J Wysocki.

13) Assorted cpufreq fixes and cleanups from Amit Daniel Kachhap,
    Bartlomiej Zolnierkiewicz, Hanjun Guo, Jingoo Han, Joseph Lo,
    Julia Lawall, Li Zhong, Mark Brown, Sascha Hauer, Stephen Boyd,
    Stratos Karafotis, and Viresh Kumar.

14) Fixes to prevent race conditions in coupled cpuidle from happening
    from Colin Cross.

15) cpuidle core fixes and cleanups from Daniel Lezcano and
    Tuukka Tikkanen.

16) Assorted cpuidle fixes and cleanups from Daniel Lezcano,
    Geert Uytterhoeven, Jingoo Han, Julia Lawall, Linus Walleij,
    and Sahara.

17) System sleep tracing changes from Todd E Brandt and Shuah Khan.

18) PNP subsystem conversion to using struct dev_pm_ops for power
    management from Shuah Khan.

* tag 'pm+acpi-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (217 commits)
  cpufreq: Don't use smp_processor_id() in preemptible context
  cpuidle: coupled: fix race condition between pokes and safe state
  cpuidle: coupled: abort idle if pokes are pending
  cpuidle: coupled: disable interrupts after entering safe state
  ACPI / hotplug: Remove containers synchronously
  driver core / ACPI: Avoid device hot remove locking issues
  cpufreq: governor: Fix typos in comments
  cpufreq: governors: Remove duplicate check of target freq in supported range
  cpufreq: Fix timer/workqueue corruption due to double queueing
  ACPI / EC: Add ASUSTEK L4R to quirk list in order to validate ECDT
  ACPI / thermal: Add check of "_TZD" availability and evaluating result
  cpufreq: imx6q: Fix clock enable balance
  ACPI: blacklist win8 OSI for buggy laptops
  cpufreq: tegra: fix the wrong clock name
  cpuidle: Change struct menu_device field types
  cpuidle: Add a comment warning about possible overflow
  cpuidle: Fix variable domains in get_typical_interval()
  cpuidle: Fix menu_device->intervals type
  cpuidle: CodingStyle: Break up multiple assignments on single line
  cpuidle: Check called function parameter in get_typical_interval()
  ...
2013-09-03 15:59:39 -07:00
Linus Torvalds
542a086ac7 Driver core patches for 3.12-rc1
Here's the big driver core pull request for 3.12-rc1.
 
 Lots of tiny changes here fixing up the way sysfs attributes are
 created, to try to make drivers simpler, and fix a whole class race
 conditions with creations of device attributes after the device was
 announced to userspace.
 
 All the various pieces are acked by the different subsystem maintainers.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.21 (GNU/Linux)
 
 iEYEABECAAYFAlIlIPcACgkQMUfUDdst+ynUMwCaAnITsxyDXYQ4DqEsz8EcOtMk
 718AoLrgnUZs3B+70AT34DVktg4HSThk
 =USl9
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core patches from Greg KH:
 "Here's the big driver core pull request for 3.12-rc1.

  Lots of tiny changes here fixing up the way sysfs attributes are
  created, to try to make drivers simpler, and fix a whole class race
  conditions with creations of device attributes after the device was
  announced to userspace.

  All the various pieces are acked by the different subsystem
  maintainers"

* tag 'driver-core-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (119 commits)
  firmware loader: fix pending_fw_head list corruption
  drivers/base/memory.c: introduce help macro to_memory_block
  dynamic debug: line queries failing due to uninitialized local variable
  sysfs: sysfs_create_groups returns a value.
  debugfs: provide debugfs_create_x64() when disabled
  rbd: convert bus code to use bus_groups
  firmware: dcdbas: use binary attribute groups
  sysfs: add sysfs_create/remove_groups for when SYSFS is not enabled
  driver core: add #include <linux/sysfs.h> to core files.
  HID: convert bus code to use dev_groups
  Input: serio: convert bus code to use drv_groups
  Input: gameport: convert bus code to use drv_groups
  driver core: firmware: use __ATTR_RW()
  driver core: core: use DEVICE_ATTR_RO
  driver core: bus: use DRIVER_ATTR_WO()
  driver core: create write-only attribute macros for devices and drivers
  sysfs: create __ATTR_WO()
  driver-core: platform: convert bus code to use dev_groups
  workqueue: convert bus code to use dev_groups
  MEI: convert bus code to use dev_groups
  ...
2013-09-03 11:37:15 -07:00
Linus Torvalds
bc08b449ee lockref: implement lockless reference count updates using cmpxchg()
Instead of taking the spinlock, the lockless versions atomically check
that the lock is not taken, and do the reference count update using a
cmpxchg() loop.  This is semantically identical to doing the reference
count update protected by the lock, but avoids the "wait for lock"
contention that you get when accesses to the reference count are
contended.

Note that a "lockref" is absolutely _not_ equivalent to an atomic_t.
Even when the lockref reference counts are updated atomically with
cmpxchg, the fact that they also verify the state of the spinlock means
that the lockless updates can never happen while somebody else holds the
spinlock.

So while "lockref_put_or_lock()" looks a lot like just another name for
"atomic_dec_and_lock()", and both optimize to lockless updates, they are
fundamentally different: the decrement done by atomic_dec_and_lock() is
truly independent of any lock (as long as it doesn't decrement to zero),
so a locked region can still see the count change.

The lockref structure, in contrast, really is a *locked* reference
count.  If you hold the spinlock, the reference count will be stable and
you can modify the reference count without using atomics, because even
the lockless updates will see and respect the state of the lock.

In order to enable the cmpxchg lockless code, the architecture needs to
do three things:

 (1) Make sure that the "arch_spinlock_t" and an "unsigned int" can fit
     in an aligned u64, and have a "cmpxchg()" implementation that works
     on such a u64 data type.

 (2) define a helper function to test for a spinlock being unlocked
     ("arch_spin_value_unlocked()")

 (3) select the "ARCH_USE_CMPXCHG_LOCKREF" config variable in its
     Kconfig file.

This enables it for x86-64 (but not 32-bit, we'd need to make sure
cmpxchg() turns into the proper cmpxchg8b in order to enable it for
32-bit mode).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-09-02 12:12:15 -07:00
H. Peter Anvin
f69fa9a91f x86, doc: Update uaccess.h comment to reflect clang changes
Update comment in uaccess.h to reflect the changes for clang support:
gcc only cares about the base register (most architectures don't
encode the size of the operation in the operands like x86 does, and so
it is treated effectively like a register number), whereas clang tries
to enforce the size -- but not for register pairs.

Link: http://lkml.kernel.org/r/1377803585-5913-3-git-send-email-dl9pf@gmx.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jan-Simon Möller <dl9pf@gmx.de>
2013-08-29 13:34:50 -07:00
Jan-Simon Möller
bdfc017eea x86, asm: Fix a compilation issue with clang
Clang does not support the "shortcut" we're taking here for gcc (see below).
The patch uses the macro _ASM_DX to do the job.

From arch/x86/include/asm/uaccess.h:
/*
 * Careful: we have to cast the result to the type of the pointer
 * for sign reasons.
 *
 * The use of %edx as the register specifier is a bit of a
 * simplification, as gcc only cares about it as the starting point
 * and not size: for a 64-bit value it will use %ecx:%edx on 32 bits
 * (%ecx being the next register in gcc's x86 register sequence), and
 * %rdx on 64 bits.
 */

[ hpa: I consider this a compatibility bug in clang as this reflects a
  bit of a misunderstanding about how register strings are used by
  gcc, but the workaround is straightforward and there is no
  particular reason to not do it. ]

Signed-off-by: Jan-Simon Möller <dl9pf@gmx.de>
Link: http://lkml.kernel.org/r/1377803585-5913-3-git-send-email-dl9pf@gmx.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-29 13:26:33 -07:00
Jan-Simon Möller
3e9b2327b5 x86, asm: Extend definitions of _ASM_* with a raw format
The __ASM_* macros (e.g. __ASM_DX) are used to return the proper
register name (e.g. edx for 32bit / rdx for 64bit). We want to use
this also in arch/x86/include/asm/uaccess.h / get_user() .  For this
to work, we need a raw form as both gcc and clang choke on the
whitespace in a register asm() statement, and the __ASM_FORM macro
surrounds the argument with blanks.  A new macro, __ASM_FORM_RAW was
added and we change __ASM_REG to use the new RAW form.

Signed-off-by: Jan-Simon Möller <dl9pf@gmx.de>
Link: http://lkml.kernel.org/r/1377803585-5913-2-git-send-email-dl9pf@gmx.de
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-29 13:26:32 -07:00
Ingo Molnar
aee2bce3cf Merge branch 'linus' into perf/core
Pick up the latest upstream fixes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-08-29 12:02:08 +02:00
Rafael J. Wysocki
7a330a5416 Merge branch 'pm-cpufreq'
* pm-cpufreq: (60 commits)
  cpufreq: pmac32-cpufreq: remove device tree parsing for cpu nodes
  cpufreq: pmac64-cpufreq: remove device tree parsing for cpu nodes
  cpufreq: maple-cpufreq: remove device tree parsing for cpu nodes
  cpufreq: arm_big_little: remove device tree parsing for cpu nodes
  cpufreq: kirkwood-cpufreq: remove device tree parsing for cpu nodes
  cpufreq: spear-cpufreq: remove device tree parsing for cpu nodes
  cpufreq: highbank-cpufreq: remove device tree parsing for cpu nodes
  cpufreq: cpufreq-cpu0: remove device tree parsing for cpu nodes
  cpufreq: imx6q-cpufreq: remove device tree parsing for cpu nodes
  drivers/bus: arm-cci: avoid parsing DT for cpu device nodes
  ARM: mvebu: remove device tree parsing for cpu nodes
  ARM: topology: remove hwid/MPIDR dependency from cpu_capacity
  of/device: add helper to get cpu device node from logical cpu index
  driver/core: cpu: initialize of_node in cpu's device struture
  ARM: DT/kernel: define ARM specific arch_match_cpu_phys_id
  of: move of_get_cpu_node implementation to DT core library
  powerpc: refactor of_get_cpu_node to support other architectures
  openrisc: remove undefined of_get_cpu_node declaration
  microblaze: remove undefined of_get_cpu_node declaration
  cpufreq: fix bad unlock balance on !CONFIG_SMP
  ...
2013-08-27 01:44:40 +02:00
Rafael J. Wysocki
4eb5178c9c Merge back earlier 'pm-cpufreq' material. 2013-08-23 00:55:13 +02:00
Yoshihiro YUNOMAE
17405453f4 x86/ioapic/kcrash: Prevent crash_kexec() from deadlocking on ioapic_lock
Prevent crash_kexec() from deadlocking on ioapic_lock. When
crash_kexec() is executed on a CPU, the CPU will take ioapic_lock
in disable_IO_APIC(). So if the cpu gets an NMI while locking
ioapic_lock, a deadlock will happen.

In this patch, ioapic_lock is zapped/initialized before disable_IO_APIC().

You can reproduce this deadlock the following way:

1. Add mdelay(1000) after raw_spin_lock_irqsave() in
   native_ioapic_set_affinity()@arch/x86/kernel/apic/io_apic.c

   Although the deadlock can occur without this modification, it will increase
   the potential of the deadlock problem.

2. Build and install the kernel

3. Set up the OS which will run panic() and kexec when NMI is injected
    # echo "kernel.unknown_nmi_panic=1" >> /etc/sysctl.conf
    # vim /etc/default/grub
      add "nmi_watchdog=0 crashkernel=256M" in GRUB_CMDLINE_LINUX line
    # grub2-mkconfig

4. Reboot the OS

5. Run following command for each vcpu on the guest
    # while true; do echo <CPU num> > /proc/irq/<IO-APIC-edge or IO-APIC-fasteoi>/smp_affinitity; done;
   By running this command, cpus will get ioapic_lock for setting affinity.

6. Inject NMI (push a dump button or execute 'virsh inject-nmi <domain>' if you
   use VM). After injecting NMI, panic() is called in an nmi-handler context.
   Then, kexec will normally run in panic(), but the operation will be stopped
   by deadlock on ioapic_lock in crash_kexec()->machine_crash_shutdown()->
   native_machine_crash_shutdown()->disable_IO_APIC()->clear_IO_APIC()->
   clear_IO_APIC_pin()->ioapic_read_entry().

Signed-off-by: Yoshihiro YUNOMAE <yoshihiro.yunomae.ez@hitachi.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Link: http://lkml.kernel.org/r/20130820070107.28245.83806.stgit@yunodevel
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-08-20 09:26:33 +02:00
Linus Torvalds
7067552dfb Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Two AMD microcode loader fixes and an OLPC firmware support fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode, AMD: Fix early microcode loading
  x86, microcode, AMD: Make cpu_has_amd_erratum() use the correct struct cpuinfo_x86
  x86: Don't clear olpc_ofw_header when sentinel is detected
2013-08-19 09:18:29 -07:00
Linus Torvalds
f1d6e17f54 Merge branch 'akpm' (patches from Andrew Morton)
Merge a bunch of fixes from Andrew Morton.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  fs/proc/task_mmu.c: fix buffer overflow in add_page_map()
  arch: *: Kconfig: add "kernel/Kconfig.freezer" to "arch/*/Kconfig"
  ocfs2: fix null pointer dereference in ocfs2_dir_foreach_blk_id()
  x86 get_unmapped_area(): use proper mmap base for bottom-up direction
  ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page
  ocfs2: Revert 40bd62e to avoid regression in extended allocation
  drivers/rtc/rtc-stmp3xxx.c: provide timeout for potentially endless loop polling a HW bit
  hugetlb: fix lockdep splat caused by pmd sharing
  aoe: adjust ref of head for compound page tails
  microblaze: fix clone syscall
  mm: save soft-dirty bits on file pages
  mm: save soft-dirty bits on swapped pages
  memcg: don't initialize kmem-cache destroying work for root caches
2013-08-14 10:04:43 -07:00
Ingo Molnar
ccb1f55e71 Those are basically two fixes which correct the AMD early ucode loader
from accessing cpu_data too early, i.e. before smp_store_cpu_info()
 has copied the boot_cpu_data ontop and overwritten an already empty
 structure (which we shouldn't access that early in the first place
 anyway).
 
 The second patch is kinda largish for that late in the game but it
 shouldn't be problematic because we're simply switching from using
 cpu_data to use the CPU family number directly and thus again, not use
 uninitialized cpu_data structure.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJSCT0UAAoJEBLB8Bhh3lVKWNsQAJbu1y79mc2vHR15NqMGXVoP
 F6sUNC3iUnNjxlUxWn3PPynsJS9zFTBEo1S6ILnsfuMWliDoH+4psHLMmG5F20TV
 XJCSS9s3fdsaJZiNa7JXggdI80xU3y8uB/z3nacRDKgcdw3tX+ITP4SdPgNF5F3k
 B/XGAFxyXfl9M9SRFMIO4GS9UyFMsDQP0VWMRhcosY1/Fj5bc9Uds1ngP9E4XMe9
 jB1pIzbvtDvtolYSdShwFE9M0os90TYPHVSgriYYXHLZPryZC8mWkc7GwSTl4bG9
 EwBGXv79SJWics9vWM2n9EQbFt46fUAyhpjyOo/fqVp4L5XjKuV7UyeT7X+bsY0q
 b3z1jWzBxGZ5ltE2BmZ8tVnfKgwYJlKgAf8FbSrylhtBInP80Wau1aYTzw50h9D2
 lmjO/rSOhm2FszdeFHG/IeVbSZJbSFrUdwm51nx2xF1qG0MXldy9ehJ4pO3Ui2XA
 d0z2y83ZxOYV723fTCa1/5C5xAes7LjvxftM32G9QTu7R5XnvVrfehVfPh9K1DJA
 nIEH6pbM358/PT+/q7BCPTnH7KVi+mE633YERDOfAgz/NFnVKfKzLBM0+AyFNHjk
 a4xFrOqVbTctW1Chv8Nh1457A2+gcT8yeju1XjbLE8IMqKOGyMt0gnRZlgMYj8BH
 WYKWi2q0LaDwsOcaQ9bm
 =cxbY
 -----END PGP SIGNATURE-----

Merge tag 'amd_ucode_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp into x86/urgent

Pull AMD microcode fixes from Borislav Petkov:

 " Those are basically two fixes which correct the AMD early ucode loader
   from accessing cpu_data too early, i.e. before smp_store_cpu_info()
   has copied the boot_cpu_data ontop and overwritten an already empty
   structure (which we shouldn't access that early in the first place
   anyway).

   The second patch is kinda largish for that late in the game but it
   shouldn't be problematic because we're simply switching from using
   cpu_data to use the CPU family number directly and thus again, not use
   uninitialized cpu_data structure. "

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-08-14 12:16:28 +02:00
Cyrill Gorcunov
41bb3476b3 mm: save soft-dirty bits on file pages
Andy reported that if file page get reclaimed we lose the soft-dirty bit
if it was there, so save _PAGE_BIT_SOFT_DIRTY bit when page address get
encoded into pte entry.  Thus when #pf happens on such non-present pte
we can restore it back.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-13 17:57:48 -07:00
Cyrill Gorcunov
179ef71cbc mm: save soft-dirty bits on swapped pages
Andy Lutomirski reported that if a page with _PAGE_SOFT_DIRTY bit set
get swapped out, the bit is getting lost and no longer available when
pte read back.

To resolve this we introduce _PTE_SWP_SOFT_DIRTY bit which is saved in
pte entry for the page being swapped out.  When such page is to be read
back from a swap cache we check for bit presence and if it's there we
clear it and restore the former _PAGE_SOFT_DIRTY bit back.

One of the problem was to find a place in pte entry where we can save
the _PTE_SWP_SOFT_DIRTY bit while page is in swap.  The _PAGE_PSE was
chosen for that, it doesn't intersect with swap entry format stored in
pte.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-13 17:57:47 -07:00
Oleg Nesterov
e0acd0a68e sched: fix the theoretical signal_wake_up() vs schedule() race
This is only theoretical, but after try_to_wake_up(p) was changed
to check p->state under p->pi_lock the code like

	__set_current_state(TASK_INTERRUPTIBLE);
	schedule();

can miss a signal. This is the special case of wait-for-condition,
it relies on try_to_wake_up/schedule interaction and thus it does
not need mb() between __set_current_state() and if(signal_pending).

However, this __set_current_state() can move into the critical
section protected by rq->lock, now that try_to_wake_up() takes
another lock we need to ensure that it can't be reordered with
"if (signal_pending(current))" check inside that section.

The patch is actually one-liner, it simply adds smp_wmb() before
spin_lock_irq(rq->lock). This is what try_to_wake_up() already
does by the same reason.

We turn this wmb() into the new helper, smp_mb__before_spinlock(),
for better documentation and to allow the architectures to change
the default implementation.

While at it, kill smp_mb__after_lock(), it has no callers.

Perhaps we can also add smp_mb__before/after_spinunlock() for
prepare_to_wait().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-08-13 08:19:26 -07:00
Torsten Kaiser
84516098b5 x86, microcode, AMD: Fix early microcode loading
load_microcode_amd() (and the helper it is using) should not have an
cpu parameter. The microcode loading does not depend on the CPU wrt the
patches loaded since they will end up in a global list for all CPUs
anyway.

The change from cpu to x86family in load_microcode_amd()
now allows to drop the code messing with cpu_data(cpu) from
collect_cpu_info_amd_early(), which is wrong anyway because at that
point the per-cpu cpu_info is not yet setup (These values would later be
overwritten by smp_store_boot_cpu_info() / smp_store_cpu_info()).

Fold the rest of collect_cpu_info_amd_early() into load_ucode_amd_ap(),
because its only used at one place and without the cpuinfo_x86 accesses
it was not much left.

Signed-off-by: Torsten Kaiser <just.for.lkml@googlemail.com>
[ Fengguang: build fix ]
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
[ Boris: adapt it to current tree. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2013-08-12 18:32:45 +02:00
Daniel Drake
d55e37bb0f x86: Don't clear olpc_ofw_header when sentinel is detected
OpenFirmware wasn't quite following the protocol described in boot.txt
and the kernel has detected this through use of the sentinel value
in boot_params. OFW does zero out almost all of the stuff that it should
do, but not the sentinel.

This causes the kernel to clear olpc_ofw_header, which breaks x86 OLPC
support.

OpenFirmware has now been fixed. However, it would be nice if we could
maintain Linux compatibility with old firmware versions. To do that, we just
have to avoid zeroing out olpc_ofw_header.

OFW does not write to any other parts of the header that are being zapped
by the sentinel-detection code, and all users of olpc_ofw_header are
somewhat protected through checking for the OLPC_OFW_SIG magic value
before using it. So this should not cause any problems for anyone.

Signed-off-by: Daniel Drake <dsd@laptop.org>
Link: http://lkml.kernel.org/r/20130809221420.618E6FAB03@dev.laptop.org
Acked-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org> # v3.9+
2013-08-09 15:29:48 -07:00
Kees Cook
a021506107 x86, relocs: Move ELF relocation handling to C
Moves the relocation handling into C, after decompression. This requires
that the decompressed size is passed to the decompression routine as
well so that relocations can be found. Only kernels that need relocation
support will use the code (currently just x86_32), but this is laying
the ground work for 64-bit using it in support of KASLR.

Based on work by Neill Clift and Michael Davidson.

Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20130708161517.GA4832@www.outflux.net
Acked-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-07 21:00:04 -07:00
Andi Kleen
28596b6a87 x86, asmlinkage, vdso: Mark vdso variables __visible
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-17-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-06 14:21:08 -07:00
Andi Kleen
4a335c0695 x86, asmlinkage: Make 64bit checksum functions visible
They are implemented in assembler.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-14-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-06 14:20:59 -07:00
Andi Kleen
9a55fdbe94 x86, asmlinkage, paravirt: Add __visible/asmlinkage to xen paravirt ops
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-13-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-06 14:20:56 -07:00
Andi Kleen
277d5b40b7 x86, asmlinkage: Make several variables used from assembler/linker script visible
Plus one function, load_gs_index().

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-10-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-06 14:20:13 -07:00
Andi Kleen
04bb591ca7 x86, asmlinkage: Make kprobes code visible and fix assembler code
- Make all the external assembler template symbols __visible
- Move the templates inline assembler code into a top level
  assembler statement, not inside a function. This avoids it being
  optimized away or cloned.

Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-8-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-06 14:19:48 -07:00
Andi Kleen
ff49103fdb x86, asmlinkage: Make various syscalls asmlinkage
FWIW I suspect sys_rt_sigreturn/sys_sigreturn should use
standard SYSCALL wrappers.  But I didn't do that change in this
patch.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-7-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-06 14:18:33 -07:00
Andi Kleen
35ea7903b8 x86, asmlinkage: Make 32bit/64bit __switch_to visible
This function is called from inline assembler, so has to be visible.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-6-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-06 14:18:30 -07:00
Andi Kleen
a1ed4ddfb7 x86, asmlinkage: Make _*_start_kernel visible
Obviously these functions have to be visible, otherwise
the whole kernel could be optimized away.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-5-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-06 14:18:26 -07:00
Andi Kleen
1d9090e2fb x86, asmlinkage: Make all interrupt handlers asmlinkage / __visible
These handlers are all referenced from assembler stubs, so need
to be visible.

The handlers without arguments become asmlinkage, the others __visible
to not force regparms(0) on x86-32.

I put it all into a single patch, please let me know if you want
it it split up.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-4-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-06 14:18:23 -07:00
Andi Kleen
9e1a431de0 x86, asmlinkage: Change dotraplinkage into __visible on 32bit
Mark 32bit dotraplinkage functions as __visible for LTO.
64bit already is using asmlinkage which includes it.

v2: Clean up (M.Marek)

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-3-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-06 14:18:17 -07:00
Andi Kleen
1599e8fc84 x86: Fix sys_call_table type in asm/syscall.h
Make the sys_call_table type defined in asm/syscall.h match
the definition in syscall_64.c

v2: include asm/syscall.h in syscall_64.c too. I left uml alone
because it doesn't have an syscall.h on its own and including
the native one leads to other errors.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1375740170-7446-2-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Richard Weinberger <richard@nod.at>
2013-08-06 14:18:08 -07:00
David Herrmann
2995e50627 x86: sysfb: move EFI quirks from efifb to sysfb
The EFI FB quirks from efifb.c are useful for simple-framebuffer devices
as well. Apply them by default so we can convert efifb.c to use
efi-framebuffer platform devices.

Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Link: http://lkml.kernel.org/r/1375445127-15480-5-git-send-email-dh.herrmann@gmail.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-02 16:17:47 -07:00
David Herrmann
e3263ab389 x86: provide platform-devices for boot-framebuffers
The current situation regarding boot-framebuffers (VGA, VESA/VBE, EFI) on
x86 causes troubles when loading multiple fbdev drivers. The global
"struct screen_info" does not provide any state-tracking about which
drivers use the FBs. request_mem_region() theoretically works, but
unfortunately vesafb/efifb ignore it due to quirks for broken boards.

Avoid this by creating a platform framebuffer devices with a pointer
to the "struct screen_info" as platform-data. Drivers can now create
platform-drivers and the driver-core will refuse multiple drivers being
active simultaneously.

We keep the screen_info available for backwards-compatibility. Drivers
can be converted in follow-up patches.

Different devices are created for VGA/VESA/EFI FBs to allow multiple
drivers to be loaded on distro kernels. We create:
 - "vesa-framebuffer" for VBE/VESA graphics FBs
 - "efi-framebuffer" for EFI FBs
 - "platform-framebuffer" for everything else
This allows to load vesafb, efifb and others simultaneously and each
picks up only the supported FB types.

Apart from platform-framebuffer devices, this also introduces a
compatibility option for "simple-framebuffer" drivers which recently got
introduced for OF based systems. If CONFIG_X86_SYSFB is selected, we
try to match the screen_info against a simple-framebuffer supported
format. If we succeed, we create a "simple-framebuffer" device instead
of a platform-framebuffer.
This allows to reuse the simplefb.c driver across architectures and also
to introduce a SimpleDRM driver. There is no need to have vesafb.c,
efifb.c, simplefb.c and more just to have architecture specific quirks
in their setup-routines.

Instead, we now move the architecture specific quirks into x86-setup and
provide a generic simple-framebuffer. For backwards-compatibility (if
strange formats are used), we still allow vesafb/efifb to be loaded
simultaneously and pick up all remaining devices.

Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Link: http://lkml.kernel.org/r/1375445127-15480-4-git-send-email-dh.herrmann@gmail.com
Tested-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-08-02 16:17:46 -07:00
Rik van Riel
8f898fbbe5 sched/x86: Optimize switch_mm() for multi-threaded workloads
Dick Fowles, Don Zickus and Joe Mario have been working on
improvements to perf, and noticed heavy cache line contention
on the mm_cpumask, running linpack on a 60 core / 120 thread
system.

The cause turned out to be unnecessary atomic accesses to the
mm_cpumask. When in lazy TLB mode, the CPU is only removed from
the mm_cpumask if there is a TLB flush event.

Most of the time, no such TLB flush happens, and the kernel
skips the TLB reload. It can also skip the atomic memory
set & test.

Here is a summary of Joe's test results:

 * The __schedule function dropped from 24% of all program cycles down
   to 5.5%.

 * The cacheline contention/hotness for accesses to that bitmask went
   from being the 1st/2nd hottest - down to the 84th hottest (0.3% of
   all shared misses which is now quite cold)

 * The average load latency for the bit-test-n-set instruction in
   __schedule dropped from 10k-15k cycles down to an average of 600 cycles.

 * The linpack program results improved from 133 GFlops to 144 GFlops.
   Peak GFlops rose from 133 to 153.

Reported-by: Don Zickus <dzickus@redhat.com>
Reported-by: Joe Mario <jmario@redhat.com>
Tested-by: Joe Mario <jmario@redhat.com>
Signed-off-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Paul Turner <pjt@google.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20130731221421.616d3d20@annuminas.surriel.com
[ Made the comments consistent around the modified code. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-08-01 09:10:26 +02:00
Hanjun Guo
76f411fb3a x86 / cpu topology: remove the stale macro arch_provides_topology_pointers
Macro arch_provides_topology_pointers is pointless now, remove it.

Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-07-29 13:12:45 -07:00
Stratos Karafotis
61c63e5ed3 cpufreq: Remove unused APERF/MPERF support
The target frequency calculation method in the ondemand governor has
changed and it is now independent of the measured average frequency.
Consequently, the APERF/MPERF support in cpufreq is not used any
more, so drop it.

[rjw: Changelog]
Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-07-26 01:06:43 +02:00
Adrian Hunter
c73deb6aec perf/x86: Add ability to calculate TSC from perf sample timestamps
For modern CPUs, perf clock is directly related to TSC.  TSC
can be calculated from perf clock and vice versa using a simple
calculation.  Two of the three componenets of that calculation
are already exported in struct perf_event_mmap_page.  This patch
exports the third.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: http://lkml.kernel.org/r/1372425741-1676-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-07-23 12:17:45 +02:00
Jiri Kosina
17f41571bb kprobes/x86: Call out into INT3 handler directly instead of using notifier
In fd4363fff3 ("x86: Introduce int3 (breakpoint)-based
instruction patching"), the mechanism that was introduced for
notifying alternatives code from int3 exception handler that and
exception occured was die_notifier.

This is however problematic, as early code might be using jump
labels even before the notifier registration has been performed,
which will then lead to an oops due to unhandled exception. One
of such occurences has been encountered by Fengguang:

 int3: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc1-01429-g04bf576 #8
 task: ffff88000da1b040 ti: ffff88000da1c000 task.ti: ffff88000da1c000
 RIP: 0010:[<ffffffff811098cc>]  [<ffffffff811098cc>] ttwu_do_wakeup+0x28/0x225
 RSP: 0000:ffff88000dd03f10  EFLAGS: 00000006
 RAX: 0000000000000000 RBX: ffff88000dd12940 RCX: ffffffff81769c40
 RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000001
 RBP: ffff88000dd03f28 R08: ffffffff8176a8c0 R09: 0000000000000002
 R10: ffffffff810ff484 R11: ffff88000dd129e8 R12: ffff88000dbc90c0
 R13: ffff88000dbc90c0 R14: ffff88000da1dfd8 R15: ffff88000da1dfd8
 FS:  0000000000000000(0000) GS:ffff88000dd00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 00000000ffffffff CR3: 0000000001c88000 CR4: 00000000000006e0
 Stack:
  ffff88000dd12940 ffff88000dbc90c0 ffff88000da1dfd8 ffff88000dd03f48
  ffffffff81109e2b ffff88000dd12940 0000000000000000 ffff88000dd03f68
  ffffffff81109e9e 0000000000000000 0000000000012940 ffff88000dd03f98
 Call Trace:
  <IRQ>
  [<ffffffff81109e2b>] ttwu_do_activate.constprop.56+0x6d/0x79
  [<ffffffff81109e9e>] sched_ttwu_pending+0x67/0x84
  [<ffffffff8110c845>] scheduler_ipi+0x15a/0x2b0
  [<ffffffff8104dfb4>] smp_reschedule_interrupt+0x38/0x41
  [<ffffffff8173bf5d>] reschedule_interrupt+0x6d/0x80
  <EOI>
  [<ffffffff810ff484>] ? __atomic_notifier_call_chain+0x5/0xc1
  [<ffffffff8105cc30>] ? native_safe_halt+0xd/0x16
  [<ffffffff81015f10>] default_idle+0x147/0x282
  [<ffffffff81017026>] arch_cpu_idle+0x3d/0x5d
  [<ffffffff81127d6a>] cpu_idle_loop+0x46d/0x5db
  [<ffffffff81127f5c>] cpu_startup_entry+0x84/0x84
  [<ffffffff8104f4f8>] start_secondary+0x3c8/0x3d5
  [...]

Fix this by directly calling poke_int3_handler() from the int3
exception handler (analogically to what ftrace has been doing
already), instead of relying on notifier, registration of which
might not have yet been finalized by the time of the first trap.

Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1307231007490.14024@pobox.suse.cz
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-07-23 10:12:57 +02:00
Masami Hiramatsu
ea8596bb2d kprobes/x86: Remove unused text_poke_smp() and text_poke_smp_batch() functions
Since introducing the text_poke_bp() for all text_poke_smp*()
callers, text_poke_smp*() are now unused. This patch basically
reverts:

  3d55cc8a05 ("x86: Add text_poke_smp for SMP cross modifying code")
  7deb18dcf0 ("x86: Introduce text_poke_smp_batch() for batch-code modifying")

and related commits.

This patch also fixes a Kconfig dependency issue on STOP_MACHINE
in the case of CONFIG_SMP && !CONFIG_MODULE_UNLOAD.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: Borislav Petkov <bpetkov@suse.de>
Link: http://lkml.kernel.org/r/20130718114753.26675.18714.stgit@mhiramat-M0-7522
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-07-19 09:57:04 +02:00
Ingo Molnar
9bb15425c3 Merge branch 'x86/jumplabel' into perf/core
Upcoming kprobes patches rely on the int3 code-patching machinery introduced by:

   fd4363fff3 x86: Introduce int3 (breakpoint)-based instruction patching

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-07-19 09:55:00 +02:00
Jiri Kosina
fd4363fff3 x86: Introduce int3 (breakpoint)-based instruction patching
Introduce a method for run-time instruction patching on a live SMP kernel
based on int3 breakpoint, completely avoiding the need for stop_machine().

The way this is achieved:

	- add a int3 trap to the address that will be patched
	- sync cores
	- update all but the first byte of the patched range
	- sync cores
	- replace the first byte (int3) by the first byte of
	  replacing opcode
	- sync cores

According to

	http://lkml.indiana.edu/hypermail/linux/kernel/1001.1/01530.html

synchronization after replacing "all but first" instructions should not
be necessary (on Intel hardware), as the syncing after the subsequent
patching of the first byte provides enough safety.
But there's not only Intel HW out there, and we'd rather be on a safe
side.

If any CPU instruction execution would collide with the patching,
it'd be trapped by the int3 breakpoint and redirected to the provided
"handler" (which would typically mean just skipping over the patched
region, acting as "nop" has been there, in case we are doing nop -> jump
and jump -> nop transitions).

Ftrace has been using this very technique since 08d636b ("ftrace/x86:
Have arch x86_64 use breakpoints instead of stop machine") for ages
already, and jump labels are another obvious potential user of this.

Based on activities of Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
a few years ago.

Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1307121102440.29788@pobox.suse.cz
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-07-16 17:55:29 -07:00
H. Peter Anvin
9b710506a0 x86, bitops: Change bitops to be native operand size
Change the bitops operation to be naturally "long", i.e. 63 bits on
the 64-bit kernel.  Additional bugs are likely to crop up in the
future.

We already have bugs which machines with > 16 TiB of memory in a
single node, as can happen if memory is interleaved.  The x86 bitop
operations take a signed index, so using an unsigned type is not an
option.

Jim Kukunas measured the effect of this patch on kernel size: it adds
2779 bytes to the allyesconfig kernel.  Some of that probably could be
elided by replacing the inline functions with macros which select the
32-bit type if the index is a 32-bit value, something like:

In that case we could also use "Jr" constraints for the 64-bit
version.

However, this would more than double the amount of code for a
relatively small gain.

Note that we can't use ilog2() for _BITOPS_LONG_SHIFT, as that causes
a recursive header inclusion problem.

The change to constant_test_bit() should both generate better code and
give correct result for negative bit indicies.  As previously written
the compiler had to generate extra code to create the proper wrong
result for negative values.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jim Kukunas <james.t.kukunas@intel.com>
Link: http://lkml.kernel.org/n/tip-z61ofiwe90xeyb461o72h8ya@git.kernel.org
2013-07-16 15:24:04 -07:00
Paul Gortmaker
148f9bb877 x86: delete __cpuinit usage from all x86 files
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
are flagged as __cpuinit  -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings.  In any case, they are temporary and harmless.

This removes all the arch/x86 uses of the __cpuinit macros from
all C files.  x86 only had the one __CPUINIT used in assembly files,
and it wasn't paired off with a .previous or a __FINIT, so we can
delete it directly w/o any corresponding additional change there.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2013-07-14 19:36:56 -04:00