Commit Graph

18483 Commits

Author SHA1 Message Date
Linus Torvalds
2bb2c5e235 Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 microcode loader updates from Ingo Molnar:
 "There are two main changes in this tree:

   - AMD microcode early loading fixes
   - some microcode loader source files reorganization"

* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, microcode: Move to a proper location
  x86, microcode, AMD: Fix early ucode loading
  x86, microcode: Share native MSR accessing variants
  x86, ramdisk: Export relocated ramdisk VA
2014-01-20 12:07:54 -08:00
Linus Torvalds
4500cf60db Merge branch 'x86-intel-mid-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull Intel MID updates from Ingo Molnar:
 "This tree improves Intel MID (Mobile Internet Device) platform
  support:

   - Merrifield platform support (David Cohen)
   - Clovertrail platform support (Kuppuswamy Sathyanarayanan)
   - Various cleanups and fixes (David Cohen)"

* 'x86-intel-mid-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, intel_mid: Replace memcpy with struct assignment
  x86, intel-mid: Return proper error code from get_gpio_by_name()
  x86, intel-mid: Check get_gpio_by_name() error code on platform code
  x86, intel-mid: sfi_handle_*_dev() should check for pdata error code
  x86, intel-mid: Remove deprecated X86_MDFLD and X86_WANT_INTEL_MID configs
  x86, intel-mid: Add Merrifield platform support
  x86, intel-mid: Add Clovertrail platform support
  x86, intel-mid: Move Medfield code out of intel-mid.c core file
2014-01-20 12:06:50 -08:00
Linus Torvalds
972d5e7e5b Merge branch 'x86-efi-kexec-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 EFI changes from Ingo Molnar:
 "This consists of two main parts:

   - New static EFI runtime services virtual mapping layout which is
     groundwork for kexec support on EFI (Borislav Petkov)

   - EFI kexec support itself (Dave Young)"

* 'x86-efi-kexec-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/efi: parse_efi_setup() build fix
  x86: ksysfs.c build fix
  x86/efi: Delete superfluous global variables
  x86: Reserve setup_data ranges late after parsing memmap cmdline
  x86: Export x86 boot_params to sysfs
  x86: Add xloadflags bit for EFI runtime support on kexec
  x86/efi: Pass necessary EFI data for kexec via setup_data
  efi: Export EFI runtime memory mapping to sysfs
  efi: Export more EFI table variables to sysfs
  x86/efi: Cleanup efi_enter_virtual_mode() function
  x86/efi: Fix off-by-one bug in EFI Boot Services reservation
  x86/efi: Add a wrapper function efi_map_region_fixed()
  x86/efi: Remove unused variables in __map_region()
  x86/efi: Check krealloc return value
  x86/efi: Runtime services virtual mapping
  x86/mm/cpa: Map in an arbitrary pgd
  x86/mm/pageattr: Add last levels of error path
  x86/mm/pageattr: Add a PUD error unwinding path
  x86/mm/pageattr: Add a PTE pagetable populating function
  x86/mm/pageattr: Add a PMD pagetable populating function
  ...
2014-01-20 12:05:30 -08:00
Linus Torvalds
5d4863e4cc Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 TLB detection update from Ingo Molnar:
 "A single change that extends our TLB cache size detection+reporting
  code"

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, cpu: Detect more TLB configuration
2014-01-20 12:04:45 -08:00
Linus Torvalds
2a0fede97f Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
 "Misc cleanups"

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, cpu, amd: Fix a shadowed variable situation
  um, x86: Fix vDSO build
  x86: Delete non-required instances of include <linux/init.h>
  x86, realmode: Pointer walk cleanups, pull out invariant use of __pa()
  x86/traps: Clean up error exception handler definitions
2014-01-20 12:03:57 -08:00
Linus Torvalds
06bc0f4a2e Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/build changes from Ingo Molnar:
 "Misc smaller improvements"

* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, boot: Move intcall() to the .inittext section
  x86, boot: Use .code16 instead of .code16gcc
  x86, sparse: Do not force removal of __user when calling copy_to/from_user_nocheck()
2014-01-20 12:03:21 -08:00
Linus Torvalds
4cd4156994 Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/asm changes from Ingo Molnar:
 "Misc optimizations"

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Slightly tweak the access_ok() C variant for better code
  x86: Replace assembly access_ok() with a C variant
  x86-64, copy_user: Use leal to produce 32-bit results
  x86-64, copy_user: Remove zero byte check before copy user buffer.
2014-01-20 11:51:00 -08:00
Linus Torvalds
1a7dbbcc8c Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/apic changes from Ingo Molnar:
 "Two main changes:

   - improve local APIC Error Status Register reporting robustness

   - add the 'disable_cpu_apicid=x' boot parameter for kexec booting"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, apic: Make disabled_cpu_apicid static read_mostly, fix typos
  x86, apic, kexec: Add disable_cpu_apicid kernel parameter
  x86/apic: Read Error Status Register correctly
2014-01-20 11:50:01 -08:00
Linus Torvalds
a0fa1dd3cd Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler changes from Ingo Molnar:

 - Add the initial implementation of SCHED_DEADLINE support: a real-time
   scheduling policy where tasks that meet their deadlines and
   periodically execute their instances in less than their runtime quota
   see real-time scheduling and won't miss any of their deadlines.
   Tasks that go over their quota get delayed (Available to privileged
   users for now)

 - Clean up and fix preempt_enable_no_resched() abuse all around the
   tree

 - Do sched_clock() performance optimizations on x86 and elsewhere

 - Fix and improve auto-NUMA balancing

 - Fix and clean up the idle loop

 - Apply various cleanups and fixes

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  sched: Fix __sched_setscheduler() nice test
  sched: Move SCHED_RESET_ON_FORK into attr::sched_flags
  sched: Fix up attr::sched_priority warning
  sched: Fix up scheduler syscall LTP fails
  sched: Preserve the nice level over sched_setscheduler() and sched_setparam() calls
  sched/core: Fix htmldocs warnings
  sched/deadline: No need to check p if dl_se is valid
  sched/deadline: Remove unused variables
  sched/deadline: Fix sparse static warnings
  m68k: Fix build warning in mac_via.h
  sched, thermal: Clean up preempt_enable_no_resched() abuse
  sched, net: Fixup busy_loop_us_clock()
  sched, net: Clean up preempt_enable_no_resched() abuse
  sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding
  sched/preempt, locking: Rework local_bh_{dis,en}able()
  sched/clock, x86: Avoid a runtime condition in native_sched_clock()
  sched/clock: Fix up clear_sched_clock_stable()
  sched/clock, x86: Use a static_key for sched_clock_stable
  sched/clock: Remove local_irq_disable() from the clocks
  sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs
  ...
2014-01-20 10:42:08 -08:00
Linus Torvalds
9326657abe Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Kernel side changes:

   - Add Intel RAPL energy counter support (Stephane Eranian)
   - Clean up uprobes (Oleg Nesterov)
   - Optimize ring-buffer writes (Peter Zijlstra)

  Tooling side changes, user visible:

   - 'perf diff':
     - Add column colouring improvements (Ramkumar Ramachandra)

  - 'perf kvm':
     - Add guest related improvements, including allowing to specify a
       directory with guest specific /proc information (Dongsheng Yang)
     - Add shell completion support (Ramkumar Ramachandra)
     - Add '-v' option (Dongsheng Yang)
     - Support --guestmount (Dongsheng Yang)

   - 'perf probe':
     - Support showing source code, asking for variables to be collected
       at probe time and other 'perf probe' operations that use DWARF
       information.

       This supports only binaries with debugging information at this
       time, detached debuginfo (aka debuginfo packages) support should
       come in later patches (Masami Hiramatsu)

   - 'perf record':
     - Rename --no-delay option to --no-buffering, better reflecting its
       purpose and freeing up '--delay' to take the place of
       '--initial-delay', so that 'record' and 'stat' are consistent
       (Arnaldo Carvalho de Melo)
     - Default the -t/--thread option to no inheritance (Adrian Hunter)
     - Make per-cpu mmaps the default (Adrian Hunter)

   - 'perf report':
     - Improve callchain processing performance (Frederic Weisbecker)
     - Retain bfd reference to lookup source line numbers, greatly
       optimizing, among other use cases, 'perf report -s srcline'
       (Adrian Hunter)
     - Improve callchain processing performance even more (Namhyung Kim)
     - Add a perf.data file header window in the 'perf report' TUI,
       associated with the 'i' hotkey, providing a counterpart to the
       --header option in the stdio UI (Namhyung Kim)

   - 'perf script':
     - Add an option in 'perf script' to print the source line number
       (Adrian Hunter)
     - Add --header/--header-only options to 'script' and 'report', the
       default is not tho show the header info, but as this has been the
       default for some time, leave a single line explaining how to
       obtain that information (Jiri Olsa)
     - Add options to show comm, fork, exit and mmap PERF_RECORD_ events
       (Namhyung Kim)
     - Print callchains and symbols if they exist (David Ahern)

   - 'perf timechart'
     - Add backtrace support to CPU info
     - Print pid along the name
     - Add support for CPU topology
     - Add new option --highlight'ing threads, be it by name or, if a
       numeric value is provided, that run more than given duration
       (Stanislav Fomichev)

   - 'perf top':
     - Make 'perf top -g' refer to callchains, for consistency with
       other tools (David Ahern)

   - 'perf trace':
     - Handle old kernels where the "raw_syscalls" tracepoints were
       called plain "syscalls" (David Ahern)
     - Remove thread summary coloring, by Pekka Enberg.
     - Honour -m option in 'trace', the tool was offering the option to
       set the mmap size, but wasn't using it when doing the actual mmap
       on the events file descriptors (Jiri Olsa)

   - generic:
     - Backport libtraceevent plugin support (trace-cmd repository, with
       plugins for jbd2, hrtimer, kmem, kvm, mac80211, sched_switch,
       function, xen, scsi, cfg80211 (Jiri Olsa)
     - Print session information only if --stdio is given (Namhyung Kim)

  Tooling side changes, developer visible (plumbing):

   - Improve 'perf probe' exit path, release resources (Masami
     Hiramatsu)
   - Improve libtraceevent plugins exit path, allowing the registering
     of an unregister handler to be called at exit time (Namhyung Kim)
   - Add an alias to the build test makefile (make -C tools/perf
     build-test) (Namhyung Kim)
   - Get rid of die() and friends (good riddance!) in libtraceevent
     (Namhyung Kim)
   - Fix cross build problems related to pkgconfig and CROSS_COMPILE not
     being propagated to the feature tests, leading to features being
     tested in the host and then being enabled on the target (Mark
     Rutland)
   - Improve forked workload error reporting by sending the errno in the
     signal data queueing integer field, using sigqueue and by doing the
     signal setup in the evlist methods, removing open coded equivalents
     in various tools (Arnaldo Carvalho de Melo)
   - Do more auto exit cleanup chores in the 'evlist' destructor, so
     that the tools don't have to all do that sequence (Arnaldo Carvalho
     de Melo)
   - Pack 'struct perf_session_env' and 'struct trace' (Arnaldo Carvalho
     de Melo)
   - Add test for building detached source tarballs (Arnaldo Carvalho de
     Melo)
   - Move some header files (tools/perf/ to tools/include/ to make them
     available to other tools/ dwelling codebases (Namhyung Kim)
   - Move logic to warn about kptr_restrict'ed kernels to separate
     function in 'report' (Arnaldo Carvalho de Melo)
   - Move hist browser selection code to separate function (Arnaldo
     Carvalho de Melo)
   - Move histogram entries collapsing to separate function (Arnaldo
     Carvalho de Melo)
   - Introduce evlist__for_each() & friends (Arnaldo Carvalho de Melo)
   - Automate setup of FEATURE_CHECK_(C|LD)FLAGS-all variables (Jiri
     Olsa)
   - Move arch setup into seprate Makefile (Jiri Olsa)
   - Make libtraceevent install target quieter (Jiri Olsa)
   - Make tests/make output more compact (Jiri Olsa)
   - Ignore generated files in feature-checks (Chunwei Chen)
   - Introduce pevent_filter_strerror() in libtraceevent, similar in
     purpose to libc's strerror() function (Namhyung Kim)
   - Use perf_data_file methods to write output file in 'record' and
     'inject' (Jiri Olsa)
   - Use pr_*() functions where applicable in 'report' (Namhyumg Kim)
   - Add 'machine' 'addr_location' struct to have full picture (machine,
     thread, map, symbol, addr) for a (partially) resolved address,
     reducing function signatures (Arnaldo Carvalho de Melo)
   - Reduce code duplication in the histogram entry creation/insertion
     (Arnaldo Carvalho de Melo)
   - Auto allocate annotation histogram data structures (Arnaldo
     Carvalho de Melo)
   - No need to test against NULL before calling free, also set freed
     memory in struct pointers to NULL, to help fixing use after free
     bugs (Arnaldo Carvalho de Melo)
   - Rename some struct DSO binary_type related members and methods, to
     clarify its purpose and need for differentiation (symtab_type, ie
     one is about the files .text, CFI, etc, i.e.  its binary contents,
     and the other is about where the symbol table came from (Arnaldo
     Carvalho de Melo)
   - Convert to new topic libraries, starting with an API one (sysfs,
     debugfs, etc), renaming liblk in the process (Borislav Petkov)
   - Get rid of some more panic() like error handling in libtraceevent.
     (Namhyung Kim)
   - Get rid of panic() like calls in libtraceevent (Namyung Kim)
   - Start carving out symbol parsing routines (perf, just moving
     routines to topic files in tools/lib/symbol/, tools that want to
     use it need to integrate it directly, ie no
     tools/lib/symbol/Makefile is provided (Arnaldo Carvalho de Melo)
   - Assorted refactoring patches, moving code around and adding utility
     evlist methods that will be used in the IPT patchset (Adrian
     Hunter)
   - Assorted mmap_pages handling fixes (Adrian Hunter)
   - Several man pages typo fixes (Dongsheng Yang)
   - Get rid of several die() calls in libtraceevent (Namhyung Kim)
   - Use basename() in a more robust way, to avoid problems related to
     different system library implementations for that function
     (Stephane Eranian)
   - Remove open coded management of short_name_allocated member (Adrian
     Hunter)
   - Several cleanups in the "dso" methods, constifying some parameters
     and renaming some fields to clarify its purpose (Arnaldo Carvalho
     de Melo)
   - Add per-feature check flags, fixing libunwind related build
     problems on some architectures (Jean Pihet)
   - Do not disable source line lookup just because of one failure.
     (Adrian Hunter)
   - Several 'perf kvm' man page corrections (Dongsheng Yang)
   - Correct the message in feature-libnuma checking, swowing the right
     devel package names for various distros (Dongsheng Yang)
   - Polish 'readn()' function and introduce its counterpart,
     'writen()' (Jiri Olsa)
   - Start moving timechart state from global variables to a 'perf_tool'
     derived 'timechart' struct (Arnaldo Carvalho de Melo)

  ... and lots of fixes and improvements I forgot to list"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (282 commits)
  perf tools: Remove unnecessary callchain cursor state restore on unmatch
  perf callchain: Spare double comparison of callchain first entry
  perf tools: Do proper comm override error handling
  perf symbols: Export elf_section_by_name and reuse
  perf probe: Release all dynamically allocated parameters
  perf probe: Release allocated probe_trace_event if failed
  perf tools: Add 'build-test' make target
  tools lib traceevent: Unregister handler when xen plugin is unloaded
  tools lib traceevent: Unregister handler when scsi plugin is unloaded
  tools lib traceevent: Unregister handler when jbd2 plugin is is unloaded
  tools lib traceevent: Unregister handler when cfg80211 plugin is unloaded
  tools lib traceevent: Unregister handler when mac80211 plugin is unloaded
  tools lib traceevent: Unregister handler when sched_switch plugin is unloaded
  tools lib traceevent: Unregister handler when kvm plugin is unloaded
  tools lib traceevent: Unregister handler when kmem plugin is unloaded
  tools lib traceevent: Unregister handler when hrtimer plugin is unloaded
  tools lib traceevent: Unregister handler when function plugin is unloaded
  tools lib traceevent: Add pevent_unregister_print_function()
  tools lib traceevent: Add pevent_unregister_event_handler()
  tools lib traceevent: fix pointer-integer size mismatch
  ...
2014-01-20 10:28:30 -08:00
Linus Torvalds
2cc3f16cad Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull IRQ changes from Ingo Molnar:
 "The only change in this cycle is a CPU hotplug related spurious
  warning fix"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/irq: Fix kbuild warning in smp_irq_move_cleanup_interrupt()
  x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs
2014-01-20 10:27:52 -08:00
Linus Torvalds
ad3ab302fd Merge branch 'core-stackprotector-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull strong stackprotector support from Ingo Molnar:
 "This tree adds a CONFIG_CC_STACKPROTECTOR_STRONG=y, a new, stronger
  stack canary checking method supported by the newest GCC versions (4.9
  and later).

  Here's the 'intensity comparison' between the various protection
  modes:

      - defconfig
        11430641 kernel text size
        36110 function bodies

      - defconfig + CONFIG_CC_STACKPROTECTOR_REGULAR
        11468490 kernel text size (+0.33%)
        1015 of 36110 functions are stack-protected (2.81%)

      - defconfig + CONFIG_CC_STACKPROTECTOR_STRONG via this patch
        11692790 kernel text size (+2.24%)
        7401 of 36110 functions are stack-protected (20.5%)

  the strong model comes with non-trivial costs, which is why we
  preserved the 'regular' and 'none' models as well"

* 'core-stackprotector-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  stackprotector: Introduce CONFIG_CC_STACKPROTECTOR_STRONG
  stackprotector: Unify the HAVE_CC_STACKPROTECTOR logic between architectures
2014-01-20 10:26:31 -08:00
Linus Torvalds
6ffbe7d1fa Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking changes from Ingo Molnar:
 - futex performance increases: larger hashes, smarter wakeups
 - mutex debugging improvements
 - lots of SMP ordering documentation updates
 - introduce the smp_load_acquire(), smp_store_release() primitives.
   (There are WIP patches that make use of them - not yet merged)
 - lockdep micro-optimizations
 - lockdep improvement: better cover IRQ contexts
 - liblockdep at last. We'll continue to monitor how useful this is

* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  futexes: Fix futex_hashsize initialization
  arch: Re-sort some Kbuild files to hopefully help avoid some conflicts
  futexes: Avoid taking the hb->lock if there's nothing to wake up
  futexes: Document multiprocessor ordering guarantees
  futexes: Increase hash table size for better performance
  futexes: Clean up various details
  arch: Introduce smp_load_acquire(), smp_store_release()
  arch: Clean up asm/barrier.h implementations using asm-generic/barrier.h
  arch: Move smp_mb__{before,after}_atomic_{inc,dec}.h into asm/atomic.h
  locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE
  mutexes: Give more informative mutex warning in the !lock->owner case
  powerpc: Full barrier for smp_mb__after_unlock_lock()
  rcu: Apply smp_mb__after_unlock_lock() to preserve grace periods
  Documentation/memory-barriers.txt: Downgrade UNLOCK+BLOCK
  locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier
  Documentation/memory-barriers.txt: Document ACCESS_ONCE()
  Documentation/memory-barriers.txt: Prohibit speculative writes
  Documentation/memory-barriers.txt: Add long atomic examples to memory-barriers.txt
  Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt
  Revert "smp/cpumask: Make CONFIG_CPUMASK_OFFSTACK=y usable without debug dependency"
  ...
2014-01-20 10:23:08 -08:00
Linus Torvalds
16ec54ad15 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:

 - an s2ram related fix on AMD systems

 - a perf fault handling bug that is relatively old but which has become
   much easier to trigger in v3.13 after commit e00b12e64b ("perf/x86:
   Further optimize copy_from_user_nmi()")

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h
  x86, mm, perf: Allow recursive faults from interrupts
2014-01-19 13:06:51 -08:00
Linus Torvalds
7d0d46da75 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) The value choosen for the new SO_MAX_PACING_RATE socket option on
    parisc was very poorly choosen, let's fix it while we still can.
    From Eric Dumazet.

 2) Our generic reciprocal divide was found to handle some edge cases
    incorrectly, part of this is encoded into the BPF as deep as the JIT
    engines themselves.  Just use a real divide throughout for now.
    From Eric Dumazet.

 3) Because the initial lookup is lockless, the TCP metrics engine can
    end up creating two entries for the same lookup key.  Fix this by
    doing a second lookup under the lock before we actually create the
    new entry.  From Christoph Paasch.

 4) Fix scatter-gather list init in usbnet driver, from Bjørn Mork.

 5) Fix unintended 32-bit truncation in cxgb4 driver's bit shifting.
    From Dan Carpenter.

 6) Netlink socket dumping uses the wrong socket state for timewait
    sockets.  Fix from Neal Cardwell.

 7) Fix netlink memory leak in ieee802154_add_iface(), from Christian
    Engelmayer.

 8) Multicast forwarding in ipv4 can overflow the per-rule reference
    counts, causing all multicast traffic to cease.  Fix from Hannes
    Frederic Sowa.

 9) via-rhine needs to stop all TX queues when it resets the device,
    from Richard Weinberger.

10) Fix RDS per-cpu accesses broken by the this_cpu_* conversions.  From
    Gerald Schaefer.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  s390/bpf,jit: fix 32 bit divisions, use unsigned divide instructions
  parisc: fix SO_MAX_PACING_RATE typo
  ipv6: simplify detection of first operational link-local address on interface
  tcp: metrics: Avoid duplicate entries with the same destination-IP
  net: rds: fix per-cpu helper usage
  e1000e: Fix compilation warning when !CONFIG_PM_SLEEP
  bpf: do not use reciprocal divide
  be2net: add dma_mapping_error() check for dma_map_page()
  bnx2x: Don't release PCI bars on shutdown
  net,via-rhine: Fix tx_timeout handling
  batman-adv: fix batman-adv header overhead calculation
  qlge: Fix vlan netdev features.
  net: avoid reference counter overflows on fib_rules in multicast forwarding
  dm9601: add USB IDs for new dm96xx variants
  MAINTAINERS: add virtio-dev ML for virtio
  ieee802154: Fix memory leak in ieee802154_add_iface()
  net: usbnet: fix SG initialisation
  inet_diag: fix inet_diag_dump_icsk() to use correct state for timewait sockets
  cxgb4: silence shift wrapping static checker warning
2014-01-17 22:19:28 -08:00
Linus Torvalds
8f211b6ccc Fix for a brown paper bag bug. Thanks to Drew Jones for noticing.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJS174dAAoJEBvWZb6bTYbywN0P/jiaK+ySw4YTxGggkRhi1WHy
 KnELWnPTpA1duucioBS8KSZDXN0SSf4KLio/ArOJee/uU9loY4HWMwpKjBTZkEU4
 lWI0WIagllxqBnxfFgdAVS6B/EUqie3qulZ8Hi72OoAEzUnLDsGAIIBzR5fdb1jg
 xkmdtPB2Lg+c+Lsuhg0MJegoymoLhEF82k/rBUEAzpjsL5N9DUt4n2rVLbECkfzm
 kLC31epho+PaSAmepAk5dbfh1H7ea4OV5HZeyyuIZqGVUgc95OU7VV3eIK+pI7Vu
 C6PvHaOnF6VHvnMN4tFSDRvADsb9TnxQN9zYiC10opjzdTwUiA+Y+jbIjyRkAtNY
 JfDRSIh3+w93K0sv5eICBp3cXcy584I+N5VC3EY0imv4DZNvlQfEWu1t9qSBhaAQ
 DQnIRhY1BpfK4Uu9MVBu8lTeo0VuefupFZ3vqlEQJ8ht2jgnYdQQMOaPNuzOP7NS
 WbOoDWGVfpfWN09mCV9OJ9WKjxO0nUeS4/xBqWIhrQKQY3oSvdjxIbqw7E3lj4Sp
 0uwzGqbwcJanYR0IYNmBfuyRtwk8SqgvIGGLtG4cnlAAXuwaG/v/+FpVCTCstB9n
 Ljxp9Y8dqAyTjrdw4Irp2rsIhbJpW6Ob4zcVeAfcNWjut+DBhbWl6VSe0k48A06h
 gO+vkx+qAg7FiF/vnRVN
 =ifJc
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fix from Paolo Bonzini:
 "Fix for a brown paper bag bug.  Thanks to Drew Jones for noticing"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: x86: fix apic_base enable check
2014-01-17 16:40:27 -08:00
Fengguang Wu
ee87c751d8 x86, intel_mid: Replace memcpy with struct assignment
This is a cleanup proposed by coccinelle. It replaces memcpy with struct
assignment on intel-mid's sfi layer.

Generated by: coccinelle/misc/memcpy-assign.cocci

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Link: http://lkml.kernel.org/r/1389917588-9785-1-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-01-16 16:11:10 -08:00
David Cohen
28c6a39b33 x86, intel-mid: Return proper error code from get_gpio_by_name()
This patch cleans up get_gpio_by_name() to return an error code
instead of hardcoded -1.

Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1389913624-9149-4-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-01-16 15:07:36 -08:00
David Cohen
a957a14bb4 x86, intel-mid: Check get_gpio_by_name() error code on platform code
This patch does cleanup on all intel mid platform code that uses
gpio_get_by_name() function. From now on they should check for any error
code instead of only hardcoded -1.

There are no functional changes from this change.

Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1389913624-9149-3-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-01-16 15:06:58 -08:00
David Cohen
acb20d7395 x86, intel-mid: sfi_handle_*_dev() should check for pdata error code
When Intel MID finds a match between SFI table from FW and registered
SFI devices, it will always register a device regardless the platform
code was successful or not.

This patch adds an extra option for platform code to return error code
and abort device registration on SFI table parsing.

This patch does not contain any functional changes for current intel
mid platform code.

Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1389913624-9149-2-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-01-16 15:06:29 -08:00
Ingo Molnar
860fc2f264 Merge branch 'perf/urgent' into perf/core
Pick up the latest fixes, refresh the development tree.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-16 09:33:30 +01:00
Robert Richter
bee09ed91c perf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h
On AMD family 10h we see following error messages while waking up from
S3 for all non-boot CPUs leading to a failed IBS initialization:

 Enabling non-boot CPUs ...
 smpboot: Booting Node 0 Processor 1 APIC 0x1
 [Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
 perf: IBS APIC setup failed on cpu #1
 process: Switch to broadcast mode on CPU1
 CPU1 is up
 ...
 ACPI: Waking up from system sleep state S3

Reason for this is that during suspend the LVT offset for the IBS
vector gets lost and needs to be reinialized while resuming.

The offset is read from the IBSCTL msr. On family 10h the offset needs
to be 1 as offset 0 is used for the MCE threshold interrupt, but
firmware assings it for IBS to 0 too. The kernel needs to reprogram
the vector. The msr is a readonly node msr, but a new value can be
written via pci config space access. The reinitialization is
implemented for family 10h in setup_ibs_ctl() which is forced during
IBS setup.

This patch fixes IBS setup after waking up from S3 by adding
resume/supend hooks for the boot cpu which does the offset
reinitialization.

Marking it as stable to let distros pick up this fix.

Signed-off-by: Robert Richter <rric@kernel.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org> v3.2..
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1389797849-5565-1-git-send-email-rric.net@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-16 09:19:50 +01:00
Peter Zijlstra
c026b3591e x86, mm, perf: Allow recursive faults from interrupts
Waiman managed to trigger a PMI while in a emulate_vsyscall() fault,
the PMI in turn managed to trigger a fault while obtaining a stack
trace. This triggered the sig_on_uaccess_error recursive fault logic
and killed the process dead.

Fix this by explicitly excluding interrupts from the recursive fault
logic.

Reported-and-Tested-by: Waiman Long <waiman.long@hp.com>
Fixes: e00b12e64b ("perf/x86: Further optimize copy_from_user_nmi()")
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20140110200603.GJ7572@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-16 09:19:48 +01:00
Linus Torvalds
9b6c4ea95f Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
 "Two fixes from lockdep coverage of seqlocks, which fix deadlocks on
  lockdep-enabled ARM systems"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched_clock: Disable seqlock lockdep usage in sched_clock()
  seqlock: Use raw_ prefix instead of _no_lockdep
2014-01-16 08:31:55 +07:00
Eric Dumazet
aee636c480 bpf: do not use reciprocal divide
At first Jakub Zawadzki noticed that some divisions by reciprocal_divide
were not correct. (off by one in some cases)
http://www.wireshark.org/~darkjames/reciprocal-buggy.c

He could also show this with BPF:
http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c

The reciprocal divide in linux kernel is not generic enough,
lets remove its use in BPF, as it is not worth the pain with
current cpus.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
Cc: Mircea Gherzan <mgherzan@gmail.com>
Cc: Daniel Borkmann <dxchgb@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Matt Evans <matt@ozlabs.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-15 17:02:08 -08:00
David Cohen
4cb9b00f42 x86, intel-mid: Remove deprecated X86_MDFLD and X86_WANT_INTEL_MID configs
We want to support all Intel MID (Mobile Internet Device) platforms
with a single config selection. This patch removes deprecated
CONFIG_X86_MDFLD and X86_WANT_INTEL_MID options in favor of having
CONFIG_X86_INTEL_MID only.

Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1387244246-20714-1-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-01-15 14:38:58 -08:00
David Cohen
bc20aa48bb x86, intel-mid: Add Merrifield platform support
This code was partially based on Mark Brown's previous work.

Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1387224459-25746-4-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: Fei Yang <fei.yang@intel.com>
Cc: Mark F. Brown <mark.f.brown@intel.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-01-15 14:38:58 -08:00
Kuppuswamy Sathyanarayanan
85611e3feb x86, intel-mid: Add Clovertrail platform support
This patch adds Clovertrail support on intel-mid and makes it more
flexible to support other SoCs.

Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1387224459-25746-3-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-01-15 14:38:58 -08:00
David Cohen
ecd6910db9 x86, intel-mid: Move Medfield code out of intel-mid.c core file
In order make the driver more portable and support other Intel MID
(Mobile Internet Device) platforms we need to move Medfield code from
intel-mid.c core to its own mfld.c file.

This patch contains no functional changes.

Signed-off-by: David Cohen <david.a.cohen@linux.intel.com>
Link: http://lkml.kernel.org/r/1387224459-25746-2-git-send-email-david.a.cohen@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-01-15 14:38:58 -08:00
H. Peter Anvin
5b4d1dbc24 x86, apic: Make disabled_cpu_apicid static read_mostly, fix typos
Make disabled_cpu_apicid static and read_mostly, and fix a couple of
typos.

Reported-by: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/20140115182511.GA22737@gmail.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
2014-01-15 13:02:08 -08:00
HATAYAMA Daisuke
151e0c7de6 x86, apic, kexec: Add disable_cpu_apicid kernel parameter
Add disable_cpu_apicid kernel parameter. To use this kernel parameter,
specify an initial APIC ID of the corresponding CPU you want to
disable.

This is mostly used for the kdump 2nd kernel to disable BSP to wake up
multiple CPUs without causing system reset or hang due to sending INIT
from AP to BSP.

Kdump users first figure out initial APIC ID of the BSP, CPU0 in the
1st kernel, for example from /proc/cpuinfo and then set up this kernel
parameter for the 2nd kernel using the obtained APIC ID.

However, doing this procedure at each boot time manually is awkward,
which should be automatically done by user-land service scripts, for
example, kexec-tools on fedora/RHEL distributions.

This design is more flexible than disabling BSP in kernel boot time
automatically in that in kernel boot time we have no choice but
referring to ACPI/MP table to obtain initial APIC ID for BSP, meaning
that the method is not applicable to the systems without such BIOS
tables.

One assumption behind this design is that users get initial APIC ID of
the BSP in still healthy state and so BSP is uniquely kept in
CPU0. Thus, through the kernel parameter, only one initial APIC ID can
be specified.

In a comparison with disabled_cpu_apicid, we use read_apic_id(), not
boot_cpu_physical_apicid, because on some platforms, the variable is
modified to the apicid reported as BSP through MP table and this
function is executed with the temporarily modified
boot_cpu_physical_apicid. As a result, disabled_cpu_apicid kernel
parameter doesn't work well for apicids of APs.

Fixing the wrong handling of boot_cpu_physical_apicid requires some
reviews and tests beyond some platforms and it could take some
time. The fix here is a kind of workaround to focus on the main topic
of this patch.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Link: http://lkml.kernel.org/r/20140115064458.1545.38775.stgit@localhost6.localdomain6
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-01-15 09:19:20 -08:00
Andrew Jones
0dce7cd67f kvm: x86: fix apic_base enable check
Commit e66d2ae7c6 moved the assignment
vcpu->arch.apic_base = value above a condition with
(vcpu->arch.apic_base ^ value), causing that check
to always fail. Use old_value, vcpu->arch.apic_base's
old value, in the condition instead.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-01-15 13:42:14 +01:00
Borislav Petkov
d139336700 x86, cpu, amd: Fix a shadowed variable situation
Having u32 and struct cpuinfo_x86 * by the same name is not very smart,
although it was ok in this case due to the limited scope of u32 c and it
being used only once in there.

Fix this.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1389786735-16751-1-git-send-email-bp@alien8.de
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-01-15 04:21:45 -08:00
Richard Weinberger
60283df7ac x86/apic: Read Error Status Register correctly
Currently we do a read, a dummy write and a final read to fetch
the error code. The value from the final read is taken.
This is not the recommended way and leads to corrupted/lost ESR
values.

Intel(c) 64 and IA-32 Architectures Software Developer's Manual,
Combined Volumes 1, 2ABC, 3ABC, Section 10.5.3 states:

  Before attempt to read from the ESR, software should first
  write to it. (The value written does not affect the values read
  subsequently; only zero may be written in x2APIC mode.) This
  write clears any previously logged errors and updates the ESR
  with any errors detected since the last write to the ESR.
  This write also rearms the APIC error interrupt triggering
  mechanism.

This patch removes the first read such that we are conform with
the manual.

On my (very old) Pentium MMX SMP system this patch fixes the
issue that APIC errors:

  a) are not always reported and
  b) are reported with false error numbers.

Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: seiji.aguchi@hds.com
Cc: rientjes@google.com
Cc: konrad.wilk@oracle.com
Cc: bp@alien8.de
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1389685487-20872-1-git-send-email-richard@nod.at
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-14 14:05:36 +01:00
Borislav Petkov
bad5fa631f x86, microcode: Move to a proper location
We've grown a bunch of microcode loader files all prefixed with
"microcode_". They should be under cpu/ because this is strictly
CPU-related functionality so do that and drop the prefix since they're
in their own directory now which gives that prefix. :)

While at it, drop MICROCODE_INTEL_LIB config item and stash the
functionality under CONFIG_MICROCODE_INTEL as it was its only user.

Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
2014-01-13 20:00:12 +01:00
Borislav Petkov
5335ba5cf4 x86, microcode, AMD: Fix early ucode loading
The original idea to use the microcode cache for the APs doesn't pan out
because we do memory allocation there very early and with IRQs disabled
and we don't want to involve GFP_ATOMIC allocations. Not if it can be
helped.

Thus, extend the caching of the BSP patch approach to the APs and
iterate over the ucode in the initrd instead of using the cache. We
still save the relevant patches to it but later, right before we
jettison the initrd.

While at it, fix early ucode loading on 32-bit too.

Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
2014-01-13 19:59:38 +01:00
Borislav Petkov
e1b43e3f13 x86, microcode: Share native MSR accessing variants
We want to use those in AMD's early loading path too. Also, add a
native_wrmsrl variant.

Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
2014-01-13 19:57:27 +01:00
Borislav Petkov
5aa3d718f2 x86, ramdisk: Export relocated ramdisk VA
The ramdisk can possibly get relocated if the whole image is not mapped.
And since we're going over it in the microcode loader and fishing out
the relevant microcode patches, we want access it at its new location.
Thus, export it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
2014-01-13 19:56:25 +01:00
Peter Zijlstra
8cb75e0c4e sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding
With various drivers wanting to inject idle time; we get people
calling idle routines outside of the idle loop proper.

Therefore we need to be extra careful about not missing
TIF_NEED_RESCHED -> PREEMPT_NEED_RESCHED propagations.

While looking at this, I also realized there's a small window in the
existing idle loop where we can miss TIF_NEED_RESCHED; when it hits
right after the tif_need_resched() test at the end of the loop but
right before the need_resched() test at the start of the loop.

So move preempt_fold_need_resched() out of the loop where we're
guaranteed to have TIF_NEED_RESCHED set.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-x9jgh45oeayzajz2mjt0y7d6@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13 17:38:55 +01:00
Ingo Molnar
c9c8986847 Merge branch 'x86/idle' into sched/core
Merge these x86 specific bits - we are going to add generic bits as well.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13 17:37:05 +01:00
Peter Zijlstra
10b033d434 sched/clock, x86: Avoid a runtime condition in native_sched_clock()
Use a static_key to avoid touching tsc_disabled and a runtime
condition in native_sched_clock() -- less cachelines touched is always
better.

                        MAINLINE   PRE       POST

    sched_clock_stable: 1          1         1
    (cold) sched_clock: 329841     215295    213039
    (cold) local_clock: 301773     220773    216084
    (warm) sched_clock: 38375      25659     25231
    (warm) local_clock: 100371     27242     27601
    (warm) rdtsc:       27340      24208     24203
    sched_clock_stable: 0          0         0
    (cold) sched_clock: 382634     237019    240055
    (cold) local_clock: 396890     294819    299942
    (warm) sched_clock: 38194      25609     25276
    (warm) local_clock: 143452     71232     73232
    (warm) rdtsc:       27345      24243     24244

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-hrz87bo37qke25bty6pnfy4b@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13 15:13:17 +01:00
Peter Zijlstra
35af99e646 sched/clock, x86: Use a static_key for sched_clock_stable
In order to avoid the runtime condition and variable load turn
sched_clock_stable into a static_key.

Also provide a shorter implementation of local_clock() and
cpu_clock(int) when sched_clock_stable==1.

                        MAINLINE   PRE       POST

    sched_clock_stable: 1          1         1
    (cold) sched_clock: 329841     221876    215295
    (cold) local_clock: 301773     234692    220773
    (warm) sched_clock: 38375      25602     25659
    (warm) local_clock: 100371     33265     27242
    (warm) rdtsc:       27340      24214     24208
    sched_clock_stable: 0          0         0
    (cold) sched_clock: 382634     235941    237019
    (cold) local_clock: 396890     297017    294819
    (warm) sched_clock: 38194      25233     25609
    (warm) local_clock: 143452     71234     71232
    (warm) rdtsc:       27345      24245     24243

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-eummbdechzz37mwmpags1gjr@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13 15:13:13 +01:00
Peter Zijlstra
20d1c86a57 sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs
Use a ring-buffer like multi-version object structure which allows
always having a coherent object; we use this to avoid having to
disable IRQs while reading sched_clock() and avoids a problem when
getting an NMI while changing the cyc2ns data.

                        MAINLINE   PRE        POST

    sched_clock_stable: 1          1          1
    (cold) sched_clock: 329841     331312     257223
    (cold) local_clock: 301773     310296     309889
    (warm) sched_clock: 38375      38247      25280
    (warm) local_clock: 100371     102713     85268
    (warm) rdtsc:       27340      27289      24247
    sched_clock_stable: 0          0          0
    (cold) sched_clock: 382634     372706     301224
    (cold) local_clock: 396890     399275     399870
    (warm) sched_clock: 38194      38124      25630
    (warm) local_clock: 143452     148698     129629
    (warm) rdtsc:       27345      27365      24307

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-s567in1e5ekq2nlyhn8f987r@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13 15:13:06 +01:00
Prarit Bhargava
c7a730fa46 x86/irq: Fix kbuild warning in smp_irq_move_cleanup_interrupt()
Fengguang Wu's 0day kernel build service reported the following build warning:

  arch/x86/kernel/apic/io_apic.c:2211
  smp_irq_move_cleanup_interrupt() warn: always true condition '(irq <= -1) => (0-u32max <= (-1))'

because irq is defined as an unsigned int instead of an int.
Fix this trivial error by redefining irq as a signed int.  The
remaining consumers of the int are okay.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Link: http://lkml.kernel.org/r/1389620420-7110-1-git-send-email-prarit@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13 15:08:37 +01:00
Peter Zijlstra
57c67da274 sched/clock, x86: Move some cyc2ns() code around
There are no __cycles_2_ns() users outside of arch/x86/kernel/tsc.c,
so move it there.

There are no cycles_2_ns() users.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-01lslnavfgo3kmbo4532zlcj@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13 13:47:39 +01:00
Peter Zijlstra
5dd12c2152 sched/clock, x86: Use mul_u64_u32_shr() for native_sched_clock()
Use mul_u64_u32_shr() so that x86_64 can use a single 64x64->128 mul.

Before:

0000000000000560 <native_sched_clock>:
 560:   44 8b 1d 00 00 00 00    mov    0x0(%rip),%r11d        # 567 <native_sched_clock+0x7>
 567:   55                      push   %rbp
 568:   48 89 e5                mov    %rsp,%rbp
 56b:   45 85 db                test   %r11d,%r11d
 56e:   75 4f                   jne    5bf <native_sched_clock+0x5f>
 570:   0f 31                   rdtsc
 572:   89 c0                   mov    %eax,%eax
 574:   48 c1 e2 20             shl    $0x20,%rdx
 578:   48 c7 c1 00 00 00 00    mov    $0x0,%rcx
 57f:   48 09 c2                or     %rax,%rdx
 582:   48 c7 c7 00 00 00 00    mov    $0x0,%rdi
 589:   65 8b 04 25 00 00 00    mov    %gs:0x0,%eax
 590:   00
 591:   48 98                   cltq
 593:   48 8b 34 c5 00 00 00    mov    0x0(,%rax,8),%rsi
 59a:   00
 59b:   48 89 d0                mov    %rdx,%rax
 59e:   81 e2 ff 03 00 00       and    $0x3ff,%edx
 5a4:   48 c1 e8 0a             shr    $0xa,%rax
 5a8:   48 0f af 14 0e          imul   (%rsi,%rcx,1),%rdx
 5ad:   48 0f af 04 0e          imul   (%rsi,%rcx,1),%rax
 5b2:   5d                      pop    %rbp
 5b3:   48 03 04 3e             add    (%rsi,%rdi,1),%rax
 5b7:   48 c1 ea 0a             shr    $0xa,%rdx
 5bb:   48 01 d0                add    %rdx,%rax
 5be:   c3                      retq

After:

0000000000000550 <native_sched_clock>:
 550:   8b 3d 00 00 00 00       mov    0x0(%rip),%edi        # 556 <native_sched_clock+0x6>
 556:   55                      push   %rbp
 557:   48 89 e5                mov    %rsp,%rbp
 55a:   48 83 e4 f0             and    $0xfffffffffffffff0,%rsp
 55e:   85 ff                   test   %edi,%edi
 560:   75 2c                   jne    58e <native_sched_clock+0x3e>
 562:   0f 31                   rdtsc
 564:   89 c0                   mov    %eax,%eax
 566:   48 c1 e2 20             shl    $0x20,%rdx
 56a:   48 09 c2                or     %rax,%rdx
 56d:   65 48 8b 04 25 00 00    mov    %gs:0x0,%rax
 574:   00 00
 576:   89 c0                   mov    %eax,%eax
 578:   48 f7 e2                mul    %rdx
 57b:   65 48 8b 0c 25 00 00    mov    %gs:0x0,%rcx
 582:   00 00
 584:   c9                      leaveq
 585:   48 0f ac d0 0a          shrd   $0xa,%rdx,%rax
 58a:   48 01 c8                add    %rcx,%rax
 58d:   c3                      retq

                        MAINLINE   POST

    sched_clock_stable: 1	   1
    (cold) sched_clock: 329841     331312
    (cold) local_clock: 301773     310296
    (warm) sched_clock: 38375      38247
    (warm) local_clock: 100371     102713
    (warm) rdtsc:       27340      27289
    sched_clock_stable: 0          0
    (cold) sched_clock: 382634     372706
    (cold) local_clock: 396890     399275
    (warm) sched_clock: 38194      38124
    (warm) local_clock: 143452     148698
    (warm) rdtsc:       27345      27365

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-piu203ses5y1g36bnyw2n16x@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13 13:47:38 +01:00
Dario Faggioli
d50dde5a10 sched: Add new scheduler syscalls to support an extended scheduling parameters ABI
Add the syscalls needed for supporting scheduling algorithms
with extended scheduling parameters (e.g., SCHED_DEADLINE).

In general, it makes possible to specify a periodic/sporadic task,
that executes for a given amount of runtime at each instance, and is
scheduled according to the urgency of their own timing constraints,
i.e.:

 - a (maximum/typical) instance execution time,
 - a minimum interval between consecutive instances,
 - a time constraint by which each instance must be completed.

Thus, both the data structure that holds the scheduling parameters of
the tasks and the system calls dealing with it must be extended.
Unfortunately, modifying the existing struct sched_param would break
the ABI and result in potentially serious compatibility issues with
legacy binaries.

For these reasons, this patch:

 - defines the new struct sched_attr, containing all the fields
   that are necessary for specifying a task in the computational
   model described above;

 - defines and implements the new scheduling related syscalls that
   manipulate it, i.e., sched_setattr() and sched_getattr().

Syscalls are introduced for x86 (32 and 64 bits) and ARM only, as a
proof of concept and for developing and testing purposes. Making them
available on other architectures is straightforward.

Since no "user" for these new parameters is introduced in this patch,
the implementation of the new system calls is just identical to their
already existing counterpart. Future patches that implement scheduling
policies able to exploit the new data structure must also take care of
modifying the sched_*attr() calls accordingly with their own purposes.

Signed-off-by: Dario Faggioli <raistlin@linux.it>
[ Rewrote to use sched_attr. ]
Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
[ Removed sched_setscheduler2() for now. ]
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1383831828-15501-3-git-send-email-juri.lelli@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13 13:41:04 +01:00
Ingo Molnar
1c62448e39 Linux 3.13-rc8
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJS0miqAAoJEHm+PkMAQRiGbfgIAJSWEfo8ludknhPcHJabBtxu
 75SQAKJlL3sBVnxEc58Rtt8gsKYQIrm4IY5Slunklsn04RxuDUIQMgFoAYR5gQwz
 +Myqkw/HOqDe5VStGxtLYpWnfglxVwGDCd7ISfL9AOVy5adMWBxh4Tv+qqQc7aIZ
 eF7dy+DD+C6Q3Z5OoV8s0FZDxse29vOf17Nki7+7t8WMqyegYwjoOqNeqocGKsPi
 eHLrJgTl4T6jB4l9LKKC154DSKjKOTSwZMWgwK8mToyNLT/ufCiKgXloIjEvZZcY
 VVKUtncdHiTf+iqVojgpGBzOEeB5DM83iiapFeDiJg8C9yBzvT8lBtA9aPb5Wgw=
 =lEeV
 -----END PGP SIGNATURE-----

Merge tag 'v3.13-rc8' into core/locking

Refresh the tree with the latest fixes, before applying new changes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13 11:44:41 +01:00
Richard Weinberger
106553aa4e um, x86: Fix vDSO build
Commit 663b55b9b3 ("x86: Delete non-required instances of include <linux/init.h>")
broke the UML build.

  arch/x86/um/vdso/vdso.S: Assembler messages:
  arch/x86/um/vdso/vdso.S:2: Error: no such instruction: `__initdata'
  arch/x86/um/vdso/vdso.S:9: Error: no such instruction: `__finit'

UML's vDSO needs linux/init.h.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: user-mode-linux-devel@lists.sourceforge.net
Cc: paul.gortmaker@windriver.com
Link: http://lkml.kernel.org/r/1389538341-31383-1-git-send-email-richard@nod.at
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-12 16:47:31 +01:00
Prarit Bhargava
9345005f4e x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs
During heavy CPU-hotplug operations the following spurious kernel warnings
can trigger:

  do_IRQ: No ... irq handler for vector (irq -1)

  [ See: https://bugzilla.kernel.org/show_bug.cgi?id=64831 ]

When downing a cpu it is possible that there are unhandled irqs
left in the APIC IRR register.  The following code path shows
how the problem can occur:

 1. CPU 5 is to go down.

 2. cpu_disable() on CPU 5 executes with interrupt flag cleared
    by local_irq_save() via stop_machine().

 3. IRQ 12 asserts on CPU 5, setting IRR but not ISR because
    interrupt flag is cleared (CPU unabled to handle the irq)

 4. IRQs are migrated off of CPU 5, and the vectors' irqs are set
    to -1. 5. stop_machine() finishes cpu_disable()

 6. cpu_die() for CPU 5 executes in normal context.

 7. CPU 5 attempts to handle IRQ 12 because the IRR is set for
    IRQ 12.  The code attempts to find the vector's IRQ and cannot
    because it has been set to -1. 8. do_IRQ() warning displays
    warning about CPU 5 IRQ 12.

I added a debug printk to output which CPU & vector was
retriggered and discovered that that we are getting bogus
events.  I see a 100% correlation between this debug printk in
fixup_irqs() and the do_IRQ() warning.

This patchset resolves this by adding definitions for
VECTOR_UNDEFINED(-1) and VECTOR_RETRIGGERED(-2) and modifying
the code to use them.

Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=64831
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Rui Wang <rui.y.wang@intel.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Yang Zhang <yang.z.zhang@Intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: janet.morgan@Intel.com
Cc: tony.luck@Intel.com
Cc: ruiv.wang@gmail.com
Link: http://lkml.kernel.org/r/1388938252-16627-1-git-send-email-prarit@redhat.com
[ Cleaned up the code a bit. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-12 13:13:02 +01:00