Pull scheduler changes from Ingo Molnar:
- Add the initial implementation of SCHED_DEADLINE support: a real-time
scheduling policy where tasks that periodically execute their
instances in less than their runtime quota see real-time scheduling
and won't miss any of their deadlines. Tasks that go over their
quota get delayed. (Available to privileged users for now.)
- Clean up and fix preempt_enable_no_resched() abuse all around the
tree
- Do sched_clock() performance optimizations on x86 and elsewhere
- Fix and improve auto-NUMA balancing
- Fix and clean up the idle loop
- Apply various cleanups and fixes
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
sched: Fix __sched_setscheduler() nice test
sched: Move SCHED_RESET_ON_FORK into attr::sched_flags
sched: Fix up attr::sched_priority warning
sched: Fix up scheduler syscall LTP fails
sched: Preserve the nice level over sched_setscheduler() and sched_setparam() calls
sched/core: Fix htmldocs warnings
sched/deadline: No need to check p if dl_se is valid
sched/deadline: Remove unused variables
sched/deadline: Fix sparse static warnings
m68k: Fix build warning in mac_via.h
sched, thermal: Clean up preempt_enable_no_resched() abuse
sched, net: Fixup busy_loop_us_clock()
sched, net: Clean up preempt_enable_no_resched() abuse
sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding
sched/preempt, locking: Rework local_bh_{dis,en}able()
sched/clock, x86: Avoid a runtime condition in native_sched_clock()
sched/clock: Fix up clear_sched_clock_stable()
sched/clock, x86: Use a static_key for sched_clock_stable
sched/clock: Remove local_irq_disable() from the clocks
sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs
...
Pull core locking changes from Ingo Molnar:
- futex performance increases: larger hashes, smarter wakeups
- mutex debugging improvements
- lots of SMP ordering documentation updates
- introduce the smp_load_acquire(), smp_store_release() primitives.
(There are WIP patches that make use of them - not yet merged)
- lockdep micro-optimizations
- lockdep improvement: better cover IRQ contexts
- liblockdep at last. We'll continue to monitor how useful this is
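As a rough userspace analogue of the new smp_load_acquire() and smp_store_release() primitives (not the kernel implementation), C11 atomics express the same pairing: a release store publishes earlier writes, and an acquire load that observes it is guaranteed to see them. The names publish(), try_consume(), payload and ready below are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state: a payload plus a flag that publishes it. */
int payload;
atomic_bool ready;

/* Writer side: fill in the payload, then publish it with release
 * semantics (the role smp_store_release() plays in the kernel). */
void publish(int value)
{
	payload = value;
	atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader side: an acquire load (the role of smp_load_acquire()). If it
 * observes the flag, the earlier payload write is visible as well. */
bool try_consume(int *out)
{
	assert(out != NULL);
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return false;
	*out = payload;
	return true;
}
```

In a real program publish() and try_consume() would run on different threads; the ordering guarantee is what makes the plain read of payload safe once the flag has been seen.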
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
futexes: Fix futex_hashsize initialization
arch: Re-sort some Kbuild files to hopefully help avoid some conflicts
futexes: Avoid taking the hb->lock if there's nothing to wake up
futexes: Document multiprocessor ordering guarantees
futexes: Increase hash table size for better performance
futexes: Clean up various details
arch: Introduce smp_load_acquire(), smp_store_release()
arch: Clean up asm/barrier.h implementations using asm-generic/barrier.h
arch: Move smp_mb__{before,after}_atomic_{inc,dec}.h into asm/atomic.h
locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE
mutexes: Give more informative mutex warning in the !lock->owner case
powerpc: Full barrier for smp_mb__after_unlock_lock()
rcu: Apply smp_mb__after_unlock_lock() to preserve grace periods
Documentation/memory-barriers.txt: Downgrade UNLOCK+BLOCK
locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier
Documentation/memory-barriers.txt: Document ACCESS_ONCE()
Documentation/memory-barriers.txt: Prohibit speculative writes
Documentation/memory-barriers.txt: Add long atomic examples to memory-barriers.txt
Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt
Revert "smp/cpumask: Make CONFIG_CPUMASK_OFFSTACK=y usable without debug dependency"
...
For SCC initialization we cannot assume that the control register is in
the correct state to accept a register pointer. So first read from the
control register in order to "sync" up.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fengguang Wu's kbuild test robot reported the following new m68k warnings:
In file included from drivers/nubus/nubus.c:22:0:
>> arch/m68k/include/asm/mac_via.h:262:47: warning: 'struct irq_desc' declared inside parameter list [enabled by default]
>> arch/m68k/include/asm/mac_via.h:262:47: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
Caused by the reworking of the generic local_bh_{dis,en}able() code.
To fix it, forward declare 'struct irq_desc'.
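The warning arises because a struct tag first seen inside a prototype's parameter list is scoped to that prototype alone. A minimal sketch of the fix pattern follows, using a hypothetical handler name and a made-up struct body rather than the actual mac_via.h contents:

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration: makes 'struct irq_desc' a file-scope incomplete
 * type, so the prototype below refers to the same type as later code. */
struct irq_desc;

/* Hypothetical prototype standing in for the one in the header. */
int demo_irq_handler(unsigned int irq, struct irq_desc *desc);

/* The full definition may appear later (here with a made-up member). */
struct irq_desc {
	unsigned int irq;
};

int demo_irq_handler(unsigned int irq, struct irq_desc *desc)
{
	assert(desc != NULL);
	return desc->irq == irq;
}
```

Without the forward declaration, the prototype would declare its own private 'struct irq_desc', which is what the compiler complains about.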
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Fixes: c795eb55e740 ("sched/preempt, locking: Rework local_bh_{dis,en}able()")
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: geert@linux-m68k.org
Link: http://lkml.kernel.org/r/20140112212456.GQ7572@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Some Atari hardware has no capacity to raise interrupts (e.g.
network or USB adapter hardware attached via ROM port). The driver
interrupt routine is called from a timer interrupt (timer D) in
these cases, using chained device specific pseudo interrupts
(IRQ_MFP_TIMER1 ff.)
These interrupts will, more often than not, return IRQ_NONE, as
there is not always work for the device handler when called.
Too many unhandled interrupts will result in the interrupt
being disabled by the stuck interrupt watchdog.
As the preferred way to flag interrupts as needing exclusion
from the watchdog mechanism, tglx added the IRQ_IS_POLLED flag
for use in such cases. Currently, two interrupts need to use
this flag. Add more users as needed.
Signed-off-by: Michael Schmitz <schmitz@debian.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
We're going to be adding a few new barrier primitives, and in order to
avoid endless duplication, make more aggressive use of
asm-generic/barrier.h.
Change asm-generic/barrier.h such that it allows partial barrier
definitions and fills out the rest with defaults.
There are a few architectures (m32r, m68k) that could probably
do away with their barrier.h file entirely, but the files are kept for
now due to their unconventional nop() implementation.
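The "partial definitions plus defaults" scheme can be pictured as a chain of #ifndef guards: an architecture defines only the barriers it needs to override, and the generic header supplies the rest. The sketch below is a userspace illustration with simplified definitions, not the contents of asm-generic/barrier.h; all demo_ names and the call counter are made up.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

int arch_mb_calls;	/* counts how often the "arch" barrier runs */

/* "Architecture" part (hypothetical): override only the full barrier. */
#define demo_mb() \
	do { arch_mb_calls++; atomic_thread_fence(memory_order_seq_cst); } while (0)

/* "Generic" part: supply defaults for whatever was left undefined. */
#ifndef demo_mb
#define demo_mb()	atomic_thread_fence(memory_order_seq_cst)
#endif
#ifndef demo_rmb
#define demo_rmb()	demo_mb()
#endif
#ifndef demo_wmb
#define demo_wmb()	demo_mb()
#endif

/* Store data, then a flag, with a write barrier in between. */
void demo_publish(int *data, int *flag, int value)
{
	assert(data != NULL && flag != NULL);
	*data = value;
	demo_wmb();	/* not overridden, so this expands to the arch demo_mb() */
	*flag = 1;
}
```

Because the architecture part never defines demo_wmb(), the generic default routes it to the overridden demo_mb(), which is the behaviour the counter makes visible.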
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20131213150640.846368594@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Also fix a few printf-style formats, to get rid of the following compiler
warnings when DEBUG is enabled:
arch/m68k/kernel/traps.c: In function ‘access_error060’:
arch/m68k/kernel/traps.c:166: warning: format ‘%d’ expects type ‘int’, but argument 2 has type ‘long unsigned int’
arch/m68k/kernel/traps.c: In function ‘bus_error030’:
arch/m68k/kernel/traps.c:568: warning: format ‘%#lx’ expects type ‘long unsigned int’, but argument 2 has type ‘void *’
arch/m68k/kernel/traps.c:682: warning: format ‘%#lx’ expects type ‘long unsigned int’, but argument 2 has type ‘void *’
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
When DEBUG is enabled, do_page_fault() may dereference a NULL pointer,
causing recursive bus errors.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Rename RTC_PORT() to ATARI_RTC_PORT(), as the rtc-cmos RTC driver uses the
presence of this macro to enable support for the second NVRAM bank, which
Atari doesn't have ("Unable to handle kernel access at virtual address
00ff8965").
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Since commit d6713b4091 ("m68k: early parameter support"), the user can
specify multiple debug consoles using the "debug=" kernel command line
parameter.
However, as there's only a single struct console object, which is reused,
it would actually register the same console object multiple times, causing
the following warning:
WARNING: CPU: 0 PID: 0 at kernel/printk/printk.c:2233 register_console+0x36/
console 'debug0' already registered
Make sure to register the console object only once, to avoid the warning.
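A minimal sketch of the register-once pattern, in plain C with hypothetical names (demo_console, demo_register(), demo_setup()); it mirrors the idea, not the m68k code. A later setup call still replaces the write method, while registration is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_console {
	void (*write)(const char *s);
	bool registered;
};

struct demo_console demo_con;	/* the single, reused console object */
int demo_register_calls;	/* counts real registrations */
char demo_last_writer;		/* records which writer ran last */

void demo_write_a(const char *s) { (void)s; demo_last_writer = 'a'; }
void demo_write_b(const char *s) { (void)s; demo_last_writer = 'b'; }

void demo_register(struct demo_console *con)
{
	con->registered = true;
	demo_register_calls++;
}

/* Called once per "debug=" parameter. */
void demo_setup(void (*write)(const char *s))
{
	assert(write != NULL);
	demo_con.write = write;		/* later parameters overwrite .write */
	if (!demo_con.registered)	/* but registration happens only once */
		demo_register(&demo_con);
}
```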
Note that still only one console (the one corresponding to the last
"debug=" parameter) will be active at the same time, as the .write() method
of the already registered console object is overwritten by a subsequent
"debug=" parameter.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Add optional support to export the bootinfo used to boot the kernel in a
"bootinfo" file in procfs. This is useful with kexec.
This is based on the similar feature for ATAGS on ARM.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
On m68k, get_cycles() (the default implementation for random_get_entropy())
always returns zero, providing no entropy for the random driver.
Add a hook where platforms can provide their own implementation, and wire
it up in the infrastructure provided by commit 61875f30da ("random: allow
architectures to optionally define random_get_entropy()").
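Such a hook is commonly built as a function pointer that the generic entry point consults, falling back to zero when no platform has installed one. The sketch below uses hypothetical demo_ names and a counter as a stand-in for a platform timer; it is an illustration of the pattern, not the code added by this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical platform hook; NULL means the platform provides nothing. */
unsigned long (*demo_mach_get_entropy)(void);

/* Generic entry point: defer to the platform hook when one is installed,
 * otherwise return zero (no entropy contribution). */
unsigned long demo_random_get_entropy(void)
{
	if (demo_mach_get_entropy)
		return demo_mach_get_entropy();
	return 0;
}

/* Platform setup code would install its implementation like this. */
void demo_install_hook(unsigned long (*fn)(void))
{
	assert(fn != NULL);
	demo_mach_get_entropy = fn;
}

/* Example platform implementation: a counter standing in for a timer. */
unsigned long demo_timer_read(void)
{
	static unsigned long ticks;
	return ++ticks;
}
```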
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
nf_init() uses virt_to_phys(), which depends on m68k_memoffset being set and
module_fixup() having been called, but this is only done in paging_init().
Hence call paging_init() before nf_init().
This went unnoticed, as virt_to_phys() is a no-op on Atari, unless you start
fiddling with the memory blocks in the bootinfo manually.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Move generic definitions used by bootstraps to uapi/asm/bootinfo.h:
- Machine types,
- CPU, FPU, and MMU types,
- struct mem_info.
Keep a copy of struct mem_info for in-kernel use, and rename it to struct
m68k_mem_info, as the exported one will be modified later.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Export the bootinfo definitions that are used by bootstrap loaders, and
split them up in generic and platform-specific parts.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
struct mac_booter_data is no longer part of the bootinfo API, hence move it
from <asm/bootinfo.h> to <asm/macintosh.h>, dropping all unused fields in
the process.
Also remove the no longer used mac_booter_data pointer from head.S.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Drop the remnants of, and the API for, backwards compatibility with
bootinfo interface version 1.0. This was used when booting a 2.1.x or
newer kernel on Amiga, Atari, or Mac using a bootstrap for kernel 2.0.x.
Everybody upgraded their bootstrap a long time ago, so this can go.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Since the introduction of init sections (which are located after BSS), the
bootinfo is no longer located right after the BSS, but after all kernel
sections.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fix member definitions for non-native userspace handling:
- All multi-byte values are big-endian, hence use __be*,
- All pointers are 32-bit pointers under AmigaOS, but unused (except for
cd_BoardAddr) under Linux, hence use __be32.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
ZTWO_VADDR() converts from physical to virtual I/O addresses, so it should
return "void __iomem *" instead of "unsigned long".
This allows dropping several casts, but requires adding a few casts to
accommodate legacy driver frameworks that store "unsigned long" I/O
addresses.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Currently the array of Zorro devices is allocated statically, wasting
up to 4.5 KiB when running an Amiga or multi-platform kernel on a machine
with no or a handful of Zorro expansion cards. Convert it to conditional
dynamic memory allocation to fix this.
amiga_parse_bootinfo() still needs to store some information about the
detected Zorro devices, at a time when even the bootmem allocator is not
yet available. This is now handled using a much smaller array (typically
less than 0.5 KiB), which is __initdata and thus freed later.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
As Sun 3 kernels cannot be multi-platform due to the different Sun 3 MMU
type, it made sense to statically allocate the table to track IOMMU use.
However, Sun 3x kernels can be multi-platform. Furthermore, Sun 3x uses
a larger table than Sun 3 (8192 bytes instead of 512 bytes).
Hence switch to dynamic allocation of this table using the bootmem
allocator to avoid wasting 8192 bytes when not running on a Sun 3x.
As this allocator returns zeroed memory, there's no need to explicitly
initialize the table to zeroes.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Pull irq cleanups from Ingo Molnar:
"This is a multi-arch cleanup series from Thomas Gleixner, which we
kept to near the end of the merge window, to not interfere with
architecture updates.
This series (motivated by the -rt kernel) unifies more aspects of IRQ
handling and generalizes PREEMPT_ACTIVE"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
preempt: Make PREEMPT_ACTIVE generic
sparc: Use preempt_schedule_irq
ia64: Use preempt_schedule_irq
m32r: Use preempt_schedule_irq
hardirq: Make hardirq bits generic
m68k: Simplify low level interrupt handling code
genirq: Prevent spurious detection for unconditionally polled interrupts
The low level interrupt entry code of m68k contains the following:
    add_preempt_count(HARDIRQ_OFFSET);
    do_IRQ();
        irq_enter();
            add_preempt_count(HARDIRQ_OFFSET);
        handle_interrupt();
        irq_exit();
            sub_preempt_count(HARDIRQ_OFFSET);
            if (in_interrupt())
                return; <---- On m68k always taken!
            if (local_softirq_pending())
                do_softirq();
    sub_preempt_count(HARDIRQ_OFFSET);
    if (in_hardirq())
        return;
    if (status_on_stack_has_interrupt_priority_mask > 0)
        return;
    if (local_softirq_pending())
        do_softirq();
    ret_from_exception:
        if (interrupted_context_is_kernel)
            return;
        ....
I tried to find a proper explanation for this, but the changelog is
sparse and there are no mails explaining it further. But obviously
this relates to the interrupt priority levels of the m68k and tries to
be extra clever with nested interrupts. Though this cleverness just
adds code bloat to the interrupt hotpath.
For the common case of non-nested interrupts, the code runs through two
extra conditionals before reaching the only important one, which checks
whether the return is to kernel or user space.
For the nested case the checks for in_hardirq() and the priority mask
value on stack catch only the case where the nested interrupt happens
inside the hard irq context of the first interrupt. If the nested
interrupt happens while the first interrupt handles soft interrupts,
then these extra checks buy nothing. The nested interrupt will fall
through to the final kernel/user space return check at
ret_from_exception.
Changing the code flow in the following way:
    do_IRQ();
        irq_enter();
            add_preempt_count(HARDIRQ_OFFSET);
        handle_interrupt();
        irq_exit();
            sub_preempt_count(HARDIRQ_OFFSET);
            if (in_interrupt())
                return;
            if (local_softirq_pending())
                do_softirq();
    ret_from_exception:
        if (interrupted_context_is_kernel)
            return;
makes the region protected by the hardirq count slightly smaller, and
the softirq handling is invoked with a minimally deeper stack. Otherwise
it is functionally equivalent, and it saves 104 bytes of text in
arch/m68k/kernel/entry.o.
This modification further allows us to get rid of the limitations
which m68k puts on the preempt_count layout, so we can make the
preempt count bits completely generic.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Schmitz <schmitz@biophys.uni-duesseldorf.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Linux/m68k <linux-m68k@vger.kernel.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1311112052360.30673@ionos.tec.linutronix.de
Pull scheduler changes from Ingo Molnar:
"The main changes in this cycle are:
- (much) improved CONFIG_NUMA_BALANCING support from Mel Gorman, Rik
van Riel, Peter Zijlstra et al. Yay!
- optimize preemption counter handling: merge the NEED_RESCHED flag
into the preempt_count variable, by Peter Zijlstra.
- wait.h fixes and code reorganization from Peter Zijlstra
- cfs_bandwidth fixes from Ben Segall
- SMP load-balancer cleanups from Peter Zijlstra
- idle balancer improvements from Jason Low
- other fixes and cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
ftrace, sched: Add TRACE_FLAG_PREEMPT_RESCHED
stop_machine: Fix race between stop_two_cpus() and stop_cpus()
sched: Remove unnecessary iteration over sched domains to update nr_busy_cpus
sched: Fix asymmetric scheduling for POWER7
sched: Move completion code from core.c to completion.c
sched: Move wait code from core.c to wait.c
sched: Move wait.c into kernel/sched/
sched/wait: Fix __wait_event_interruptible_lock_irq_timeout()
sched: Avoid throttle_cfs_rq() racing with period_timer stopping
sched: Guarantee new group-entities always have weight
sched: Fix hrtimer_cancel()/rq->lock deadlock
sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining
sched: Fix race on toggling cfs_bandwidth_used
sched: Remove extra put_online_cpus() inside sched_setaffinity()
sched/rt: Fix task_tick_rt() comment
sched/wait: Fix build breakage
sched/wait: Introduce prepare_to_wait_event()
sched/wait: Add ___wait_cond_timeout() to wait_event*_timeout() too
sched: Remove get_online_cpus() usage
sched: Fix race in migrate_swap_stop()
...