Expose the KVM guest CP0_Count frequency to userland via a new
KVM_REG_MIPS_COUNT_HZ register accessible with the KVM_{GET,SET}_ONE_REG
ioctls.
When the frequency is altered the bias is adjusted such that the guest
CP0_Count doesn't jump discontinuously or lose any timer interrupts.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expose two new virtual registers to userland via the
KVM_{GET,SET}_ONE_REG ioctls.
KVM_REG_MIPS_COUNT_CTL is for timer configuration fields and just
contains a master disable count bit. This can be used by userland to
freeze the timer in order to read a consistent state from the timer
count value and timer interrupt pending bit. This cannot be done with
the CP0_Cause.DC bit because the timer interrupt pending bit (TI) is
also in CP0_Cause so it would be impossible to stop the timer without
also risking a race with an hrtimer interrupt and having to explicitly
check whether an interrupt should have occurred.
When the timer is re-enabled it resumes without losing time, i.e. the
CP0_Count value jumps to what it would have been had the timer not been
disabled, which would also be impossible to do from userland with
CP0_Cause.DC. The timer interrupt also cannot be lost, i.e. if a timer
interrupt would have occurred had the timer not been disabled it is
queued when the timer is re-enabled.
This works by storing the nanosecond monotonic time when the master
disable is set, and using it for various operations instead of the
current monotonic time (e.g. when recalculating the bias when the
CP0_Count is set), until the master disable is cleared again, i.e. the
timer state is read/written as it would have been at that time. This
state is exposed to userland via the read-only KVM_REG_MIPS_COUNT_RESUME
virtual register so that userland can determine the exact time the
master disable took effect.
This should allow userland to atomically save the state of the timer,
and later restore it.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Previously the emulation of the CPU timer was just enough to get a Linux
guest running but some shortcuts were taken:
- The guest timer interrupt was hard coded to always happen every 10 ms
rather than being timed to when CP0_Count would match CP0_Compare.
- The guest's CP0_Count register was based on the host's CP0_Count
register. This isn't very portable and fails on cores without a
CP_Count register implemented such as Ingenic XBurst. It also meant
that the guest's CP0_Cause.DC bit to disable the CP0_Count register
took no effect.
- The guest's CP0_Count register was emulated by just dividing the
host's CP0_Count register by 4. This resulted in continuity problems
when used as a clock source, since when the host CP0_Count overflows
from 0x7fffffff to 0x80000000, the guest CP0_Count transitions
discontinuously from 0x1fffffff to 0xe0000000.
Therefore rewrite & fix emulation of the guest timer based on the
monotonic kernel time (i.e. ktime_get()). Internally a 32-bit count_bias
value is added to the frequency scaled nanosecond monotonic time to get
the guest's CP0_Count. The frequency of the timer is initialised to
100MHz and cannot yet be changed, but a later patch will allow the
frequency to be configured via the KVM_{GET,SET}_ONE_REG ioctl
interface.
The timer can now be stopped via the CP0_Cause.DC bit (by the guest or
via the KVM_SET_ONE_REG ioctl interface), at which point the current
CP0_Count is stored and can be read directly. When it is restarted the
bias is recalculated such that the CP0_Count value is continuous.
Due to the nature of hrtimer interrupts any read of the guest's
CP0_Count register while it is running triggers a check for whether the
hrtimer has expired, so that the guest/userland cannot observe the
CP0_Count passing CP0_Compare without queuing a timer interrupt. This is
also taken advantage of when stopping the timer to ensure that a pending
timer interrupt is queued.
This replaces the implementation of:
- Guest read of CP0_Count
- Guest write of CP0_Count
- Guest write of CP0_Compare
- Guest write of CP0_Cause
- Guest read of HWR 2 (CC) with RDHWR
- Host read of CP0_Count via KVM_GET_ONE_REG ioctl interface
- Host write of CP0_Count via KVM_SET_ONE_REG ioctl interface
- Host write of CP0_Compare via KVM_SET_ONE_REG ioctl interface
- Host write of CP0_Cause via KVM_SET_ONE_REG ioctl interface
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The hrtimer callback for guest timer timeouts sets the guest's
CP0_Cause.TI bit to indicate to the guest that a timer interrupt is
pending, however there is no mutual exclusion implemented to prevent
this occurring while the guest's CP0_Cause register is being
read-modify-written elsewhere.
When this occurs the setting of the CP0_Cause.TI bit is undone and the
guest misses the timer interrupt and doesn't reprogram the CP0_Compare
register for the next timeout. Currently another timer interrupt will be
triggered again in another 10ms anyway due to the way timers are
emulated, but after the MIPS timer emulation is fixed this would result
in Linux guest time standing still and the guest scheduler not being
invoked until the guest CP0_Count has looped around again, which at
100MHz takes just under 43 seconds.
Currently this is the only asynchronous modification of guest registers,
therefore it is fixed by adjusting the implementations of the
kvm_set_c0_guest_cause(), kvm_clear_c0_guest_cause(), and
kvm_change_c0_guest_cause() macros which are used for modifying the
guest CP0_Cause register to use ll/sc to ensure atomic modification.
This should work in both UP and SMP cases without requiring interrupts
to be disabled.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Implement KVM_{GET,SET}_ONE_REG ioctl based access to the guest CP0
UserLocal register. This is so that userland can save and restore its
value.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Implement KVM_{GET,SET}_ONE_REG ioctl based access to the guest CP0
Count and Compare registers. These registers are special in that writing
to them has side effects (adjusting the time until the next timer
interrupt) and reading of Count depends on the time. Therefore add a
couple of callbacks so that different implementations (trap & emulate or
VZ) can implement them differently depending on what the hardware
provides.
The trap & emulate versions mostly duplicate what happens when a T&E
guest reads or writes these registers, so it inherits the same
limitations which can be fixed in later patches.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the KVM_{GET,SET}_ONE_REG MIPS register id definitions out of
kvm_mips.c to kvm_host.h so that they can be shared between multiple
source files. This allows register access to be indirected depending on
the underlying implementation (trap & emulate or VZ).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MIPS KVM uses mips32_SyncICache to synchronise the icache with the
dcache after dynamically modifying guest instructions or writing guest
exception vector. However this uses rdhwr to get the SYNCI step, which
causes a reserved instruction exception on Ingenic XBurst cores.
It would seem to make more sense to use local_flush_icache_range()
instead which does the same thing but is more portable.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull audit updates from Eric Paris.
* git://git.infradead.org/users/eparis/audit: (28 commits)
AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC
audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range
audit: do not cast audit_rule_data pointers pointlesly
AUDIT: Allow login in non-init namespaces
audit: define audit_is_compat in kernel internal header
kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c
sched: declare pid_alive as inline
audit: use uapi/linux/audit.h for AUDIT_ARCH declarations
syscall_get_arch: remove useless function arguments
audit: remove stray newline from audit_log_execve_info() audit_panic() call
audit: remove stray newlines from audit_log_lost messages
audit: include subject in login records
audit: remove superfluous new- prefix in AUDIT_LOGIN messages
audit: allow user processes to log from another PID namespace
audit: anchor all pid references in the initial pid namespace
audit: convert PPIDs to the inital PID namespace.
pid: get pid_t ppid of task in init_pid_ns
audit: rename the misleading audit_get_context() to audit_take_context()
audit: Add generic compat syscall support
audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL
...
Pull kvm updates from Paolo Bonzini:
"PPC and ARM do not have much going on this time. Most of the cool
stuff, instead, is in s390 and (after a few releases) x86.
ARM has some caching fixes and PPC has transactional memory support in
guests. MIPS has some fixes, with more probably coming in 3.16 as
QEMU will soon get support for MIPS KVM.
For x86 there are optimizations for debug registers, which trigger on
some Windows games, and other important fixes for Windows guests. We
now expose to the guest Broadwell instruction set extensions and also
Intel MPX. There's also a fix/workaround for OS X guests, nested
virtualization features (preemption timer), and a couple kvmclock
refinements.
For s390, the main news is asynchronous page faults, together with
improvements to IRQs (floating irqs and adapter irqs) that speed up
virtio devices"
* tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (96 commits)
KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8
KVM: PPC: Book3S HV: Fix decrementer timeouts with non-zero TB offset
KVM: PPC: Book3S HV: Don't use kvm_memslots() in real mode
KVM: PPC: Book3S HV: Return ENODEV error rather than EIO
KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS code
KVM: PPC: Book3S HV: Add get/set_one_reg for new TM state
KVM: PPC: Book3S HV: Add transactional memory support
KVM: Specify byte order for KVM_EXIT_MMIO
KVM: vmx: fix MPX detection
KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=n
KVM: PPC: Book3S: Introduce hypervisor call H_GET_TCE
KVM: PPC: Book3S HV: Fix incorrect userspace exit on ioeventfd write
KVM: s390: clear local interrupts at cpu initial reset
KVM: s390: Fix possible memory leak in SIGP functions
KVM: s390: fix calculation of idle_mask array size
KVM: s390: randomize sca address
KVM: ioapic: reinject pending interrupts on KVM_SET_IRQCHIP
KVM: Bump KVM_MAX_IRQ_ROUTES for s390
KVM: s390: irq routing for adapter interrupts.
KVM: s390: adapter interrupt sources
...
Pull MIPS updates from Ralf Baechle:
- Support for Imgtec's Aptiv family of MIPS cores.
- Improved detection of BCM47xx configurations.
- Fix hiberation for certain configurations.
- Add support for the Chinese Loongson 3 CPU, a MIPS64 R2 core and
systems.
- Detection and support for the MIPS P5600 core.
- A few more random fixes that didn't make 3.14.
- Support for the EVA Extended Virtual Addressing
- Switch Alchemy to the platform PATA driver
- Complete unification of Alchemy support
- Allow availability of I/O cache coherency to be runtime detected
- Improvments to multiprocessing support for Imgtec platforms
- A few microoptimizations
- Cleanups of FPU support
- Paul Gortmaker's fixes for the init stuff
- Support for seccomp
* 'mips-for-linux-next' of git://git.linux-mips.org/pub/scm/ralf/upstream-sfr: (165 commits)
MIPS: CPC: Use __raw_ memory access functions
MIPS: CM: use __raw_ memory access functions
MIPS: Fix warning when including smp-ops.h with CONFIG_SMP=n
MIPS: Malta: GIC IPIs may be used without MT
MIPS: smp-mt: Use common GIC IPI implementation
MIPS: smp-cmp: Remove incorrect core number probe
MIPS: Fix gigaton of warning building with microMIPS.
MIPS: Fix core number detection for MT cores
MIPS: MT: core_nvpes function to retrieve VPE count
MIPS: Provide empty mips_mt_set_cpuoptions when CONFIG_MIPS_MT=n
MIPS: Lasat: Replace del_timer by del_timer_sync
MIPS: Malta: Setup PM I/O region on boot
MIPS: Loongson: Add a Loongson-3 default config file
MIPS: Loongson 3: Add CPU hotplug support
MIPS: Loongson 3: Add Loongson-3 SMP support
MIPS: Loongson: Add Loongson-3 Kconfig options
MIPS: Loongson: Add swiotlb to support All-Memory DMA
MIPS: Loongson 3: Add serial port support
MIPS: Loongson 3: Add IRQ init and dispatch support
MIPS: Loongson 3: Add HT-linked PCI support
...
Pull s390 compat wrapper rework from Heiko Carstens:
"S390 compat system call wrapper simplification work.
The intention of this work is to get rid of all hand written assembly
compat system call wrappers on s390, which perform proper sign or zero
extension, or pointer conversion of compat system call parameters.
Instead all of this should be done with C code eg by using Al's
COMPAT_SYSCALL_DEFINEx() macro.
Therefore all common code and s390 specific compat system calls have
been converted to the COMPAT_SYSCALL_DEFINEx() macro.
In order to generate correct code all compat system calls may only
have eg compat_ulong_t parameters, but no unsigned long parameters.
Those patches which change parameter types from unsigned long to
compat_ulong_t parameters are separate in this series, but shouldn't
cause any harm.
The only compat system calls which intentionally have 64 bit
parameters (preadv64 and pwritev64) in support of the x86/32 ABI
haven't been changed, but are now only available if an architecture
defines __ARCH_WANT_COMPAT_SYS_PREADV64/PWRITEV64.
System calls which do not have a compat variant but still need proper
zero extension on s390, like eg "long sys_brk(unsigned long brk)" will
get a proper wrapper function with the new s390 specific
COMPAT_SYSCALL_WRAPx() macro:
COMPAT_SYSCALL_WRAP1(brk, unsigned long, brk);
which generates the following code (simplified):
asmlinkage long sys_brk(unsigned long brk);
asmlinkage long compat_sys_brk(long brk)
{
return sys_brk((u32)brk);
}
Given that the C file which contains all the COMPAT_SYSCALL_WRAP lines
includes both linux/syscall.h and linux/compat.h, it will generate
build errors, if the declaration of sys_brk() doesn't match, or if
there exists a non-matching compat_sys_brk() declaration.
In addition this will intentionally result in a link error if
somewhere else a compat_sys_brk() function exists, which probably
should have been used instead. Two more BUILD_BUG_ONs make sure the
size and type of each compat syscall parameter can be handled
correctly with the s390 specific macros.
I converted the compat system calls step by step to verify the
generated code is correct and matches the previous code. In fact it
did not always match, however that was always a bug in the hand
written asm code.
In result we get less code, less bugs, and much more sanity checking"
* 'compat' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (44 commits)
s390/compat: add copyright statement
compat: include linux/unistd.h within linux/compat.h
s390/compat: get rid of compat wrapper assembly code
s390/compat: build error for large compat syscall args
mm/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
kexec/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
net/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
ipc/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
fs/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types
ipc/compat: convert to COMPAT_SYSCALL_DEFINE
fs/compat: convert to COMPAT_SYSCALL_DEFINE
security/compat: convert to COMPAT_SYSCALL_DEFINE
mm/compat: convert to COMPAT_SYSCALL_DEFINE
net/compat: convert to COMPAT_SYSCALL_DEFINE
kernel/compat: convert to COMPAT_SYSCALL_DEFINE
fs/compat: optional preadv64/pwrite64 compat system calls
ipc/compat_sys_msgrcv: change msgtyp type from long to compat_long_t
s390/compat: partial parameter conversion within syscall wrappers
s390/compat: automatic zero, sign and pointer conversion of syscalls
s390/compat: add sync_file_range and fallocate compat syscalls
...
Pull scheduler changes from Ingo Molnar:
"Bigger changes:
- sched/idle restructuring: they are WIP preparation for deeper
integration between the scheduler and idle state selection, by
Nicolas Pitre.
- add NUMA scheduling pseudo-interleaving, by Rik van Riel.
- optimize cgroup context switches, by Peter Zijlstra.
- RT scheduling enhancements, by Thomas Gleixner.
The rest is smaller changes, non-urgnt fixes and cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (68 commits)
sched: Clean up the task_hot() function
sched: Remove double calculation in fix_small_imbalance()
sched: Fix broken setscheduler()
sparc64, sched: Remove unused sparc64_multi_core
sched: Remove unused mc_capable() and smt_capable()
sched/numa: Move task_numa_free() to __put_task_struct()
sched/fair: Fix endless loop in idle_balance()
sched/core: Fix endless loop in pick_next_task()
sched/fair: Push down check for high priority class task into idle_balance()
sched/rt: Fix picking RT and DL tasks from empty queue
trace: Replace hardcoding of 19 with MAX_NICE
sched: Guarantee task priority in pick_next_task()
sched/idle: Remove stale old file
sched: Put rq's sched_avg under CONFIG_FAIR_GROUP_SCHED
cpuidle/arm64: Remove redundant cpuidle_idle_call()
cpuidle/powernv: Remove redundant cpuidle_idle_call()
sched, nohz: Exclude isolated cores from load balancing
sched: Fix select_task_rq_fair() description comments
workqueue: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE
sys: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE
...
Pull core locking updates from Ingo Molnar:
"The biggest change is the MCS spinlock generalization changes from Tim
Chen, Peter Zijlstra, Jason Low et al. There's also lockdep
fixes/enhancements from Oleg Nesterov, in particular a false negative
fix related to lockdep_set_novalidate_class() usage"
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
locking/mutex: Fix debug checks
locking/mutexes: Add extra reschedule point
locking/mutexes: Introduce cancelable MCS lock for adaptive spinning
locking/mutexes: Unlock the mutex without the wait_lock
locking/mutexes: Modify the way optimistic spinners are queued
locking/mutexes: Return false if task need_resched() in mutex_can_spin_on_owner()
locking: Move mcs_spinlock.h into kernel/locking/
m68k: Skip futex_atomic_cmpxchg_inatomic() test
futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test
Revert "sched/wait: Suppress Sparse 'variable shadowing' warning"
lockdep: Change lockdep_set_novalidate_class() to use _and_name
lockdep: Change mark_held_locks() to check hlock->check instead of lockdep_no_validate
lockdep: Don't create the wrong dependency on hlock->check == 0
lockdep: Make held_lock->check and "int check" argument bool
locking/mcs: Allow architecture specific asm files to be used for contended case
locking/mcs: Order the header files in Kbuild of each architecture in alphabetical order
sched/wait: Suppress Sparse 'variable shadowing' warning
hung_task/Documentation: Fix hung_task_warnings description
locking/mcs: Allow architectures to hook in to contended paths
locking/mcs: Micro-optimize the MCS code, add extra comments
...
The CPC registers use native endianness, so using plain readl & writel
will produce incorrect results on big endian systems.
Reported-by: Jeffrey Deans <jeffrey.deans@imgtec.com>
Reported-by: Keng Koh <keng.koh@imgtec.com>
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6657/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The CM registers use native endianness, so using plain readl & writel
will produce incorrect results on big endian systems.
Reported-by: Jeffrey Deans <jeffrey.deans@imgtec.com>
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6656/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The gic_send_ipi_mask function declared in smp-ops.h takes a struct
cpumask argument, but linux/cpumask.h is only included within an #ifdef
CONFIG_SMP. Move the gic_ function declarations within that #ifdef too
to fix warnings during build such as:
In file included from arch/mips/fw/arc/init.c:15:0:
/mnt/buildbot/kernel/mips/slave/mips-linux__allno_/build/arch/mips/include/asm/smp-ops.h:62:44:
warning: 'struct cpumask' declared inside parameter list [enabled by
default]
extern void gic_send_ipi_mask(const struct cpumask *mask, unsigned int
action);
Reported-by: Markos Chandras <markos.chandras@imgtec.com>
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6655/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
With binutils 2.24 the attempt to switch with microMIPS mode to MIPS III
mode through .set mips3 results in *lots* of warnings like
{standard input}: Assembler messages:
{standard input}:397: Warning: the 64-bit MIPS architecture does not support the `smartmips' extension
during a kernel build. Fixed by using .set arch=r4000 instead.
This breaks support for building the kernel with binutils 2.13 which
was supported for 32 bit kernels only anyway and 2.14 which was a bad
vintage for MIPS anyway.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This function simply returns the number of VPEs present in the current
core, or 1 if the core does not implement the MT ASE. In SMP kernels
this will typically equal smp_num_siblings, however it will also be
usable in UP kernels and helps prepare for the possibility of a
heterogenous system where the VPE count is not the same across all
cores.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6665/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Both the CONFIG_MIPS_CPS & CONFIG_MIPS_CMP SMP implementations call
mips_mt_set_cpuoptions when preparing to start secondary CPUs. However
both may be used without MT. Provide an empty inline function to prevent
a link error in this case.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6647/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch ensures that the kernel sets a sane base address for the
PIIX4 PM I/O register region during boot. Without this the kernel may
not successfully claim the region as a resource if the bootloader didn't
configure the region. With this patch the kernel will always succeed
with:
pci 0000:00:0a.3: quirk: [io 0x1000-0x103f] claimed by PIIX4 ACPI
The lack of the resource claiming is easily reproducible without this
patch using current versions of QEMU.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Tested-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6641/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Tips of Loongson's CPU hotplug:
1, To fully shutdown a core in Loongson 3, the target core should go to
CKSEG1 and flush all L1 cache entries at first. Then, another core
(usually Core 0) can safely disable the clock of the target core. So
play_dead() call loongson3_play_dead() via CKSEG1 (both uncached and
unmmaped).
2, The default clocksource of Loongson is MIPS. Since clock source is a
global device, timekeeping need the CP0' Count registers of each core
be synchronous. Thus, when a core is up, we use a SMP_ASK_C0COUNT IPI
to ask Core-0's Count.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Hongliang Tao <taohl@lemote.com>
Signed-off-by: Hua Yan <yanh@lemote.com>
Tested-by: Alex Smith <alex.smith@imgtec.com>
Reviewed-by: Alex Smith <alex.smith@imgtec.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6639
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
IPI registers of Loongson-3 include IPI_SET, IPI_CLEAR, IPI_STATUS,
IPI_EN and IPI_MAILBOX_BUF. Each bit of IPI_STATUS indicate a type of
IPI and IPI_EN indicate whether the IPI is enabled. The sender write 1
to IPI_SET bits generate IPIs in IPI_STATUS, and receiver write 1 to
bits of IPI_CLEAR to clear IPIs. IPI_MAILBOX_BUF are used to deliver
more information about IPIs.
Why we change code in arch/mips/loongson/common/setup.c?
If without this change, when SMP configured, system cannot boot since
it hang at printk() in cgroup_init_early(). The root cause is:
console_trylock()
\-->down_trylock(&console_sem)
\-->raw_spin_unlock_irqrestore(&sem->lock, flags)
\-->_raw_spin_unlock_irqrestore()(SMP/UP have different versions)
\-->__raw_spin_unlock_irqrestore() (following is the SMP case)
\-->do_raw_spin_unlock()
\-->arch_spin_unlock()
\-->nudge_writes()
\-->mb()
\-->wbflush()
\-->__wbflush()
In previous code __wbflush() is initialized in plat_mem_setup(), but
cgroup_init_early() is called before plat_mem_setup(). Therefore, In
this patch we make changes to avoid boot failure.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Hongliang Tao <taohl@lemote.com>
Signed-off-by: Hua Yan <yanh@lemote.com>
Tested-by: Alex Smith <alex.smith@imgtec.com>
Reviewed-by: Alex Smith <alex.smith@imgtec.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6638
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Loongson doesn't support DMA address above 4GB traditionally. If memory
is more than 4GB, CONFIG_SWIOTLB and ZONE_DMA32 should be selected. In
this way, DMA pages are allocated below 4GB preferably. However, if low
memory is not enough, high pages are allocated and swiotlb is used for
bouncing.
Moreover, we provide a platform-specific dma_map_ops::set_dma_mask() to
set a device's dma_mask and coherent_dma_mask. We use these masks to
distinguishes an allocated page can be used for DMA directly, or need
swiotlb to bounce.
Recently, we found that 32-bit DMA isn't a hardware bug, but a hardware
configuration issue. So, latest firmware has enable the DMA support as
high as 40-bit. To support all-memory DMA for all devices (besides the
Loongson platform limit, there are still some devices have their own
DMA32 limit), and also to be compatible with old firmware, we keep use
swiotlb.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Hongliang Tao <taohl@lemote.com>
Signed-off-by: Hua Yan <yanh@lemote.com>
Tested-by: Alex Smith <alex.smith@imgtec.com>
Reviewed-by: Alex Smith <alex.smith@imgtec.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6636
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
IRQ routing path of Loongson-3:
Devices(most) --> I8259 --> HT Controller --> IRQ Routing Table --> CPU
^
|
Device(legacy devices such as UART) --> Bonito ---|
IRQ Routing Table route 32 INTs to CPU's INT0~INT3(IP2~IP5 of CP0), 32
INTs include 16 HT INTs(mostly), 4 PCI INTs, 1 LPC INT, etc. IP6 is used
for IPI and IP7 is used for internal MIPS timer. LOONGSON_INT_ROUTER_*
are IRQ Routing Table registers.
I8259 IRQs are 1:1 mapped to HT1 INTs. LOONGSON_HT1_* are configuration
registers of HT1 controller.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Hongliang Tao <taohl@lemote.com>
Signed-off-by: Hua Yan <yanh@lemote.com>
Tested-by: Alex Smith <alex.smith@imgtec.com>
Reviewed-by: Alex Smith <alex.smith@imgtec.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6634
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Loongson family machines use Hyper-Transport bus for inter-core
connection and device connection. The PCI bus is a subordinate
linked at HT1.
With LEFI firmware interface, We don't need fixup for PCI irq routing
(except providing a VBIOS of the integrated GPU).
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Hongliang Tao <taohl@lemote.com>
Signed-off-by: Hua Yan <yanh@lemote.com>
Tested-by: Alex Smith <alex.smith@imgtec.com>
Reviewed-by: Alex Smith <alex.smith@imgtec.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6633
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The new UEFI-like firmware interface (LEFI, i.e. Loongson Unified
Firmware Interface) has 3 advantages:
1, Firmware export a physical memory map which is similar to X86's
E820 map, so prom_init_memory() will be more elegant that #ifdef
clauses can be removed.
2, Firmware export a pci irq routing table, we no longer need pci
irq routing fixup in kernel's code.
3, Firmware has a built-in vga bios, and its address is exported,
the linux kernel no longer need an embedded blob.
With the LEFI interface, Loongson-3A/2G and all their successors can use
a unified kernel. All Loongson-based machines support this new interface
except 2E/2F series.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Hongliang Tao <taohl@lemote.com>
Signed-off-by: Hua Yan <yanh@lemote.com>
Tested-by: Alex Smith <alex.smith@imgtec.com>
Reviewed-by: Alex Smith <alex.smith@imgtec.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6632
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add four Loongson-3 based machine types:
MACH_LEMOTE_A1004/MACH_LEMOTE_A1201 are laptops;
MACH_LEMOTE_A1101 is mini-itx;
MACH_LEMOTE_A1205 is all-in-one machine.
The most significant differrent between A1004/A1201 and A1101/A1205 is
the laptops have EC but others don't.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Hongliang Tao <taohl@lemote.com>
Signed-off-by: Hua Yan <yanh@lemote.com>
Tested-by: Alex Smith <alex.smith@imgtec.com>
Reviewed-by: Alex Smith <alex.smith@imgtec.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6631
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Basic Loongson-3 CPU support include CPU probing and TLB/cache
initializing.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Hongliang Tao <taohl@lemote.com>
Signed-off-by: Hua Yan <yanh@lemote.com>
Tested-by: Alex Smith <alex.smith@imgtec.com>
Reviewed-by: Alex Smith <alex.smith@imgtec.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6630
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Loongson-3 is a multi-core MIPS family CPU, it support MIPS64R2 fully.
Loongson-3 has the same IMP field (0x6300) as Loongson-2.
Loongson-3 has a hardware-maintained cache, system software doesn't
need to maintain coherency.
Loongson-3A is the first revision of Loongson-3, and it is the quad-
core version of Loongson-2G. Loongson-3A has a simplified version named
Loongson-2Gq, the main difference between Loongson-3A/2Gq is 3A has two
HyperTransport controller but 2Gq has only one. HT0 is used for cross-
chip interconnection and HT1 is used to link PCI bus. Therefore, 2Gq
cannot support NUMA but 3A can. For software, Loongson-2Gq is simply
identified as Loongson-3A.
Exsisting Loongson family CPUs:
Loongson-1: Loongson-1A, Loongson-1B, they are 32-bit MIPS CPUs.
Loongson-2: Loongson-2E, Loongson-2F, Loongson-2G, they are 64-bit
single-core MIPS CPUs.
Loongson-3: Loongson-3A(including so-called Loongson-2Gq), they are
64-bit multi-core MIPS CPUs.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Hongliang Tao <taohl@lemote.com>
Signed-off-by: Hua Yan <yanh@lemote.com>
Tested-by: Alex Smith <alex.smith@imgtec.com>
Reviewed-by: Alex Smith <alex.smith@imgtec.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6629/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
And there are more CPUs or configurations that want to provide special
per-CPU information in /proc/cpuinfo. So I think there needs to be a
hook mechanism, such as a notifier.
This is a first cut only; I need to think about what sort of looking
the notifier needs to have. But I'd appreciate testing on MT hardware!
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6066/
Loongson-1 is a 32-bit MIPS CPU and Loongson-2/3 are 64-bit MIPS CPUs,
and both Loongson-2/3 has the same PRID IMP filed (0x6300). As a
result, renaming PRID_IMP_LOONGSON1 and PRID_IMP_LOONGSON2 to
PRID_IMP_LOONGSON_32 and PRID_IMP_LOONGSON_64 will make more sense.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Tested-by: Alex Smith <alex.smith@imgtec.com>
Reviewed-by: Alex Smith <alex.smith@imgtec.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: linux-mips@linux-mips.org
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/6552/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The UART register names are identical to the ones in uapi/linux/serial_reg.h,
which causes build failures in various drivers when they indirectly pull in
the au1000.h header, for example via gpio.h:
In file included from arch/mips/include/asm/mach-au1x00/gpio.h:13:0,
from arch/mips/include/asm/gpio.h:4,
from include/linux/gpio.h:48,
from include/linux/ssb/ssb.h:9,
from drivers/ssb/driver_mipscore.c:11:
arch/mips/include/asm/mach-au1x00/au1000.h:1171:0: note: this is the location of the previous definition
#define UART_LSR 0x1C /* Line Status Register */
Get rid of the altogether, nothing in the core Alchemy code depends
on them any more.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
Cc: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/6664/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add a few Belkin F7Dxxxx entries, with F7D4401 sourced from online
documentation and the "F7D7302" being observed. F7D3301, F7D3302, and
F7D4302 are reasonable guesses which are unlikely to cause
mis-detection.
Signed-off-by: Cody P Schafer <devel@codyps.com>
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: linux-mips@linux-mips.org
Cc: zajec5@gmail.com
Cc: Cody P Schafer <devel@codyps.com>
Patchwork: https://patchwork.linux-mips.org/patch/6594/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This adds board detection for the Siemens SE505v2 and the led gpio
configuration. This board does not have any buttons.
This is based on OpenWrt broadcom-diag and Manuel Munz's nvram dump.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: linux-mips@linux-mips.org
Cc: zajec5@gmail.com
Patchwork: https://patchwork.linux-mips.org/patch/6593/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The Linksys WRT54G/GS/GL family uses the same boardtype numbers, and
the same gpio configuration. The boardtype numbers are changing with
the hardware versions, but these hardware numbers are different or each
model.
Detect them all as one device, this also worked in OpenWrt.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: linux-mips@linux-mips.org
Cc: zajec5@gmail.com
Patchwork: https://patchwork.linux-mips.org/patch/6591/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The M5150 core is a 32-bit MIPS RISC which implements the
MIPS Architecture Release-5 in a 5-stage pipeline.
In addition, it includes the MIPS Architecture Virtualization Module
that enables virtualization of operating systems,
which provides a scalable, trusted, and secure execution environment.
Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6596/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Merge the db1200.h and db1300.h headers into their only users.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
Cc: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/6660/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Setting DMA_MAYBE_COHERENT gives a platform the opportunity to select
use of cache ops at boot.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
Cc: Linux-MIPS <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/6575/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Allow secondary cores to program their segment control registers
during smp bootstrap code. This enables EVA on Malta SMP
configurations
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
The 'ememsize' variable is used to denote the real RAM which is
present on the Malta board. This is different compared to 'memsize'
which is capped to 256MB. The 'ememsize' is used to get the actual
physical memory when setting up the Malta memory layout. This only
makes sense in case the core operates in the EVA mode, and it's
ignored otherwise.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Add a spaces.h file for Malta to override certain memory macros
when operating in EVA mode.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
The Malta board aliases 0x80000000 - 0xffffffff to 0x00000000
- 0x7fffffff ignoring the 256 MB IO hole in 0x10000000.
The physical memory is shifted to 0x80000000 so up to 2GB
can be used. Kuseg is expanded to 3GB (due to board limitations
only 2GB can be accessed) and lowmem (kernel space) is expanded to 2GB.
The Segment Control registers are programmed as follows:
Virtual memory Physical memory Mapping
0x00000000 - 0x7fffffff 0x80000000 - 0xfffffffff MUSUK (kuseg)
0x80000000 - 0x9fffffff 0x00000000 - 0x1ffffffff MUSUK (kseg0)
0xa0000000 - 0xbf000000 0x00000000 - 0x1ffffffff MUSUK (kseg1)
0xc0000000 - 0xdfffffff - MK (kseg2)
0xe0000000 - 0xffffffff - MK (kseg3)
The location of exception vectors remain the same since 0xbfc00000
(traditional exception base) still maps to 0x1fc00000 physical.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
A core in EVA mode can have any possible segment mapping, so the
default free_initmem_default() function may not always work as expected.
Therefore, add a callback that platforms can use to free up the init section.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
The MIPS *Aptiv family uses bit 28 in Config5 CP0 register to
indicate whether the core supports EVA or not.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
This will allow platforms to use an alternative way to get
the physical address of a symbol.
Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Add EVA cache flushing functions similar to non-EVA configurations.
Because the cache may or may not contain user virtual addresses, we
need to use the 'cache' or 'cachee' instruction based on whether we
flush the cache on behalf of kernel or user respectively.
Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>