The hrtimer callback for guest timer timeouts sets the guest's
CP0_Cause.TI bit to indicate to the guest that a timer interrupt is
pending, however there is no mutual exclusion implemented to prevent
this occurring while the guest's CP0_Cause register is being
read-modify-written elsewhere.
When this occurs the setting of the CP0_Cause.TI bit is undone and the
guest misses the timer interrupt and doesn't reprogram the CP0_Compare
register for the next timeout. Currently another timer interrupt will be
triggered again in another 10ms anyway due to the way timers are
emulated, but after the MIPS timer emulation is fixed this would result
in Linux guest time standing still and the guest scheduler not being
invoked until the guest CP0_Count has looped around again, which at
100MHz takes just under 43 seconds.
Currently this is the only asynchronous modification of guest registers,
therefore it is fixed by adjusting the implementations of the
kvm_set_c0_guest_cause(), kvm_clear_c0_guest_cause(), and
kvm_change_c0_guest_cause() macros which are used for modifying the
guest CP0_Cause register to use ll/sc to ensure atomic modification.
This should work in both UP and SMP cases without requiring interrupts
to be disabled.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Implement KVM_{GET,SET}_ONE_REG ioctl based access to the guest CP0
Count and Compare registers. These registers are special in that writing
to them has side effects (adjusting the time until the next timer
interrupt) and reading of Count depends on the time. Therefore add a
couple of callbacks so that different implementations (trap & emulate or
VZ) can implement them differently depending on what the hardware
provides.
The trap & emulate versions mostly duplicate what happens when a T&E
guest reads or writes these registers, so it inherits the same
limitations which can be fixed in later patches.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When MIPS KVM needs to write a TLB entry for the guest it reads the
CP0_Random register, uses it to generate the CP_Index, and writes the
TLB entry using the TLBWI instruction (tlb_write_indexed()).
However there's an instruction for that, TLBWR (tlb_write_random()) so
use that instead.
This happens to also fix an issue with Ingenic XBurst cores where the
same TLB entry is replaced each time preventing forward progress on
stores due to alternating between TLB load misses for the instruction
fetch and TLB store misses.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MIPS KVM uses mips32_SyncICache to synchronise the icache with the
dcache after dynamically modifying guest instructions or writing guest
exception vector. However this uses rdhwr to get the SYNCI step, which
causes a reserved instruction exception on Ingenic XBurst cores.
It would seem to make more sense to use local_flush_icache_range()
instead which does the same thing but is more portable.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Each MIPS KVM guest has its own copy of the KVM exception vector. This
contains the TLB refill exception handler at offset 0x000, the general
exception handler at offset 0x180, and interrupt exception handlers at
offset 0x200 in case Cause_IV=1. A common handler is copied to offset
0x2000 and offset 0x3000 is used for temporarily storing k1 during entry
from guest.
However the amount of memory allocated for this purpose is calculated as
0x200 rounded up to the next page boundary, which is insufficient if 4KB
pages are in use. This can lead to the common handler at offset 0x2000
being overwritten and infinitely recursive exceptions on the next exit
from the guest.
Increase the minimum size from 0x200 to 0x4000 to cover the full use of
the page.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sometimes it's useful to let the user, while doing performance research,
know what in the IEEE754 exceptions has caused many times of FP emulation
when running a specific application. This patch adds 5 more files to
/sys/kernel/debug/mips/fpuemustats/, whose filenames begin with "ieee754".
These stats are in addition to the existing cp1ops, cp1xops, errors, loads
and stores, which may not be useful in understanding the reasons of ieee754
exceptions.
[ralf@linux-mips.org: Fixed reject due to other changes to the kernel
FP assist software.]
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Steven.Hill@imgtec.com
Cc: james.hogan@imgtec.com
Patchwork: http://patchwork.linux-mips.org/patch/7044/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add interface mode detection for Octeon II. This is necessary to detect
the interface modes correctly on the UBNT E200 board. Code is taken
from the UBNT GPL source release, with some alterations: SRIO, ILK and
RXAUI interface modes are removed and instead return disabled as these
modes are not currently supported.
Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Tested-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7039/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The CONFIG_MIPS_CPS SMP implementation should be able to handle all
cases the CONFIG_MIPS_CMP implementation does, but without requiring
bootloader assistance. It is also required in order to make use of
features such as hotplug & cpuidle core power gating. Enable it by
default for Malta configs that previously enabled the now deprecated
CONFIG_MIPS_CMP, and disable the latter. The local version suffix "cmp"
is removed rather than replaced with "cps" since there are other ways to
tell that the CPS SMP implementation is in use (the "VPE topology" line
in the boot log being one).
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
This patch adds a cpuidle driver for systems based around the MIPS
Coherent Processing System (CPS) architecture. It supports four idle
states:
- The standard MIPS wait instruction.
- The non-coherent wait, clock gated & power gated states exposed by
the recently added pm-cps layer.
The pm-cps layer is used to enter all the deep idle states. Since cores
in the clock or power gated states cannot service interrupts, the
gic_send_ipi_single function is modified to send a power up command for
the appropriate core to the CPC in cases where the target CPU has marked
itself potentially incoherent.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
This patch simply includes the cpuidle Kconfig entries in preparation
for cpuidle drivers used on MIPS systems.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Defines a macro intended to allow trivial use of the regular MIPS wait
instruction from cpuidle drivers, which may simply invoke the macro
within their array of states.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Rather than hardcoding CCA=0x5 for secondary cores, re-use the CCA from
the boot CPU. This allows overrides of the CCA using the cca= kernel
parameter to take effect on all CPUs for consistency.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
This patch sets a default CCA suited for use with multi-core SMP on all
current MIPS CPS based systems. It may still be overriden by the cca=
argument on the kernel command line.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
If the user or bootloader sets the CCA to a value which is not suited
for multi-core SMP (ie. anything non-coherent) then limit the system to
using only a single core and warn the user.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
This patch adds support for offlining CPUs via hotplug when using the
CONFIG_MIPS_CPS SMP implementation. When a CPU is offlined one of 2
things will happen:
- If the CPU is part of a core which implements the MT ASE and there
is at least one other VPE online within that core then the VPE will
be halted by settings its TCHalt bit.
- Otherwise if supported the core will be powered down via the CPC.
- Otherwise the CPU will hang by executing an infinite loop.
Bringing CPUs back online is then a process of either clearing the
appropriate VPEs TCHalt bit or powering up the appropriate core via the
CPC. Throughout the process the struct core_boot_config vpe_mask field
must be maintained such that mips_cps_boot_vpes will start & stop the
correct VPEs.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
This patch adds code to generate entry & exit code for various low power
states available on systems based around the MIPS Coherent Processing
System architecture (ie. those with a Coherence Manager, Global
Interrupt Controller & for >=CM2 a Cluster Power Controller). States
supported are:
- Non-coherent wait. This state first leaves the coherent domain and
then executes a regular MIPS wait instruction. Power savings are
found from the elimination of coherency interventions between the
core and any other coherent requestors in the system.
- Clock gated. This state leaves the coherent domain and then gates
the clock input to the core. This removes all dynamic power from the
core but leaves the core at the mercy of another to restart its
clock. Register state is preserved, but the core can not service
interrupts whilst its clock is gated.
- Power gated. This deepest state removes all power input to the core.
All register state is lost and the core will restart execution from
its BEV when another core powers it back up. Because register state
is lost this state requires cooperation with the CONFIG_MIPS_CPS SMP
implementation in order for the core to exit the state successfully.
The code will detect which states are available on the current system
during boot & generate the entry/exit code for those states. This will
be used by cpuidle & hotplug implementations.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
The core which the CPC core-other region relates to is based upon the
core-local core-other addressing register. As its name suggests this
register is shared between all VPEs within a core, and if there is a
possibility that multiple VPEs within a core will attempt to access
another core simultaneously then locking is required. This wasn't
previously a problem with the only user being cpu0 during boot, but will
be an issue once hotplug is implemented & may race with other users such
as cpuidle.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
The start of mips_cps_core_entry is patched in order to provide the code
with the address of the CM register region at a point where it will be
running non-coherent with the rest of the system. However the cache
wasn't being flushed after that patching which could in principle lead
to secondary cores using an invalid CM base address.
The patching is moved to cps_prepare_cpus since local_flush_icache_range
has not been initialised at the point cps_smp_setup is called.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
The core power down state for cpuidle will require that the CPS SMP
implementation is in use. This patch provides a mips_cps_smp_in_use
function which determines whether or not the CPS SMP implementation is
currently in use.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
When hotplug and/or a powered down idle state are supported cases will
arise where a non-zero VPE must be brought online without VPE 0, and it
where multiple VPEs must be onlined simultaneously. This patch prepares
for that by:
- Splitting struct boot_config into core & VPE boot config structures,
allocated one per core or VPE respectively. This allows for multiple
VPEs to be onlined simultaneously without clobbering each others
configuration.
- Indicating which VPEs should be online within a core at any given
time using a bitmap. This allows multiple VPEs to be brought online
simultaneously and also indicates to VPE 0 whether it should halt
after starting any non-zero VPEs that should be online within the
core. For example if all VPEs within a core are offlined via hotplug
and the user onlines the second VPE within that core:
1) The core will be powered up.
2) VPE 0 will run from the BEV (ie. mips_cps_core_entry) to
initialise the core.
3) VPE 0 will start VPE 1 because its bit is set in the cores
bitmap.
4) VPE 0 will halt itself because its bit is clear in the cores
bitmap.
- Moving the core & VPE initialisation to assembly code which does not
make any use of the stack. This is because if a non-zero VPE is to
be brought online in a powered down core then when VPE 0 of that
core runs it may not have a valid stack, and even if it did then
it's messy to run through parts of generic kernel code on VPE 0
before starting the correct VPE.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
This patch allows use of the MT ASE yield instruction from uasm. It will
be used by a subsequent patch.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
This patch allows for use of the beq instruction with labels from uasm,
much as bne & others already do. It will be used by a subsequent patch.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
The opcode for the wait instruction within POOL32AXf was missing. This
patch adds it for use by a subsequent patch.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
The opcode for the sync instruction within POOL32AXf was missing. This
patch adds it for use by a subsequent patch.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
The opcode for the MT ASE yield instruction within the spec3 group was
missing. This patch adds it for use by a subsequent patch.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Define a macro to write to the current TCs TCHalt register. This will be
used by a subsequent patch.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
This is identical to kmap_coherent apart from the cache coherency
attribute used for the TLB entry, so kmap_coherent is abstracted to
kmap_prot which is then called for both kmap_coherent &
kmap_noncoherent. This will be used by a subsequent patch.
Suggested-by: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
ptrace_{get,set}_watch_regs access current_cpu_data to get the watch
register count/masks, which calls smp_processor_id(). However they are
run in preemptible context and therefore trigger warnings like so:
[ 6340.092000] BUG: using smp_processor_id() in preemptible [00000000] code: gdb/367
[ 6340.092000] caller is ptrace_get_watch_regs+0x44/0x220
Since the watch register count/masks should be the same across all
CPUs, use boot_cpu_data instead. Note that this may need to change in
future should a heterogenous system be supported where the count/masks
are not the same across all CPUs (the current code is also incorrect
for this scenario - current_cpu_data here would not necessarily be
correct for the CPU that the target task will execute on).
Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Reviewed-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/6879/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Allow 64-bit userspace programs to use ll64 types. The define name
comes from commit 2c9c6ce019 (powerpc:
Add __SANE_USERSPACE_TYPES__ to asm/types.h for LL64).
The patch allows to compile perf on MIPS64 and eliminates the following
warnings:
tests/attr.c:74:4: error: format '%llu' expects argument of type 'long
long unsigned int', but argument 6 has type '__u64' [-Werror=format=]
Signed-off-by: Aaro Koskinen <aaro.koskinen@nsn.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/6890/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>