This patch fixes the offset of the CPU boot address and no longer
sends the SMC_CMD_CPU1BOOT SMC call for secondary CPU boot, because
Exynos3250 removes WFE in secure mode.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
This patch adds Exynos3250's SoC ID. Exynos3250 is an SoC based on
a 32-bit RISC processor for smartphones. It uses dual Cortex-A7 cores
and has a target speed of 1.0GHz.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Exynos5800 is an octa-core SoC based on the Exynos5420 platform.
This patch adds basic support for it in mach-exynos.
Signed-off-by: Arun Kumar K <arun.kk@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
This patch adds basic arch-side support for the Exynos5260 SoC.
Note that this is required to enable the build of the clock driver.
Signed-off-by: Pankaj Dubey <pankaj.dubey@samsung.com>
Signed-off-by: Rahul Sharma <rahul.sharma@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
This patch adds a pmusysreg node for Exynos3250 to access the PMU
(Power Management Unit) registers in a centralized way using the
syscon driver.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
The vbus-supply property was wrongly updated in the usbdrd node
instead of the usbdrd_phy node. This patch fixes that.
Signed-off-by: Arun Kumar K <arun.kk@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
The vbus-supply property was wrongly updated in the usbdrd node
instead of the usbdrd_phy node. This patch fixes that.
Signed-off-by: Arun Kumar K <arun.kk@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
This patch adds a new exynos3250.dtsi to support the Exynos3250 SoC
based on dual Cortex-A7 cores, and includes the following DT nodes:
- GIC interrupt controller
- Pinctrl to control GPIOs
- Clock controller
- CPU information (Cortex-A7 dual core)
- UART to support serial port
- MCT (Multi Core Timer)
- ADC (Analog Digital Converter)
- I2C/SPI bus
- Power domain
- PMU (Performance Monitoring Unit)
- MSHC (Mobile Storage Host Controller)
- PWM (Pulse Width Modulation)
- AMBA bus
- sysram node for SYSRAM memory mapping
Signed-off-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Hyunhee Kim <hyunhee.kim@samsung.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Cc: Ben Dooks <ben-linux@fluff.org>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ian Campbell <ijc+devicetree@hellion.org.uk>
Cc: Kumar Gala <galak@codeaurora.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: devicetree@vger.kernel.org
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Add the required fixed regulator for the VBUS supply of the USB 3.0
controller PHY.
Signed-off-by: Vivek Gautam <gautam.vivek@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Enable the display controller with timing information for the 1080p
panel on the Exynos5800 Peach Pi board.
Signed-off-by: Rahul Sharma <rahul.sharma@samsung.com>
Reviewed-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Adds support for the Google Peach Pi board, which uses the
Exynos5800 SoC.
Signed-off-by: Arun Kumar K <arun.kk@samsung.com>
Signed-off-by: Doug Anderson <dianders@chromium.org>
Reviewed-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Most of the exynos5420 nodes remain the same for exynos5800, so
exynos5420.dtsi is included by exynos5800 and the changed node
properties are overridden.
Signed-off-by: Arun Kumar K <arun.kk@samsung.com>
Reviewed-by: Tomasz Figa <t.figa@samsung.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
The patch adds the dts file for the xyref5260 board, which is based
on the Exynos5260 SoC.
Signed-off-by: Rahul Sharma <rahul.sharma@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Rename nodes as per the DT node naming convention <name@reg_addr>.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Reviewed-by: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Use key code macros to improve readability on the exynos4210-origen,
exynos4412-origen and exynos5250-arndale boards.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
[kgene.kim@samsung.com: squashed similar two patches]
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Enable the RTC and WDT nodes on the exynos4210-origen and
exynos4412-origen boards.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
[kgene.kim@samsung.com: squashed similar two patches]
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Call pcie_bus_configure_settings() on ARM, like on other platforms.
pcie_bus_configure_settings() makes sure the MPS across the bus is uniform
and provides the ability to tune the MRRS and MPS to higher-performance
values. This is particularly important for embedded systems, where there is
no firmware to program these PCIe settings for the OS.
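As a rough sketch of what this amounts to on the ARM bring-up path (the
wrapper function is illustrative, not the actual hook name):

    #include <linux/pci.h>

    /* After the root bus has been scanned, walk its child buses and let
     * the PCI core pick a uniform MPS and tuned MRRS values. */
    static void pcie_configure_child_buses(struct pci_bus *root)
    {
            struct pci_bus *child;

            list_for_each_entry(child, &root->children, node)
                    pcie_bus_configure_settings(child);
    }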
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: Arnd Bergmann <arnd@arndb.de>
CC: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
CC: Santosh Shilimkar <santosh.shilimkar@ti.com>
On platforms implementing CPU power management, the CPUidle subsystem
can allow CPUs to enter idle states where the local timer logic is lost on
power down. To keep the software timers functional, the kernel relies on an
always-on broadcast timer being present in the platform to relay the
interrupt signalling the timer expiries.
For platforms implementing CPU core gating that do not implement an always-on
HW timer or implement it in a broken way, this patch adds code to initialize
the kernel hrtimer based clock event device upon boot (which can be chosen as
tick broadcast device by the kernel).
It relies on a dynamically chosen CPU to be always powered-up. This CPU then
relays the timer interrupt to CPUs in deep-idle states through its HW local
timer device.
Having a CPU always on has implications for the platform's power management
capabilities and makes CPUidle suboptimal, since at least one CPU is always
kept in a shallow idle state by the kernel to relay timer interrupts, but it
at least leaves the kernel with a functional system and some working power
management capabilities.
The hrtimer based clock event device is unconditionally registered, but
has the lowest possible rating such that any broadcast-capable HW clock
event device present will be chosen in preference as the tick broadcast
device.
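A minimal sketch of the idea (simplified relative to the actual
implementation; the callback names are placeholders):

    #include <linux/clockchips.h>
    #include <linux/cpumask.h>
    #include <linux/hrtimer.h>

    static struct hrtimer bc_hrtimer;

    /* Program the next broadcast expiry by arming an hrtimer on the CPU
     * that is kept powered up. */
    static int bc_set_next(ktime_t expires, struct clock_event_device *bc)
    {
            hrtimer_start(&bc_hrtimer, expires, HRTIMER_MODE_ABS_PINNED);
            return 0;
    }

    static void bc_set_mode(enum clock_event_mode mode,
                            struct clock_event_device *bc)
    {
            if (mode == CLOCK_EVT_MODE_SHUTDOWN)
                    hrtimer_cancel(&bc_hrtimer);
    }

    /* Registered unconditionally, but with the lowest possible rating so
     * any real broadcast-capable clock event device is preferred. */
    static struct clock_event_device bc_hrtimer_dev = {
            .name           = "hrtimer-broadcast",
            .features       = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_KTIME,
            .rating         = 0,
            .set_next_ktime = bc_set_next,
            .set_mode       = bc_set_mode,
            .cpumask        = cpu_all_mask,
    };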
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This corrects a bug introduced in v3.15.
The bug causes audio playback to fail on the Armadillo800 EVA board.
Merge tag 'renesas-fixes-for-v3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into next/fixes-non-critical
Merge "Renesas ARM Based SoC Fixes for v3.16" from Simon Horman:
This corrects a bug introduced in v3.15.
The bug causes audio playback to fail on the Armadillo800 EVA board.
* tag 'renesas-fixes-for-v3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: armadillo800eva: fixup HDMI sound flags setting
Signed-off-by: Olof Johansson <olof@lixom.net>
In this round we have a few nice gems. PR KVM gains initial POWER8 support
as well as LE host awareness, the e500 targets can now properly run u-boot,
LE guests now work with PR KVM including KVM hypercalls, and HV KVM guests
can now use huge pages.
On top of this there are some bug fixes.
Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into kvm-next
Patch queue for ppc - 2014-05-30
In this round we have a few nice gems. PR KVM gains initial POWER8 support
as well as LE host awareness, the e500 targets can now properly run u-boot,
LE guests now work with PR KVM including KVM hypercalls, and HV KVM guests
can now use huge pages.
On top of this there are some bug fixes.
Conflicts:
include/uapi/linux/kvm.h
On LPAR guest systems Linux enables the shadow SLB to indicate to the
hypervisor a number of SLB entries that always have to be available.
Today we go through this shadow SLB and disable all ESID's valid bits.
However, pHyp doesn't like this approach very much and honors us with
fancy machine checks.
Fortunately the shadow SLB descriptor also has an entry that indicates
the number of valid entries following. During the lifetime of a guest
we can just swap that value to 0 and don't have to worry about the
SLB restoration magic.
While we're touching the code, let's also make it more readable (get
rid of rldicl), allow it to deal with a dynamic number of bolted
SLB entries and only do shadow SLB swizzling on LPAR systems.
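In rough C terms the trick is just this (a sketch; the real change lives in
the assembly world-switch code, and get_slb_shadow() and the persistent
field come from the existing lppaca definitions):

    #include <asm/lppaca.h>

    /* Tell the hypervisor that no shadow SLB entries need restoring,
     * instead of clearing each ESID valid bit; the saved count is put
     * back on guest exit. */
    static inline u32 shadow_slb_disable(void)
    {
            struct slb_shadow *shadow = get_slb_shadow();
            u32 saved = be32_to_cpu(shadow->persistent);

            shadow->persistent = cpu_to_be32(0);
            return saved;
    }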
Signed-off-by: Alexander Graf <agraf@suse.de>
We didn't make use of SLB entry 0 because ... of no good reason. SLB entry 0
will always be used by the Linux linear SLB entry, so the fact that slbia
does not invalidate it doesn't matter as we overwrite SLB 0 on exit anyway.
Just enable use of SLB entry 0 for our shadow SLB code.
Signed-off-by: Alexander Graf <agraf@suse.de>
The code that delivered a machine check to the guest after handling
it in real mode failed to load up r11 before calling kvmppc_msr_interrupt,
which needs the old MSR value in r11 so it can see the transactional
state there. This adds the missing load.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
This adds workarounds for two hardware bugs in the POWER8 performance
monitor unit (PMU), both related to interrupt generation. The effect
of these bugs is that PMU interrupts can get lost, leading to tools
such as perf reporting fewer counts and samples than they should.
The first bug relates to the PMAO (perf. mon. alert occurred) bit in
MMCR0; setting it should cause an interrupt, but doesn't. The other
bug relates to the PMAE (perf. mon. alert enable) bit in MMCR0.
Setting PMAE when a counter is negative and counter negative
conditions are enabled to cause alerts should cause an alert, but
doesn't.
The workaround for the first bug is to create conditions where a
counter will overflow, whenever we are about to restore a MMCR0
value that has PMAO set (and PMAO_SYNC clear). The workaround for
the second bug is to freeze all counters using MMCR2 before reading
MMCR0.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently, when testing whether a page is dirty (when constructing the
bitmap for the KVM_GET_DIRTY_LOG ioctl), we test the C (changed) bit
in the HPT entries mapping the page, and if it is 0, we consider the
page to be clean. However, the Power ISA doesn't require processors
to set the C bit to 1 immediately when writing to a page, and in fact
allows them to delay the writeback of the C bit until they receive a
TLB invalidation for the page. Thus it is possible that the page
could be dirty and we miss it.
Now, if there are vcpus running, this is not serious since the
collection of the dirty log is racy already - some vcpu could dirty
the page just after we check it. But if there are no vcpus running we
should return definitive results, in case we are in the final phase of
migrating the guest.
Also, if the permission bits in the HPTE don't allow writing, then we
know that no CPU can set C. If the HPTE was previously writable and
the page was modified, any C bit writeback would have been flushed out
by the tlbie that we did when changing the HPTE to read-only.
Otherwise we need to do a TLB invalidation even if the C bit is 0, and
then check the C bit.
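A sketch of the resulting decision logic as it would sit in the book3s HV
MMU code (hpte_is_writable() and do_tlbie() stand in for the real helpers;
locking and error handling are omitted):

    /* Decide whether the page mapped by this HPTE must be reported dirty. */
    static bool hpte_page_dirty(__be64 *hptep)
    {
            unsigned long r = be64_to_cpu(hptep[1]);

            /* Read-only HPTE: no CPU can set C, and any earlier C-bit
             * writeback was flushed by the tlbie done when making it RO. */
            if (!hpte_is_writable(r))
                    return r & HPTE_R_C;

            /* Writable: force any delayed C-bit writeback, then re-check. */
            do_tlbie(hptep);
            return be64_to_cpu(hptep[1]) & HPTE_R_C;
    }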
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
The dirty map that we construct for the KVM_GET_DIRTY_LOG ioctl has
one bit per system page (4K/64K). Currently, we only set one bit in
the map for each HPT entry with the Change bit set, even if the HPT is
for a large page (e.g., 16MB). Userspace then considers only the
first system page dirty, though in fact the guest may have modified
anywhere in the large page.
To fix this, we make kvm_test_clear_dirty() return the actual number
of pages that are dirty (and rename it to kvm_test_clear_dirty_npages()
to emphasize that that's what it returns). In kvmppc_hv_get_dirty_log()
we then set that many bits in the dirty map.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently, when a huge page is faulted in for a guest, we select the
rmap chain to insert the HPTE into based on the guest physical address
that the guest tried to access. Since there is an rmap chain for each
system page, there are many rmap chains for the area covered by a huge
page (e.g. 256 for 16MB pages when PAGE_SIZE = 64kB), and the huge-page
HPTE could end up in any one of them.
For consistency, and to make the huge-page HPTEs easier to find, we now
put huge-page HPTEs in the rmap chain corresponding to the base address
of the huge page.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
The global_invalidates() function contains a check that is intended
to tell whether we are currently executing in the context of a hypercall
issued by the guest. The reason is that the optimization of using a
local TLB invalidate instruction is only valid in that context. The
check was testing local_paca->kvm_hstate.kvm_vcore, which gets set
when entering the guest but no longer gets cleared when exiting the
guest. To fix this, we use the kvm_vcpu field instead, which does
get cleared when exiting the guest, by the kvmppc_release_hwthread()
calls inside kvmppc_run_core().
The effect of having the check wrong was that when kvmppc_do_h_remove()
got called from htab_write() on the destination machine during a
migration, it cleared the current cpu's bit in kvm->arch.need_tlb_flush.
This meant that when the guest started running in the destination VM,
it may miss out on doing a complete TLB flush, and therefore may end
up using stale TLB entries from a previous guest that used the same
LPID value.
This should make migration more reliable.
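For reference, the corrected context test boils down to something like this
(a sketch, not the exact code):

    #include <asm/paca.h>

    /* A vcpu pointer in the hypervisor state means we are handling a guest
     * hypercall; kvm_vcpu is cleared by kvmppc_release_hwthread() on guest
     * exit, unlike kvm_vcore, so it cannot leak across exits. */
    static bool in_guest_hcall_context(void)
    {
            return local_paca->kvm_hstate.kvm_vcpu != NULL;
    }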
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Commit b005255e12 ("KVM: PPC: Book3S HV: Context-switch new POWER8
SPRs") added a definition of KVM_REG_PPC_WORT with the same register
number as the existing KVM_REG_PPC_VRSAVE (though in fact the
definitions are not identical because of the different register sizes.)
For clarity, this moves KVM_REG_PPC_WORT to the next unused number,
and also adds it to api.txt.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
We worked around some nasty KVM magic page hcall breakages:
1) NX bit not honored, so ignore NX when we detect it
2) LE guests swizzle hypercall instruction
Without these fixes in place, there's no way it would make sense to expose kvm
hypercalls to a guest. Chances are immensely high it would trip over and break.
So add a new CAP that gives user space a hint that we have workarounds for
the bugs above in place. It can use that as a hint to disable PV hypercalls
when the guest CPU is anything POWER7 or higher and the host does not have
the fixes in place.
Signed-off-by: Alexander Graf <agraf@suse.de>
When we reset the in-kernel MPIC controller, we forget to reset some hidden
state such as destmask and output. This state is usually set when the guest
writes to the IDR register for a specific IRQ line.
To make sure we stay in sync and don't forget hidden state, treat reset of
the IDR register as a simple write of the IDR register. That automatically
updates all the hidden state as well.
Reported-by: Paul Janzen <pcj@pauljanzen.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
There are LE Linux guests out there that don't handle hypercalls correctly.
Instead of interpreting the instruction stream from device tree as big endian
they assume it's a little endian instruction stream and fail.
When we see an illegal instruction from such a byte-reversed instruction
stream, bail out gracefully and just declare every hcall an error.
Signed-off-by: Alexander Graf <agraf@suse.de>
We get an array of instructions from the hypervisor via device tree that
we write into a buffer that gets executed whenever we want to make an
ePAPR compliant hypercall.
However, the hypervisor passes us these instructions in BE order which
we have to manually convert to LE when we want to run them in LE mode.
With this fixup in place, I can successfully run LE kernels with KVM
PV enabled on PR KVM.
Signed-off-by: Alexander Graf <agraf@suse.de>
Use make_dsisr instead of open coding it. This also has the added
benefit of handling alignment interrupts on additional instructions.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Although it's optional, IBM POWER CPUs have always had the DAR value set
on alignment interrupts. So don't try to compute these values.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Because old kernels enable the magic page and then choke on NXed trampoline
code we have to disable NX by default in KVM when we use the magic page.
However, since commit b18db0b8 we have successfully fixed that and can now
leave NX enabled, so tell the hypervisor about this.
Signed-off-by: Alexander Graf <agraf@suse.de>
Old guests try to use the magic page, but map their trampoline code inside
of an NX region.
Since we can't fix those old kernels, try to detect whether the guest is sane
or not. If not, just disable NX functionality in KVM so that old guests at
least work at all. For newer guests, add a bit that we can set to keep NX
functionality available.
Signed-off-by: Alexander Graf <agraf@suse.de>
On recent IBM Power CPUs, while the hashed page table is looked up using
the page size from the segmentation hardware (i.e. the SLB), it is
possible to have the HPT entry indicate a larger page size. Thus for
example it is possible to put a 16MB page in a 64kB segment, but since
the hash lookup is done using a 64kB page size, it may be necessary to
put multiple entries in the HPT for a single 16MB page. This
capability is called mixed page-size segment (MPSS). With MPSS,
there are two relevant page sizes: the base page size, which is the
size used in searching the HPT, and the actual page size, which is the
size indicated in the HPT entry. [ Note that the actual page size is
always >= base page size ].
We use "ibm,segment-page-sizes" device tree node to advertise
the MPSS support to PAPR guest. The penc encoding indicates whether
we support a specific combination of base page size and actual
page size in the same segment. We also use the penc value in the
LP encoding of HPTE entry.
This patch exposes MPSS support to KVM guest by advertising the
feature via "ibm,segment-page-sizes". It also adds the necessary changes
to decode the base page size and the actual page size correctly from the
HPTE entry.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Today when KVM tries to reserve memory for the hash page table it
allocates from the normal page allocator first. If that fails it
falls back to CMA's reserved region. One of the side effects of
this is that we could end up exhausting the page allocator and
get linux into OOM conditions while we still have plenty of space
available in CMA.
This patch addresses this issue by first trying hash page table
allocation from CMA's reserved region before falling back to the normal
page allocator. So if we run out of memory, we really are out of memory.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
POWER8 introduces transactional memory which brings along a number of new
registers and MSR bits.
Implementing all of those is a pretty big headache, so for now let's at least
emulate enough to make Linux's context switching code happy.
Signed-off-by: Alexander Graf <agraf@suse.de>
POWER8 introduces a new facility called the "Event Based Branch" facility.
It consists of a few registers that indicate where a guest should branch to
when a defined event occurs while it is in PR (problem state) mode.
We don't want to really enable EBB as it will create a big mess with !PR guest
mode while hardware is in PR and we don't really emulate the PMU anyway.
So instead, let's just leave it at emulation of all its registers.
Signed-off-by: Alexander Graf <agraf@suse.de>
POWER8 implements a new register called TAR. This register has to be
enabled in FSCR and then from KVM's point of view is mere storage.
This patch enables the guest to use TAR.
Signed-off-by: Alexander Graf <agraf@suse.de>
POWER8 introduced a new interrupt type called "Facility unavailable interrupt"
which contains its status message in a new register called FSCR.
Handle these exits and try to emulate instructions for unhandled facilities.
Follow-on patches enable KVM to expose specific facilities into the guest.
Signed-off-by: Alexander Graf <agraf@suse.de>
In parallel to the Processor ID Register (PIR), threaded POWER8 also adds a
Thread ID Register (TIR). Since PR KVM doesn't emulate more than one thread
per core, we can just always expose 0 here.
Signed-off-by: Alexander Graf <agraf@suse.de>
When we expose a POWER8 CPU into the guest, it will start accessing PMU SPRs
that we don't emulate. Just ignore accesses to them.
Signed-off-by: Alexander Graf <agraf@suse.de>
With the previous patches applied, we can now successfully use PR KVM on
little endian hosts which means we can now allow users to select it.
However, HV KVM still needs some work, so let's keep the kconfig conflict
on that one.
Signed-off-by: Alexander Graf <agraf@suse.de>
When the host CPU we're running on doesn't support dcbz32 itself, but the
guest wants to have dcbz only clear 32 bytes of data, we loop through every
executable mapped page to search for dcbz instructions and patch them with
a special privileged instruction that we emulate as dcbz32.
The only guests that want to see dcbz act as 32byte are book3s_32 guests, so
we don't have to worry about little endian instruction ordering. So let's
just always search for big endian dcbz instructions, also when we're on a
little endian host.
Signed-off-by: Alexander Graf <agraf@suse.de>
The shared (magic) page is a data structure that contains often used
supervisor privileged SPRs accessible via memory to the user to reduce
the number of exits we have to take to read/write them.
When we actually share this structure with the guest we have to maintain
it in guest endianness, because some of the patch tricks only work with
native endian load/store operations.
Since we only share the structure with either host or guest in little
endian on book3s_64 pr mode, we don't have to worry about booke or book3s hv.
For booke, the shared struct stays big endian. For book3s_64 hv we maintain
the struct in host native endian, since it never gets shared with the guest.
For book3s_64 pr we introduce a variable that tells us which endianness the
shared struct is in and route every access to it through helper inline
functions that evaluate this variable.
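The accessor pattern amounts to the following sketch (the real helpers are
generated per field; the shared_big_endian flag follows the description
above):

    /* Route every access to the shared (magic) page through a check of
     * the endianness the guest expects the structure to be in. */
    static inline u64 shared_read_u64(struct kvm_vcpu *vcpu, u64 *field)
    {
            if (vcpu->arch.shared_big_endian)
                    return be64_to_cpu(*(__be64 *)field);
            else
                    return le64_to_cpu(*(__le64 *)field);
    }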
Signed-off-by: Alexander Graf <agraf@suse.de>
We expose a blob of hypercall instructions to user space that it gives to
the guest via device tree again. That blob should contain a stream of
instructions necessary to do a hypercall in big endian, as it just gets
passed into the guest and old guests use them straight away.
Signed-off-by: Alexander Graf <agraf@suse.de>
When the guest does an RTAS hypercall it keeps all RTAS variables inside a
big endian data structure.
To make sure we don't have to bother about endianness inside the actual RTAS
handlers, let's just convert the whole structure to host endian before we
call our RTAS handlers and back to big endian when we return to the guest.
Signed-off-by: Alexander Graf <agraf@suse.de>
The HTAB on PPC is always in big endian. When we access it via hypercalls
on behalf of the guest and we're running on a little endian host, we need
to make sure we swap the bits accordingly.
Signed-off-by: Alexander Graf <agraf@suse.de>
The default MSR when user space does not define anything should be identical
on little and big endian hosts, so remove MSR_LE from it.
Signed-off-by: Alexander Graf <agraf@suse.de>
The "shadow SLB" in the PACA is shared with the hypervisor, so it has to
be big endian. We access the shadow SLB during world switch, so let's make
sure we access it in big endian even when we're on a little endian host.
Signed-off-by: Alexander Graf <agraf@suse.de>
The HTAB is always big endian. We access the guest's HTAB using
copy_from/to_user, but don't yet take care of the fact that we might
be running on an LE host.
Wrap all accesses to the guest HTAB with big endian accessors.
Signed-off-by: Alexander Graf <agraf@suse.de>
The HTAB is always big endian. We access the guest's HTAB using
copy_from/to_user, but don't yet take care of the fact that we might
be running on an LE host.
Wrap all accesses to the guest HTAB with big endian accessors.
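As a rough illustration of the accessor pattern these two patches apply
(helper name and error handling are illustrative):

    #include <linux/uaccess.h>
    #include <asm/byteorder.h>

    /* Read one guest HPTE from user memory and convert from the
     * architected big-endian layout to host byte order. */
    static int read_guest_hpte(void __user *uptr, unsigned long *v,
                               unsigned long *r)
    {
            __be64 hpte[2];

            if (copy_from_user(hpte, uptr, sizeof(hpte)))
                    return -EFAULT;

            *v = be64_to_cpu(hpte[0]);
            *r = be64_to_cpu(hpte[1]);
            return 0;
    }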
Signed-off-by: Alexander Graf <agraf@suse.de>
Commit 9308ab8e2d made C/R HTAB updates go byte-wise into the target HTAB.
However, it didn't update the guest's copy of the HTAB, but instead the
host local copy of it.
Write to the guest's HTAB instead.
Signed-off-by: Alexander Graf <agraf@suse.de>
CC: Paul Mackerras <paulus@samba.org>
Acked-by: Paul Mackerras <paulus@samba.org>
This patch makes sure we inherit the LE bit correctly in the different
cases so that we can run a little endian distro in PR mode.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The dcbtls instruction is able to lock data inside the L1 cache.
We don't want to give the guest actual access to hardware cache locks,
as that could influence other VMs on the same system. But we can tell
the guest that its locking attempt failed.
By implementing the instruction we at least don't give the guest a
program exception which it definitely does not expect.
Signed-off-by: Alexander Graf <agraf@suse.de>
The L1 instruction cache control register contains bits that indicate
that we're still handling a request. Mask those out when we set the SPR
so that a read doesn't assume we're still doing something.
Signed-off-by: Alexander Graf <agraf@suse.de>
The kfree() function already NULL checks the parameter so remove the
redundant NULL checks before kfree() calls in arch/mips/kvm/.
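The change is of this simple form:

    /* before: redundant guard */
    if (ptr)
            kfree(ptr);

    /* after: kfree() already tolerates a NULL argument */
    kfree(ptr);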
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The logging from MIPS KVM is fairly noisy with kvm_info() in places
where it shouldn't be, such as on VM creation and migration to a
different CPU. Replace these kvm_info() calls with kvm_debug().
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_debug() uses pr_debug() which is already compiled out in the absence
of a DEBUG define, so remove the unnecessary ifdef DEBUG lines around
kvm_debug() calls which are littered around arch/mips/kvm/.
As well as generally cleaning up, this prevents future bit-rot due to
DEBUG not being commonly used.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix build errors when DEBUG is defined in arch/mips/kvm/.
- The DEBUG code in kvm_mips_handle_tlbmod() was missing some variables.
- The DEBUG code in kvm_mips_host_tlb_write() was conditional on an
undefined "debug" variable.
- The DEBUG code in kvm_mips_host_tlb_inv() accessed asid_map directly
rather than using kvm_mips_get_user_asid(). Also fixed brace
placement.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The kvm_mips_comparecount_func() and kvm_mips_comparecount_wakeup()
functions are only used within arch/mips/kvm/kvm_mips.c, so make them
static.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expose the KVM guest CP0_Count frequency to userland via a new
KVM_REG_MIPS_COUNT_HZ register accessible with the KVM_{GET,SET}_ONE_REG
ioctls.
When the frequency is altered the bias is adjusted such that the guest
CP0_Count doesn't jump discontinuously or lose any timer interrupts.
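A userland sketch of setting the frequency through the new register,
assuming the KVM_REG_MIPS_COUNT_HZ id is visible to the application (it may
need to be defined locally if the exported headers do not carry it):

    #include <linux/kvm.h>
    #include <sys/ioctl.h>
    #include <stdint.h>

    /* Set the guest CP0_Count frequency to 100 MHz on an open vCPU fd. */
    static int set_guest_count_hz(int vcpu_fd)
    {
            uint64_t hz = 100000000;
            struct kvm_one_reg reg = {
                    .id   = KVM_REG_MIPS_COUNT_HZ,
                    .addr = (uintptr_t)&hz,
            };

            return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
    }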
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Expose two new virtual registers to userland via the
KVM_{GET,SET}_ONE_REG ioctls.
KVM_REG_MIPS_COUNT_CTL is for timer configuration fields and just
contains a master disable count bit. This can be used by userland to
freeze the timer in order to read a consistent state from the timer
count value and timer interrupt pending bit. This cannot be done with
the CP0_Cause.DC bit because the timer interrupt pending bit (TI) is
also in CP0_Cause so it would be impossible to stop the timer without
also risking a race with an hrtimer interrupt and having to explicitly
check whether an interrupt should have occurred.
When the timer is re-enabled it resumes without losing time, i.e. the
CP0_Count value jumps to what it would have been had the timer not been
disabled, which would also be impossible to do from userland with
CP0_Cause.DC. The timer interrupt also cannot be lost, i.e. if a timer
interrupt would have occurred had the timer not been disabled it is
queued when the timer is re-enabled.
This works by storing the nanosecond monotonic time when the master
disable is set, and using it for various operations instead of the
current monotonic time (e.g. when recalculating the bias when the
CP0_Count is set), until the master disable is cleared again, i.e. the
timer state is read/written as it would have been at that time. This
state is exposed to userland via the read-only KVM_REG_MIPS_COUNT_RESUME
virtual register so that userland can determine the exact time the
master disable took effect.
This should allow userland to atomically save the state of the timer,
and later restore it.
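A userland sketch of the save sequence this enables (get_reg()/set_reg()
are local helpers, and the KVM_REG_MIPS_COUNT_* id macros are assumed to be
visible to the application):

    #include <linux/kvm.h>
    #include <sys/ioctl.h>
    #include <stdint.h>

    static int set_reg(int vcpu_fd, uint64_t id, uint64_t val)
    {
            struct kvm_one_reg reg = { .id = id, .addr = (uintptr_t)&val };
            return ioctl(vcpu_fd, KVM_SET_ONE_REG, &reg);
    }

    static int get_reg(int vcpu_fd, uint64_t id, uint64_t *val)
    {
            struct kvm_one_reg reg = { .id = id, .addr = (uintptr_t)val };
            return ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
    }

    /* Freeze the timer, snapshot its state, then resume it
     * (error handling omitted for brevity). */
    static void save_timer_state(int vcpu_fd, uint64_t *count, uint64_t *resume)
    {
            set_reg(vcpu_fd, KVM_REG_MIPS_COUNT_CTL, KVM_REG_MIPS_COUNT_CTL_DC);
            get_reg(vcpu_fd, KVM_REG_MIPS_CP0_COUNT, count);
            get_reg(vcpu_fd, KVM_REG_MIPS_COUNT_RESUME, resume);
            set_reg(vcpu_fd, KVM_REG_MIPS_COUNT_CTL, 0);
    }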
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The KVM_HOST_FREQ Kconfig symbol was used by KVM guest kernels to
override the timer frequency calculation to a value based on the host
frequency. Now that the KVM timer emulation is implemented independent
of the host timer frequency and defaults to 100MHz, adjust the working
of CONFIG_KVM_HOST_FREQ to match.
The Kconfig symbol now specifies the guest timer frequency directly, and
has been renamed accordingly to KVM_GUEST_TIMER_FREQ. It now defaults to
100MHz too and the help text is updated to make it clear that a zero
value will allow the normal timer frequency calculation to take place
(based on the emulated RTC).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Previously the emulation of the CPU timer was just enough to get a Linux
guest running but some shortcuts were taken:
- The guest timer interrupt was hard coded to always happen every 10 ms
rather than being timed to when CP0_Count would match CP0_Compare.
- The guest's CP0_Count register was based on the host's CP0_Count
register. This isn't very portable and fails on cores without a
CP0_Count register implemented, such as Ingenic XBurst. It also meant
that the guest's CP0_Cause.DC bit to disable the CP0_Count register
took no effect.
- The guest's CP0_Count register was emulated by just dividing the
host's CP0_Count register by 4. This resulted in continuity problems
when used as a clock source, since when the host CP0_Count overflows
from 0x7fffffff to 0x80000000, the guest CP0_Count transitions
discontinuously from 0x1fffffff to 0xe0000000.
Therefore rewrite & fix emulation of the guest timer based on the
monotonic kernel time (i.e. ktime_get()). Internally a 32-bit count_bias
value is added to the frequency scaled nanosecond monotonic time to get
the guest's CP0_Count. The frequency of the timer is initialised to
100MHz and cannot yet be changed, but a later patch will allow the
frequency to be configured via the KVM_{GET,SET}_ONE_REG ioctl
interface.
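A rough sketch of the Count derivation described above (field and helper
names follow the commit text and are not necessarily the final ones):

    #include <linux/kvm_host.h>
    #include <linux/ktime.h>
    #include <linux/math64.h>

    /* Guest CP0_Count = bias + monotonic time scaled to the timer rate. */
    static u32 read_guest_count(struct kvm_vcpu *vcpu)
    {
            u64 now = ktime_to_ns(ktime_get());
            u32 hz  = vcpu->arch.count_hz;
            u64 ticks;

            /* Divide first so the multiplication cannot overflow 64 bits. */
            ticks  = (now / NSEC_PER_SEC) * hz;
            ticks += div_u64((now % NSEC_PER_SEC) * hz, NSEC_PER_SEC);

            return vcpu->arch.count_bias + (u32)ticks;
    }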
The timer can now be stopped via the CP0_Cause.DC bit (by the guest or
via the KVM_SET_ONE_REG ioctl interface), at which point the current
CP0_Count is stored and can be read directly. When it is restarted the
bias is recalculated such that the CP0_Count value is continuous.
Due to the nature of hrtimer interrupts any read of the guest's
CP0_Count register while it is running triggers a check for whether the
hrtimer has expired, so that the guest/userland cannot observe the
CP0_Count passing CP0_Compare without queuing a timer interrupt. This is
also taken advantage of when stopping the timer to ensure that a pending
timer interrupt is queued.
This replaces the implementation of:
- Guest read of CP0_Count
- Guest write of CP0_Count
- Guest write of CP0_Compare
- Guest write of CP0_Cause
- Guest read of HWR 2 (CC) with RDHWR
- Host read of CP0_Count via KVM_GET_ONE_REG ioctl interface
- Host write of CP0_Count via KVM_SET_ONE_REG ioctl interface
- Host write of CP0_Compare via KVM_SET_ONE_REG ioctl interface
- Host write of CP0_Cause via KVM_SET_ONE_REG ioctl interface
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When a VCPU is scheduled in on a different CPU, refresh the hrtimer used
for emulating count/compare so that it gets migrated to the same CPU.
This should prevent a timer interrupt occurring on a different CPU to
where the guest it relates to is running, which would cause the guest
timer interrupt not to be delivered until after the next guest exit.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The hrtimer callback for guest timer timeouts sets the guest's
CP0_Cause.TI bit to indicate to the guest that a timer interrupt is
pending, however there is no mutual exclusion implemented to prevent
this occurring while the guest's CP0_Cause register is being
read-modify-written elsewhere.
When this occurs the setting of the CP0_Cause.TI bit is undone and the
guest misses the timer interrupt and doesn't reprogram the CP0_Compare
register for the next timeout. Currently another timer interrupt will be
triggered again in another 10ms anyway due to the way timers are
emulated, but after the MIPS timer emulation is fixed this would result
in Linux guest time standing still and the guest scheduler not being
invoked until the guest CP0_Count has looped around again, which at
100MHz takes just under 43 seconds.
Currently this is the only asynchronous modification of guest registers,
therefore it is fixed by adjusting the implementations of the
kvm_set_c0_guest_cause(), kvm_clear_c0_guest_cause(), and
kvm_change_c0_guest_cause() macros which are used for modifying the
guest CP0_Cause register to use ll/sc to ensure atomic modification.
This should work in both UP and SMP cases without requiring interrupts
to be disabled.
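Conceptually the change is the following (illustration only: the patch
itself open-codes ll/sc in the kvm_*_c0_guest_cause() macros; cmpxchg(),
which compiles down to ll/sc on MIPS, gives the same effect):

    #include <linux/atomic.h>

    /* Atomically OR bits into the saved guest CP0_Cause so a concurrent
     * hrtimer callback setting Cause.TI cannot be lost to a plain
     * read-modify-write elsewhere. */
    static inline void set_guest_cause_bits(unsigned long *cause,
                                            unsigned long bits)
    {
            unsigned long old, new;

            do {
                    old = READ_ONCE(*cause);
                    new = old | bits;
            } while (cmpxchg(cause, old, new) != old);
    }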
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When about to run the guest, deliver guest interrupts after disabling
host interrupts. This should prevent an hrtimer interrupt from being
handled after delivering guest interrupts, and therefore not delivering
the guest timer interrupt until after the next guest exit.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Implement KVM_{GET,SET}_ONE_REG ioctl based access to the guest CP0
HWREna register. This is so that userland can save and restore its
value so that RDHWR instructions don't have to be emulated by the guest.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Implement KVM_{GET,SET}_ONE_REG ioctl based access to the guest CP0
UserLocal register. This is so that userland can save and restore its
value.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Implement KVM_{GET,SET}_ONE_REG ioctl based access to the guest CP0
Count and Compare registers. These registers are special in that writing
to them has side effects (adjusting the time until the next timer
interrupt) and reading of Count depends on the time. Therefore add a
couple of callbacks so that different implementations (trap & emulate or
VZ) can implement them differently depending on what the hardware
provides.
The trap & emulate versions mostly duplicate what happens when a T&E
guest reads or writes these registers, so it inherits the same
limitations which can be fixed in later patches.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the KVM_{GET,SET}_ONE_REG MIPS register id definitions out of
kvm_mips.c to kvm_host.h so that they can be shared between multiple
source files. This allows register access to be indirected depending on
the underlying implementation (trap & emulate or VZ).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Contrary to the comment, the guest CP0_EPC register cannot be set via
kvm_regs, since it is distinct from the guest PC. Add the EPC register
to the KVM_{GET,SET}_ONE_REG ioctl interface.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: David Daney <david.daney@cavium.com>
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When MIPS KVM needs to write a TLB entry for the guest it reads the
CP0_Random register, uses it to generate the CP0_Index, and writes the
TLB entry using the TLBWI instruction (tlb_write_indexed()).
However there's an instruction for that, TLBWR (tlb_write_random()) so
use that instead.
This happens to also fix an issue with Ingenic XBurst cores where the
same TLB entry is replaced each time preventing forward progress on
stores due to alternating between TLB load misses for the instruction
fetch and TLB store misses.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MIPS KVM uses mips32_SyncICache to synchronise the icache with the
dcache after dynamically modifying guest instructions or writing guest
exception vector. However this uses rdhwr to get the SYNCI step, which
causes a reserved instruction exception on Ingenic XBurst cores.
It would seem to make more sense to use local_flush_icache_range()
instead which does the same thing but is more portable.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Export the local_flush_icache_range function pointer for GPL modules so
that it can be used by KVM for syncing the icache after binary
translation of trapping instructions.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Each MIPS KVM guest has its own copy of the KVM exception vector. This
contains the TLB refill exception handler at offset 0x000, the general
exception handler at offset 0x180, and interrupt exception handlers at
offset 0x200 in case Cause_IV=1. A common handler is copied to offset
0x2000 and offset 0x3000 is used for temporarily storing k1 during entry
from guest.
However the amount of memory allocated for this purpose is calculated as
0x200 rounded up to the next page boundary, which is insufficient if 4KB
pages are in use. This can lead to the common handler at offset 0x2000
being overwritten and infinitely recursive exceptions on the next exit
from the guest.
Increase the minimum size from 0x200 to 0x4000 to cover the full use of
the page.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Merge tag 'kvm-s390-20140530' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next
1. Several minor fixes and cleanups for KVM:
2. Fix flag check for gdb support
3. Remove unnecessary vcpu start
4. Remove code duplication for sigp interrupts
5. Better DAT handling for the TPROT instruction
6. Correct addressing exception for standby memory
Based on original patch from Jeng-fang (Nick) Wang
When standby memory is specified for a guest Linux, but no virtual memory has
been allocated on the Qemu host backing that guest, the guest memory detection
process encounters a memory access exception which is not thrown from the KVM
handle_tprot() instruction-handler function. The access exception comes from
sie64a returning EFAULT, which then passes an addressing exception to the guest.
Unfortunately this does not do the proper PSW fixup (nullifying vs.
suppressing), so the guest will get a fault for the wrong address.
Let's just intercept the tprot instruction all the time to do the right thing
and not go the page fault handler path for standby memory. tprot is only used
by Linux during startup so some exits should be ok.
Without this patch, standby memory cannot be used with KVM.
Signed-off-by: Nick Wang <jfwang@us.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Tested-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
This patch removes the start of a VCPU when delivering a RESTART interrupt.
Interrupt delivery is called from kvm_arch_vcpu_ioctl_run. So the VCPU is
already considered started - no need to call kvm_s390_vcpu_start. This function
will early exit anyway.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
This patch fixes a minor bug when updating the guest debug settings.
We should check the given debug flags, not the already set ones.
Doesn't do any harm but too many (for now unused) flags could be set internally
without error.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
We have all the logic to inject interrupts available in
kvm_s390_inject_vcpu(), so let's use it instead of
injecting irqs manually to the list in sigp code.
SIGP stop is special because we have to check the
action_flags before injecting the interrupt. As
the action_flags are not available in kvm_s390_inject_vcpu()
we leave the code for the stop order code untouched for now.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
The TPROT instruction can be used to check the accessibility of storage
for any kind of logical address. So far, our handler only supported
real addresses. This patch now also enables support for addresses that
have to be translated via DAT first. And while we're at it, change the
code to use the common KVM function gfn_to_hva_prot() to check for the
validity and writability of the memory page.
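The validity/writability check via the common helper looks roughly like
this (a sketch; the DAT translation step and exception injection are
omitted):

    /* Check that the guest frame is backed by a valid host mapping and
     * find out whether it is writable, using the common KVM helper. */
    static int check_guest_page(struct kvm *kvm, gfn_t gfn, bool *writable)
    {
            unsigned long hva = gfn_to_hva_prot(kvm, gfn, writable);

            return kvm_is_error_hva(hva) ? -EFAULT : 0;
    }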
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
This patch adds a function for translating logical guest addresses into
physical guest addresses without touching the memory at the given location.
Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Pull ARM fixes from Russell King:
"The usual random collection of relatively small ARM fixes"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8063/1: bL_switcher: fix individual online status reporting of removed CPUs
ARM: 8064/1: fix v7-M signal return
ARM: 8057/1: amba: Add Qualcomm vendor ID.
ARM: 8052/1: unwind: Fix handling of "Pop r4-r[4+nnn],r14" opcode
ARM: 8051/1: put_user: fix possible data corruption in put_user
ARM: 8048/1: fix v7-M setup stack location
When configuring kprobe events of ftrace with the "stacktrace" option
enabled on arm, no stacktrace is recorded after the kprobe event is
triggered. The root cause is that no save_stack_trace_regs() function is
implemented.
Implement the save_stack_trace_regs() function for arm, so that ftrace
will call this architecture-specific function to record the stacktrace
into the ring buffer.
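A sketch of the new hook, mirroring the existing save_stack_trace_tsk()
pattern as it would sit in arch/arm/kernel/stacktrace.c, where
stack_trace_data, save_trace() and walk_stackframe() already live:

    void save_stack_trace_regs(struct pt_regs *regs, struct stack_trace *trace)
    {
            struct stack_trace_data data;
            struct stackframe frame;

            data.trace = trace;
            data.skip = trace->skip;
            data.no_sched_functions = 0;

            /* Seed the unwinder from the exception-time register state. */
            frame.fp = regs->ARM_fp;
            frame.sp = regs->ARM_sp;
            frame.lr = regs->ARM_lr;
            frame.pc = regs->ARM_pc;

            walk_stackframe(&frame, save_trace, &data);
            if (trace->nr_entries < trace->max_entries)
                    trace->entries[trace->nr_entries++] = ULONG_MAX;
    }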
After this fix, a stacktrace can be recorded, for example:
# mount -t debugfs nodev /sys/kernel/debug
# echo "p:netrx net_rx_action" >> /sys/kernel/debug/tracing/kprobe_events
# echo 1 > /sys/kernel/debug/tracing/events/kprobes/netrx/enable
# echo 1 > /sys/kernel/debug/tracing/options/stacktrace
# echo 1 > /sys/kernel/debug/tracing/tracing_on
# ping 127.0.0.1 -c 1
# echo 0 > /sys/kernel/debug/tracing/tracing_on
# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 12/12 #P:1
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
<------ missing some entries ---------------->
ping-1200 [000] dNs1 667.603250: netrx: (net_rx_action+0x0/0x1f8)
ping-1200 [000] dNs1 667.604738: <stack trace>
=> net_rx_action
=> do_softirq
=> local_bh_enable
=> ip_finish_output
=> ip_output
=> ip_local_out
=> ip_send_skb
=> ip_push_pending_frames
=> raw_sendmsg
=> inet_sendmsg
=> sock_sendmsg
=> SyS_sendto
=> ret_fast_syscall
Signed-off-by: Lin Yongting <linyongting@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Support for ARM710 CPUs was removed in v3.5. Now remove the last code
depending on its Kconfig macro.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We will reach the fixup handler when one thread (say cpu0) causes an
undefined instruction exception while another thread (say cpu1) is unmapping
the page. The fixup handler returns to the next userspace instruction after
the one which caused the undef exception, rather than retrying the same
instruction.
The ARM ARM says that after an undefined instruction exception, the PC will
be pointing to the next instruction, i.e. a +4 offset for ARM and +2 for
Thumb. And there is no correction offset passed to vector_stub in the case
of the undef exception.
File: arch/arm/kernel/entry-armv.S +1085
vector_stub und, UND_MODE
During an undefined instruction exception, in the normal scenario (i.e. when
the ldrt instruction does not cause an abort), after restoring the context in
the VFP hardware, the PC is modified as shown below before jumping to
ret_from_exception, which is in r9.
File: arch/arm/vfp/vfphw.S +169
@ The context stored in the VFP hardware is up to date with this thread
vfp_hw_state_valid:
tst r1, #FPEXC_EX
bne process_exception @ might as well handle the pending
@ exception before retrying branch
@ out before setting an FPEXC that
@ stops us reading stuff
VFPFMXR FPEXC, r1 @ Restore FPEXC last
sub r2, r2, #4 @ Retry current instruction - if Thumb
str r2, [sp, #S_PC] @ mode it's two 16-bit instructions,
@ else it's one 32-bit instruction, so
@ always subtract 4 from the following
@ instruction address.
But if ldrt results in an abort, we reach the fixup handler and return
to ret_from_exception without correcting the pc.
This patch modifies the fixup handler to re-execute the same instruction
which caused the undefined exception.
Signed-off-by: Vinayak Menon <vinayakm.list@gmail.com>
Signed-off-by: Arun KS <getarunks@gmail.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
asm-generic offers an atomic-add based rwsem implementation, which
can avoid the need for heavier, spinlock-based synchronisation on the
fast path.
This patch makes use of the optimised implementation for ARM CPUs.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As we have now removed all instances of the L2C-310 having its cache
size "modified" via platform/SoC code, discourage new cases showing
up by printing a warning.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We no longer need or require the .set_debug method; we handle everything
it used to do via the .write_sec method instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
L2X0_AUX_CTRL_MASK is not useful for PL310s. It would be better if
people thought about their value for this rather than cargo-cult
programming.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the explicit call to l2x0_of_init(), converting to the generic
infrastructure instead.
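On a DT platform the conversion typically reduces to letting the machine
descriptor describe the L2C auxiliary control handling instead of calling
l2x0_of_init() by hand; a sketch, with a placeholder machine name and
compatible string:

    #include <asm/mach/arch.h>

    static const char *const example_dt_compat[] = { "vendor,example", NULL };

    /* The core code now initialises the L2 cache from the device tree
     * using these fields; no explicit l2x0_of_init() in .init_machine. */
    DT_MACHINE_START(EXAMPLE_DT, "Example DT-based machine")
            .l2c_aux_val  = 0,
            .l2c_aux_mask = ~0,
            .dt_compat    = example_dt_compat,
    MACHINE_END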
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The cache size should already be present in the L2 cache auxiliary
control register: it is part of the integration process to configure
the hardware IP. Most platforms get this right, yet still many
cargo-cult program, and assume that they always need specifying to
the L2 cache code. Remove them so we can find out which really need
this.
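The geometry those removed platform values duplicated can be read back
from the auxiliary control register itself; a hedged sketch of the decode
for an L2C-310 (field layout per the TRM, helper name illustrative):

	static unsigned int l2c310_size_from_aux(u32 aux)
	{
		/* Bit 16 selects 16-way vs 8-way associativity; bits 19:17
		 * encode the way size (1 = 16KB, 2 = 32KB, ... = 8KB << n). */
		unsigned int ways = (aux & (1 << 16)) ? 16 : 8;
		unsigned int way_size = 8192 << ((aux >> 17) & 7);

		return ways * way_size;
	}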
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the explicit call to l2x0_of_init(), converting to the generic
infrastructure instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
It is beneficial to have the L2 cache up and running earlier in the
system boot. Not only will this allow for simpler code when we come to
enable some features, but it also means that we get a more accurate
bogomips value for the udelay() loop. Calibrating the loop with the
L2 cache off, and then running with the L2 cache on is not the best
idea.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ux500 can't change the auxiliary control register, so there's no point
passing values to try and modify it to the l2x0 init functions.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The cache size should already be present in the L2 cache auxiliary
control register: it is part of the integration process to configure
the hardware IP. Most platforms get this right, yet still many
cargo-cult program, and assume that they always need specifying to
the L2 cache code. Remove them so we can find out which really need
this.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ux500 can't write to any of the secure registers on the L2C controllers,
so provide a dummy handler which ignores all writes.
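A minimal sketch of such a dummy handler, assuming it gets wired up as
the L2C write_sec hook (the exact hook-up on ux500 is not shown here):

	static void ux500_l2c310_write_sec(unsigned long val, unsigned reg)
	{
		/* The secure L2C-310 registers are owned and already
		 * configured by the secure firmware on ux500, so silently
		 * drop the write instead of faulting. */
	}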
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the explicit call to l2x0_of_init(), converting to the generic
infrastructure instead.
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Stephen Warren <swarren@nvidia.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Reviewed-by: Joseph Lo <josephl@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The cache size should already be present in the L2 cache auxiliary
control register: it is part of the integration process to configure
the hardware IP. Most platforms get this right, yet still many
cargo-cult program, and assume that they always need specifying to
the L2 cache code. Remove them so we can find out which really need
this.
Acked-by: Stephen Warren <swarren@nvidia.com>
Tested-by: Stephen Warren <swarren@nvidia.com>
Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the explicit call to l2x0_of_init(), converting to the generic
infrastructure instead. We can remove the .init_machine as it becomes
the same as the generic version.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The cache size should already be present in the L2 cache auxiliary
control register: it is part of the integration process to configure
the hardware IP. Most platforms get this right, yet still many
cargo-cult program, and assume that they always need specifying to
the L2 cache code. Remove them so we can find out which really need
this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the explicit call to l2x0_of_init(), converting to the generic
infrastructure instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The cache size should already be present in the L2 cache auxiliary
control register: it is part of the integration process to configure
the hardware IP. Most platforms get this right, yet still many
cargo-cult program, and assume that they always need specifying to
the L2 cache code. Remove them so we can find out which really need
this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the explicit call to l2x0_of_init(), converting to the generic
infrastructure instead. This also allows us to eliminate the
.init_machine function as this becomes the same as the generic version.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add better commentary about the L2 cache requirements on these platforms.
Unfortunately, the auxiliary control register is not pre-set to indicate
the correct cache parameters, so we have to manually program these.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the explicit call to l2x0_of_init(), converting to the generic
infrastructure instead. Along with this change, we can delete l2x0.c
from prima2.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The cache size should already be present in the L2 cache auxiliary
control register: it is part of the integration process to configure
the hardware IP. Most platforms get this right, yet still many
cargo-cult program, and assume that they always need specifying to
the L2 cache code. Remove them so we can find out which really need
this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add support for L2 cache controller (PL310) on AM437x SoC.
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Get rid of init call to initialize L2 cache. Instead use the init_early
machine hook. This helps in using the initialization routine across
SoCs without the need for ugly cpu_is_*() checks.
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
L2 cache initialization for OMAP4 redundantly sets the cache policy to
Round-Robin. This is not needed since that's the PL310 default anyway.
Removing it reduces the number of platform-specific aux control
settings.
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Avoid reading directly from the L2 registers in platform code. The L2
code will have already saved the register values itself into the
l2x0_saved_regs structure, so platform code should just move these
values to where they're required.
This is safe because the L2x0 will have been initialised by an early
initcall, whereas the OMAP4 PM code is initialised late.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since we now always enable NS access to the unlock registers, this can
be removed from OMAP4. Remove the NS access bit for the interrupt
registers from OMAP4 as well - nothing in the kernel accesses that yet,
and we can add it in core code when we have the need.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The cache size should already be present in the L2 cache auxiliary
control register: it is part of the integration process to configure
the hardware IP. Most platforms get this right, yet still many
cargo-cult program, and assume that they always need specifying to
the L2 cache code. Remove them so we can find out which really need
this.
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Now that OMAP2 uses the write_sec method, we don't need to enable the L2
cache in OMAP2 specific code; this can be done via the normal mechanisms
in the L2C code. Remove the OMAP2 specific code.
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
With the write_sec method, we no longer need to override the default
L2C disable method, and we no longer need the L2C set_debug method.
Both of these can be handled via the write_sec method.
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the explicit call to l2x0_of_init(), converting to the generic
infrastructure instead. This also allows us to eliminate the
.init_machine function as it is identical to the generic version.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The cache size should already be present in the L2 cache auxiliary
control register: it is part of the integration process to configure
the hardware IP. Most platforms get this right, yet still many
cargo-cult program, and assume that they always need specifying to
the L2 cache code. Remove them so we can find out which really need
this.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the explicit call to l2x0_of_init(), converting to the generic
infrastructure instead.
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the explicit call to l2x0_of_init(), converting to the generic
infrastructure instead. Since the .init_irq method only calls
irqchip_init(), we can remove that too as the generic code will take
care of that.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Now that we handle this in core code, we don't need platforms enabling
the low power modes directly.
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the explicit call to l2x0_of_init(), converting to the generic
infrastructure instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Now that highbank uses the write_sec method, we don't need to enable
the L2 cache in SoC specific code; this can be done via the normal
mechanisms in the L2C code.
Checking with Rob Herring:
> > Can we kill the "highbank_smc1(0x102, 0x1);" here? That means
> > l2x0_of_init() will see the L2 cache disabled, and will try to enable
> > it via the write_sec hook, so it should do the right thing.
>
> Yes, that should work. You should be able to just call l2x0_of_init
> unconditionally. The condition was really to just avoid the smc on
> Midway which does get handled on h/w, but not if running virtualized.
So drop the DT check too. I'm leaving the config check in place
so that if L2 is disabled, the write_sec hook can be optimised away.
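For illustration, a hedged sketch of the write_sec hook this relies on;
highbank_smc1() is the existing secure monitor call, the rest is an
assumption of roughly how the enable gets forwarded:

	static void highbank_l2c310_write_sec(unsigned long val, unsigned reg)
	{
		if (reg == L2X0_CTRL)
			highbank_smc1(0x102, val);	/* monitor enables/disables L2 */
		else
			WARN_ONCE(1, "highbank: ignoring write to secure L2C reg %u\n", reg);
	}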
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
With the write_sec method, we no longer need to override the default L2C
disable method. This can be handled via the write_sec method instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
exynos was unconditionally calling the L2 cache initialisation from an
early_initcall. This breaks multiplatform kernels. Thankfully,
converting to generic l2c initialisation fixes this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The cache size should already be present in the L2 cache auxiliary
control register: it is part of the integration process to configure
the hardware IP. Most platforms get this right, yet still many
cargo-cult program, and assume that they always need specifying to
the L2 cache code. Remove them so we can find out which really need
this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The cache size should already be present in the L2 cache auxiliary
control register: it is part of the integration process to configure
the hardware IP. Most platforms get this right, yet still many
cargo-cult program, and assume that they always need specifying to
the L2 cache code. Remove them so we can find out which really need
this.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the explicit call to l2x0_of_init(), converting to the generic
infrastructure instead. We can remove the explicit machine init too
as this becomes identical to the generic version.
Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the explicit call to l2x0_of_init(), converting to the generic
infrastructure instead. We can remove the explicit machine init too
as this becomes identical to the generic version.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Provide a common assembly implementation for PL310 resume code. Certain
platforms need to re-initialise the L2C cache early as it may preserve
data across an S2RAM cycle, and therefore must be enabled along with the
L1 cache and MMU.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add a hook into the core ARM code to perform L2 cache initialisation
in a platform independent manner. Platforms still get to indicate
their auxiliary control register values and mask, but the
initialisation call will now be made from generic code.
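With the hook in place a platform only has to declare its requirements in
its machine descriptor, for example (the machine name and init_machine
callback below are placeholders):

	DT_MACHINE_START(EXAMPLE_DT, "Example SoC (Device Tree)")
		/* Leave the integration-time aux control settings alone. */
		.l2c_aux_val	= 0,
		.l2c_aux_mask	= ~0,
		.init_machine	= example_init_machine,
	MACHINE_END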
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since we always write to these during the cache initialisation, it is
a good idea to always have the non-secure access bit set. Set it in
core code.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since we now automatically enable early BRESP in core L2C-310 code when
we detect a Cortex-A9, we don't need platforms/SoCs to set this bit
explicitly. Instead, they should seek to preserve the value of bit 30
in the auxiliary control register.
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The AXI bus protocol requires that a write response should only be
sent back to the master when the last write has been accepted. Early
BRESP allows the L2C-310 to send the write response as soon as the
store buffer accepts the write address.
Cortex-A9 processors can signal to the L2C-310 that they wish to be
notified early, and if this optimisation is enabled, the L2C-310 can
signal an early write response.
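A hedged sketch of the core-code policy this describes; treat the exact
helper and macro names as assumptions:

	static void l2c310_fixup_early_bresp(u32 *aux)
	{
		/* Early BRESP only helps when the masters are Cortex-A9s that
		 * signal their intent, so enable it automatically there. */
		if (read_cpuid_part() == ARM_CPU_PART_CORTEX_A9)
			*aux |= L310_AUX_CTRL_EARLY_BRESP;
	}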
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move the L2 cache register saving to a more sensible location - after
the cache has been enabled, and fixups have been run. We move the
saving of the auxiliary control register into the ->save function as
well which makes everything operate in a sane and maintainable way.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We have a mixture of different devices with different register layouts,
but we group all the bits together in an opaque mess. Split them out
into those which are L2C-310 specific and ones which refer to earlier
devices. Provide full auxiliary control register definitions.
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rather than having SoCs work around L2C erratum themselves, move them
into core code. This erratum affects the double linefill feature which
needs to be disabled for r3p0 to r3p1-50rel0.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When Linux is running in the non-secure world, any write to a secure
L2C register will generate an abort. Platforms normally have to call
firmware to work around this. Provide a hook for them to intercept
any L2C secure register write.
l2c_write_sec() avoids writes to secure registers which are already set
to the appropriate value, thus avoiding the overhead of needlessly
calling into the secure monitor.
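A sketch of the helper's behaviour as described above (close to, though
not necessarily identical to, the real function):

	static void l2c_write_sec(unsigned long val, void __iomem *base, unsigned reg)
	{
		if (val == readl_relaxed(base + reg))
			return;		/* already correct: skip the monitor call */

		if (outer_cache.write_sec)
			outer_cache.write_sec(val, reg);
		else
			writel_relaxed(val, base + reg);
	}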
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move the L2C-310 errata configuration options to arch/arm/mm/Kconfig
alongside the option which enables support for this device.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move the way size calculation data (base of way size) out of the
switch statement into the provided initialisation data.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rather than assuming these are always 8-way, it can be decoded from the
auxiliary control register in the same manner as for the L2C-210.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rather than decoding this from the ID register, store it in the
l2c_init_data structure. This simplifies things some more, and
allows us to better provide further details as to how we're
driving the cache. We print the cache ID value anyway should we
need to precisely identify the cache hardware.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
non-OF initialisation has never been used with any cache controller
which isn't an ARM cache controller, so we can safely get rid of the
old (and buggy) l2x0_*-based operations structure.
This is also the last reference to:
- l2x0_clean_line()
- l2x0_inv_line()
- l2x0_flush_line()
- l2x0_flush_all()
- l2x0_clean_all()
- l2x0_inv_all()
- l2x0_inv_range()
- l2x0_clean_range()
- l2x0_flush_range()
- l2x0_enable()
- l2x0_resume()
so kill those functions too.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The Broadcom L2C-310 devices use ARM's L2C-310 r2p3 or later. These
require no errata workarounds, and so we can directly call the l2c210
functions from their methods.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The L2C-220 is different from the L2C-210 and L2C-310 in that every
operation is a background operation: this means we have to use
spinlocks to protect all operations, and we have to wait for every
operation to complete.
Should a second operation be attempted while a previous operation
is in progress, the response will be an imprecise abort.
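Sketched, the pattern every L2C-220 operation has to follow looks roughly
like this (variable names assumed from the surrounding code):

	static void l2c220_op_way(void __iomem *base, unsigned reg)
	{
		unsigned long flags;

		raw_spin_lock_irqsave(&l2x0_lock, flags);
		writel_relaxed(l2x0_way_mask, base + reg);
		while (readl_relaxed(base + reg) & l2x0_way_mask)
			cpu_relax();	/* the background op must finish first */
		raw_spin_unlock_irqrestore(&l2x0_lock, flags);
	}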
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Where no errata affect the L2C-310 handlers, they are functionally
equivalent to L2C-210. Re-use the L2C-210 handlers for the L2C-310
part.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Implement L2C-310 erratum 588369 by overriding the invalidate range
and flush range methods in the outer_cache operations structure.
This allows us to sensibly contain the erratum code in one place
without affecting other locations or implementations.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Implement L2C-310 erratum 727915 by overriding the flush_all method
in the outer_cache operations structure. This allows us to sensibly
contain the erratum code in one place without affecting other
locations or implementations.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add L2C-210 specific cache operation handlers. These are tailored to
the requirements of the L2C-210 cache controller, which doesn't
require any workarounds. We avoid using the way operations during
normal operation, which means we can avoid locking: the only time
we use the way operations are during initialisation, and when
disabling the cache.
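A simplified sketch of such a handler; the real code must additionally
clean partially covered lines at the range boundaries, and the register
offsets used are the standard L2X0 ones:

	static void l2c210_inv_range(unsigned long start, unsigned long end)
	{
		void __iomem *base = l2x0_base;

		start &= ~(CACHE_LINE_SIZE - 1);
		while (start < end) {
			writel_relaxed(start, base + L2X0_INV_LINE_PA);
			start += CACHE_LINE_SIZE;
		}
		writel_relaxed(0, base + L2X0_CACHE_SYNC);	/* drain; no locking needed */
	}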
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move the pl310_set_debug() into the l2c-310 code area, and don't hide
it with ifdefs. Rename it to l2c310_set_debug().
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The l2x0 unlocking code is only called from l2x0_enable() now, so move
the logic entirely into that function and simplify it.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rename the pl310 save/resume functions to have an l2c310 prefix - this
is its official name. Use a local cached copy of the l2x0_base
virtual address, and also realise that many of the resume function
tails are the same as the enable functions, so make a call to the
enable function instead of duplicating that code.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add the save/resume code hooks to the non-OF implementations as well.
There's no reason for the non-OF implementations to be any different
from the OF implementations.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rather than putting quirk handling in __l2c_init(), move it out to a
separate function which individual implementations can specify. This
helps to localise the quirks to those implementations which require
them.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rather than having this hacked into the OF initialisation function, we
can handle this via the enable function instead. While here, clean
up that code and comments a little.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Avoid unnecessary writes to the auxiliary control register if the
register already contains the required value. This allows us to
avoid invoking the platform's secure monitor code unnecessarily.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
We should write the auxiliary control register before unlocking: the
write may be necessary to enable non-secure access to the lock
registers.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Providing an enable method gives L2 cache controllers a chance to do
special handling at enable time. This allows us to remove a hack in
l2x0_unlock() for Marvell Aurora L2 caches.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Back in the mists of time, someone decided that it would be a good idea
to group like functions together - so all the save functions in one
place, all the resume functions in another, all the OF parsing functions
some place else.
This makes it difficult to get an overview of what a particular
implementation is doing - grouping an implementation's specific functions
together makes more sense, because you can see what it's doing without
the clutter of other implementations.
Organise it according to implementation.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There's no reason this functionality should be specific to DT, so move
it into the common initialisation function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Pass the iomem address into this function so we don't have to keep
accessing it from a global.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rather than having a boolean and other tricks to disable some bits of
l2x0_init(), split this function into two parts: a common part shared
between OF and non-OF, and the non-OF part.
The common part can take a block of function pointers, and the cache
ID (to cope with Aurora's DT-specified ID). Eliminate the redundant
setting of l2x0_base in the OF case, moving it to the non-OF init
function.
This allows us to localise the OF-specific initialisation handling
from the non-OF handling.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The revision namespace is specific to the L2 cache part, so don't name
these with generic identifiers; use a part-specific identifier.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
cache_wait_way() is actually used to wait for a particular mask to
report clear; it's not really got much to do with cache ways at all.
Indeed, it gets used to wait for the C bit to clear on older caches.
Rename this with a more generic function name which better reflects
its purpose: l2c_wait_mask().
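The renamed helper amounts to no more than a masked poll, roughly:

	static inline void l2c_wait_mask(void __iomem *reg, unsigned long mask)
	{
		while (readl_relaxed(reg) & mask)
			cpu_relax();
	}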
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Provide a generic helper function for way based operations. These are
always background operations, and thus have to be waited for before a
new operation is commenced. This helper extracts that requirement from
several locations in the code.
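Sketched, the helper simply issues the way mask and then waits for the
hardware to report completion (building on l2c_wait_mask() above):

	static void __l2c_op_way(void __iomem *reg)
	{
		writel_relaxed(l2x0_way_mask, reg);	/* start the background operation */
		l2c_wait_mask(reg, l2x0_way_mask);	/* ...and wait for it to finish */
	}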
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Split the cache unlock code out of l2x0_unlock(). We want to be able
to re-use this functionality later.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Provide a generic function which always calls the set_debug method.
This will be used later in the series as some work-arounds require
that the debug register be written.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rename a few things to help distinguish their function(s):
l2x0_of_data -> l2c_init_data
setup -> of_parse
add of_ prefix to OF specific data
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove NULL initialisers, make these all __initconst structures, and
order their members in the same order as the structure declaration.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
"Fix CoW regression for transparent hugepages by routing set_pmd_at to
set_pte_at, which correctly handles PTE_WRITE and will mark the
resulting table entry as read-only where appropriate"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: fix pmd_write CoW brokenness
Update sunxi_defconfig and multi_v7 with all the latest Allwinner
additions.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Merge tag 'omap-for-v3.16/dt-part3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/dt
Merge "omap dt fixes and and clocks for v3.16 merge window" from Tony Lindgren:
Most likely the last pull request from me for omap changes for
v3.16 that's dts fixes for clocks and enabling few features
that were still being discussed earlier:
- A bunch of omap clock related dts fixes queued by Tero Kristo.
- Enable parallel nand on am437x that was not merged earlier as
I requested more information about the muxing for it. And
we need to also enable ecc hardware support for am43xx.
- Enable the modem support for n900 that was dropped earlier
because we had to fix the related hwmod entry first with patch
ARM: OMAP2+: Fix ssi hwmod entry to allow idling.
- And finally, add the omap2 clock dts files. These will allow
us to enable the dt clocks and drop the legacy clocks for omap2
with a follow-up patch once the related clock driver binding
changes are merged.
* tag 'omap-for-v3.16/dt-part3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: omap2 clock data
ARM: dts: am437x-gp-evm: add support for parallel NAND flash
ARM: OMAP2+: gpmc: enable BCH_HW ecc-scheme for AM43xx platforms
ARM: dts: omap3 a83x: fix duplicate usb pin config
ARM: dts: omap3: set mcbsp2 status
ARM: dts: omap3-n900: Add modem support
ARM: dts: omap3-n900: Add SSI support
ARM: OMAP2+: Fix ssi hwmod entry to allow idling
ARM: dts: AM4372: clk: efuse based crystal frequency detect
ARM: dts: am43xx-clocks.dtsi: add ti,set-rate-parent to display clock path
ARM: dts: omap5-clocks.dtsi: add ti,set-rate-parent to dss_dss_clk
ARM: dts: omap4: add twd clock to DT
ARM: dts: omap54xx-clocks: Correct abe_iclk clock node
ARM: dts: omap54xx-clocks: remove the autoidle properties for clock nodes
ARM: dts: am43x-clock: add tbclk data for ehrpwm
ARM: dts: am33xx-clock: Fix ehrpwm tbclk data
ARM: dts: set 'ti,set-rate-parent' for dpll4_m5 path
ARM: dts: use ti,fixed-factor-clock for dpll4_m5x2_mul_ck
ARM: dts: am43xx-clocks: use ti,fixed-factor-clock for dpll_per_clkdcoldo
Signed-off-by: Olof Johansson <olof@lixom.net>
When targeting ARCH_MULTIPLATFORM, we may include support for SoCs with
PCI-capable devices (e.g. mach-virt with virtio-pci).
This patch allows PCI support to be selected for these SoCs by selecting
CONFIG_MIGHT_HAVE_PCI when CONFIG_ARCH_MULTIPLATFORM=y and removes the
individual selections from multi-platform enabled SoCs.
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
The memory alias support has been removed since a1f4d39500 (KVM: Remove
memory alias support). So remove unalias_gfn from the MIPS port.
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- From Daniel Lezcano:
This patchset relies on the cpu_pm notifier to initiate the
powerdown sequence operations from pm.c instead of cpuidle.c.
Thus the cpuidle driver is no longer dependent on arch-specific
code as everything is called from the pm.c file.
Note, this is based on tags/exnos-mcpm and tags/samsung-clk
Merge tag 'exynos-cpuidle' of http://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into next/drivers
Merge "Samsung exynos-cpuidle updates for v3.16" from Kukjin Kim:
- From Daniel Lezcano:
This patchset relies on the cpu_pm notifier to initiate the
powerdown sequence operations from pm.c instead of cpuidle.c.
Thus the cpuidle driver is no longer dependent on arch-specific
code as everything is called from the pm.c file.
* tag 'exynos-cpuidle' of http://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung: (94 commits)
ARM: EXYNOS: Fix kernel panic when unplugging CPU1 on exynos
ARM: EXYNOS: Move the driver to drivers/cpuidle directory
ARM: EXYNOS: Cleanup all unneeded headers from cpuidle.c
ARM: EXYNOS: Pass the AFTR callback to the platform_data
ARM: EXYNOS: Move S5P_CHECK_SLEEP into pm.c
ARM: EXYNOS: Move the power sequence call in the cpu_pm notifier
ARM: EXYNOS: Move the AFTR state function into pm.c
ARM: EXYNOS: Encapsulate the AFTR code into a function
ARM: EXYNOS: Disable cpuidle for exynos5440
ARM: EXYNOS: Encapsulate boot vector code into a function for cpuidle
ARM: EXYNOS: Pass wakeup mask parameter to function for cpuidle
ARM: EXYNOS: Remove ifdef for scu_enable in pm
ARM: EXYNOS: Move scu_enable in the cpu_pm notifier
ARM: EXYNOS: Use the cpu_pm notifier for pm
ARM: EXYNOS: Fix S5P_WAKEUP_STAT call for cpuidle
ARM: EXYNOS: Move some code inside the idle_finisher for cpuidle
ARM: EXYNOS: Encapsulate register access inside a function for pm
ARM: EXYNOS: Change function name prefix for cpuidle
ARM: EXYNOS: Use cpuidle_register
ARM: EXYNOS: Prevent forward declaration for cpuidle
...
Signed-off-by: Olof Johansson <olof@lixom.net>