Andy will need the following scheduler fix for the PCID series:
252d2a4117: sched/core: Idle_task_exit() shouldn't use switch_mm_irqs_off()
So do a cross-merge.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull Xtensa fixes from Max Filippov:
- don't use linux IRQ #0 in legacy irq domains: fixes timer interrupt
assignment when its hardware IRQ # is 0 and the kernel is built w/o
device tree support
- reduce reservation size for double exception vector literals from 48
to 20 bytes: fixes build on cores with small user exception vector
- cleanups: use kmalloc_array instead of kmalloc in simdisk_init and
seq_puts instead of seq_printf in c_show.
* tag 'xtensa-20170612' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: don't use linux IRQ #0
xtensa: reduce double exception literal reservation
xtensa: ISS: Use kmalloc_array() in simdisk_init()
xtensa: Use seq_puts() in c_show()
Pull s390 fixes from Martin Schwidefsky:
- A fix for KVM to avoid kernel oopses in case of host protection
faults due to runtime instrumentation
- A fix for the AP bus to avoid dead devices after unbind / bind
- A fix for a compile warning merged from the vfio_ccw tree
- Updated default configurations
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: update defconfig
s390/zcrypt: Fix blocking queue device after unbind/bind.
s390/vfio_ccw: make some symbols static
s390/kvm: do not rely on the ILC on kvm host protection fauls
This adds code to save the values of three SPRs (special-purpose
registers) used by userspace to control event-based branches (EBBs),
which are essentially interrupts that get delivered directly to
userspace. These registers are loaded up with guest values when
entering the guest, and their values are saved when exiting the
guest, but we were not saving the host values and restoring them
before going back to userspace.
On POWER8 this would only affect userspace programs which explicitly
request the use of EBBs and also use the KVM_RUN ioctl, since the
only source of EBBs on POWER8 is the PMU, and there is an explicit
enable bit in the PMU registers (and those PMU registers do get
properly context-switched between host and guest). On POWER9 there
is provision for externally-generated EBBs, and these are not subject
to the control in the PMU registers.
Since these registers only affect userspace, we can save them when
we first come in from userspace and restore them before returning to
userspace, rather than saving/restoring the host values on every
guest entry/exit. Similarly, we don't need to worry about their
values on offline secondary threads since they execute in the context
of the idle task, which never executes in userspace.
Fixes: b005255e12 ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08)
Cc: stable@vger.kernel.org # v3.14+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Replace read tick function pointers with the new hot-patched get_tick().
This optimizes the performance of functions such as: sched_clock()
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In Linux it is possible to configure printk() to output a timestamp next to
every line. This is very useful to determine the slow parts of the boot
process, and also to avoid regressions, as boot time is visible to
everyone.
Also, there are scripts that change these time stamps to intervals.
However, on larger machines these timestamps start appearing many seconds,
and even minutes into the boot process. This patch gets stick-frequency
property early from OpenBoot, and uses its value to initialize time stamps
before the first printk() messages are printed.
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch prepares the code for early boot time stamps by making it more
modular.
- init_tick_ops() to initialize struct sparc64_tick_ops
- new sparc64_tick_ops operation get_frequency() which returns a
frequency
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In sched_clock() we now have three loads:
- Function pointer
- quotient for multiplication
- offset
However, it is possible to improve performance substantially, by
guaranteeing that all three loads are from the same cacheline.
By moving these three values first in sparc64_tick_ops, and by having
tick_operations 64-byte aligned we guarantee this.
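As a rough illustration of the layout idea, the sketch below puts the three hot members first in a 64-byte aligned object and checks that they fall in one cache line. The struct name, member names, and the `CACHELINE_BYTES` constant are hypothetical stand-ins, not the actual sparc64_tick_ops definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64 /* assumed line size, matching the 64-byte alignment */

/* Hypothetical stand-in for sparc64_tick_ops; member names are illustrative. */
struct tick_ops_sketch {
    /* Hot members, read on every clock read, kept first. */
    uint64_t (*get_tick)(void);
    uint64_t ticks_per_nsec_quotient;
    uint64_t offset;
    /* Cold members, used only during setup. */
    void (*init_tick)(void);
    const char *name;
};

static uint64_t fake_tick(void)
{
    return 1024;
}

/* The object itself is aligned to the start of a cache line. */
static _Alignas(CACHELINE_BYTES) struct tick_ops_sketch tick_operations = {
    .get_tick = fake_tick,
    .ticks_per_nsec_quotient = 1 << 10,
    .offset = 0,
};

/* Compile-time check: the last hot member ends inside the first line. */
static_assert(offsetof(struct tick_ops_sketch, offset) + sizeof(uint64_t)
              <= CACHELINE_BYTES, "hot members must fit in one cache line");

/* Run-time check: first and last hot byte map to the same cache line. */
static int hot_members_share_cacheline(const struct tick_ops_sketch *ops)
{
    uintptr_t first = (uintptr_t)&ops->get_tick;
    uintptr_t last = (uintptr_t)&ops->offset + sizeof(ops->offset) - 1;

    return first / CACHELINE_BYTES == last / CACHELINE_BYTES;
}
```

Without the alignment on the object, the same member order could still straddle two lines depending on where the linker places it, which is why both conditions are needed.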
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On most platforms, time is shown from the beginning of boot. This patch is
adding offset to sched_clock() for SPARC, to also show time from 0.
This means we will have one more load, but we saved one in an earlier patch.
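A minimal sketch of the offset idea, assuming a free-running counter that is already non-zero when the clock is initialized. All names here (`fake_counter`, `clock_init_offset`, `sched_clock_sketch`) are hypothetical, and the shift of 10 is only taken from the `srlx ... 0xa` seen in the sched_clock() disassembly in this series.

```c
#include <stdint.h>

#define NSEC_PER_CYC_SHIFT 10 /* assumed fixed-point shift */

/* Pretend hardware counter: already large by the time we initialize. */
static uint64_t fake_counter = 5000000;
/* Quotient chosen so that one tick equals one nanosecond in this sketch. */
static uint64_t quotient = 1 << NSEC_PER_CYC_SHIFT;
/* Clock value recorded at init, subtracted from every later reading. */
static uint64_t boot_offset_ns;

static uint64_t read_tick(void)
{
    return fake_counter;
}

static void advance_ticks(uint64_t ticks)
{
    fake_counter += ticks;
}

/* Counter converted to nanoseconds, without any offset applied. */
static uint64_t raw_clock_ns(void)
{
    return (read_tick() * quotient) >> NSEC_PER_CYC_SHIFT;
}

/* Record the current reading so that later readings start from 0. */
static void clock_init_offset(void)
{
    boot_offset_ns = raw_clock_ns();
}

/* The extra load mentioned above is the read of boot_offset_ns. */
static uint64_t sched_clock_sketch(void)
{
    return raw_clock_ns() - boot_offset_ns;
}
```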
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In timer_64.c the tick functions are accessed via a pointer (tick_ops), so
every time the clock is read there is one extra load to get to the function.
This patch optimizes that by accessing the function pointers by value.
Current sched_clock():
sethi %hi(0xb9b400), %g1
ldx [ %g1 + 0x250 ], %g1 ! <tick_ops>
ldx [ %g1 ], %g1
call %g1
nop
sethi %hi(0xb9b400), %g1
ldx [ %g1 + 0x300 ], %g1 ! <timer_ticks_per_nsec_quotient>
mulx %o0, %g1, %g1
rett %i7 + 8
srlx %g1, 0xa, %o0
New sched_clock():
sethi %hi(0xb9b400), %g1
ldx [ %g1 + 0x340 ], %g1
call %g1
nop
sethi %hi(0xb9b400), %g1
ldx [ %g1 + 0x378 ], %g1
mulx %o0, %g1, %g1
rett %i7 + 8
srlx %g1, 0xa, %o0
Before three loads, now two loads.
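The difference between the two listings can be sketched in C as follows. The names are hypothetical and this is not the timer_64.c code; it only shows why a global pointer to an ops structure costs one more load than a global ops structure.

```c
#include <stdint.h>

/* Hypothetical ops structure with a single function pointer. */
struct ops_sketch {
    uint64_t (*get_tick)(void);
};

static uint64_t fake_tick(void)
{
    return 42;
}

/* Before: a global pointer to the ops structure.
 * Reading the clock loads the pointer, then the member, then calls. */
static struct ops_sketch ops_storage = { .get_tick = fake_tick };
static struct ops_sketch *tick_ops_ptr = &ops_storage;

static uint64_t read_via_pointer(void)
{
    return tick_ops_ptr->get_tick();
}

/* After: the ops structure itself is the global.
 * Its address is a link-time constant, so only the member is loaded. */
static struct ops_sketch tick_ops_val = { .get_tick = fake_tick };

static uint64_t read_via_value(void)
{
    return tick_ops_val.get_tick();
}
```

In a small test program like this an optimizing compiler may fold both paths into a direct call; the saving shows up when the objects are externally visible and writable, as in the kernel.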
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to use the serial and ethernet USB gadget support on the
Raspberry Pi Zero, we also need to enable the PHY driver, kernel module
and OTG support.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Since 635c21068cf ("usb: dwc2: gadget: Fix WARN_ON messages
during FIFO init") the dwc2 driver is able to handle OTG and gadget
mode for bcm2835. So enable this feature for the Raspberry Pi Zero.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
In order to use dwc2 in OTG or gadget mode the USB PHY should be
specified. Since there is no bcm283x USB PHY driver use the generic
one.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
The Raspberry Pi Zero also supports OTG mode. So provide a dtsi file
to configure the USB interface accordingly. The fifo sizes are optimized
for device endpoint 6 and 7 with the maximum of 768.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Cygnus has a single amac controller connected to the B53 switch with 2
PHYs. On the BCM911360_EP platform, those two PHYs are connected to the
external ethernet jacks.
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Add thermal support via the ns-thermal driver and create a single
thermal zone for the entire SoC.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
This loads the VC4 driver on the 911360_entphn platform (with the
corresponding series sent to dri-devel), which is supported by master
of the Mesa tree.
Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Northstar devices have an MDIO bus that may have various PHYs attached.
A common example is USB 3.0 PHY (that doesn't have an MDIO driver yet).
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
This uses CPU thermal sensor available on every Northstar chipset to
monitor temperature. We don't have any cooling or throttling so only a
critical trip was added.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Re-organise the perf accounting for fault handling in preparation for
enabling handling of hardware poison faults in subsequent commits. The
change updates perf accounting to be in line with the behaviour on
x86.
With this update, the perf fault accounting -
* Always report PERF_COUNT_SW_PAGE_FAULTS
* Doesn't report anything else for VM_FAULT_ERROR (which includes
hwpoison faults)
* Reports PERF_COUNT_SW_PAGE_FAULTS_MAJ if it's a major
fault (indicated by VM_FAULT_MAJOR)
* Otherwise, reports PERF_COUNT_SW_PAGE_FAULTS_MIN
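The accounting rules listed above can be sketched as a small decision function. The flag values and the counter structure are hypothetical; in the kernel, VM_FAULT_ERROR is a mask of several error bits and the counts are reported through perf software events rather than plain integers.

```c
/* Hypothetical flag bits standing in for VM_FAULT_ERROR and VM_FAULT_MAJOR. */
enum {
    SK_FAULT_ERROR = 1u << 0,
    SK_FAULT_MAJOR = 1u << 1,
};

/* Plain counters standing in for the three perf software events. */
struct fault_counts {
    unsigned int total; /* PERF_COUNT_SW_PAGE_FAULTS */
    unsigned int major; /* PERF_COUNT_SW_PAGE_FAULTS_MAJ */
    unsigned int minor; /* PERF_COUNT_SW_PAGE_FAULTS_MIN */
};

static void account_fault(unsigned int fault, struct fault_counts *c)
{
    /* Every fault is counted, whatever its outcome. */
    c->total++;

    /* Error results (including hwpoison) report nothing further. */
    if (fault & SK_FAULT_ERROR)
        return;

    /* Successful faults are split into major and minor. */
    if (fault & SK_FAULT_MAJOR)
        c->major++;
    else
        c->minor++;
}
```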
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Add VM_FAULT_HWPOISON[_LARGE] handling to the arm64 page fault
handler. Handling of VM_FAULT_HWPOISON[_LARGE] is very similar
to VM_FAULT_OOM, the only difference is that a different si_code
(BUS_MCEERR_AR) is passed to user space and si_addr_lsb field is
initialized.
Signed-off-by: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
(fix new __do_user_fault call-site)
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Tested-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When memory failure is enabled, a poisoned hugepage pte is marked as a
swap entry. huge_pte_offset() does not return the poisoned page table
entries when it encounters PUD/PMD hugepages.
This behaviour of huge_pte_offset() leads to errors such as the one below
when munmap is called on poisoned hugepages.
[ 344.165544] mm/pgtable-generic.c:33: bad pmd 000000083af00074.
Fix huge_pte_offset() to return the poisoned pte which is then
appropriately handled by the generic layer code.
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Woods <dwoods@mellanox.com>
Tested-by: Manoj Iyer <manoj.iyer@canonical.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Cache support is an optional feature in M-class cores, so DminLine or
IminLine of the Cache Type Register is zero if caches are not implemented,
but we check the whole CTR, which has other features encoded in it.
Let's be more precise and check for DminLine and IminLine of CTR
before we set cacheid.
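A minimal sketch of a field-level check, assuming the ARMv7-M CTR layout with IminLine in bits [3:0] and DminLine in bits [19:16]. The macro and function names are hypothetical and this is not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed field positions: IminLine = CTR[3:0], DminLine = CTR[19:16]. */
#define CTR_IMINLINE(ctr) ((ctr) & 0xfu)
#define CTR_DMINLINE(ctr) (((ctr) >> 16) & 0xfu)

/* Whole-register test: misfires when only unrelated fields are set. */
static bool ctr_nonzero(uint32_t ctr)
{
    return ctr != 0;
}

/* Field-level test: looks only at the two minimum line size fields. */
static bool caches_implemented(uint32_t ctr)
{
    return CTR_IMINLINE(ctr) != 0 || CTR_DMINLINE(ctr) != 0;
}
```

A value such as 0x80000000 has only high bits set, so the whole-register test reports a cache while the field-level test does not.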
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
When both CONFIG_ARM_LPAE=y and CONFIG_VMSPLIT_3G_OPT=y are enabled, which
means using PAGE_OFFSET=0xB0000000 with ARM_LPAE, the kernel fails to boot
and stops after decompression:
Starting kernel ...
Uart base = 0x20001000
watchdog reg = 0x20013000
dtb addr = 0x80840308
Uncompressing Linux... done, booting the kernel.
ARM_LPAE only supports 3:1, 2:2 and 1:3 splits for TTBR1, as mentioned in:
http://elinux.org/images/6/6a/Elce11_marinas.pdf - p16
So we should make VMSPLIT_3G_OPT depend on !ARM_LPAE to avoid triggering
this bug.
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Commit 06a4b6d009 ("ARM: 8677/1: boot/compressed: fix decompressor
header layout for v7-M") fixed an issue in the layout of the header
of the compressed kernel image that was caused by the assembler
emitting narrow opcodes for 'mov r0, r0', and for this reason, the
mnemonic was updated to use the W() macro, which will append the .w
suffix (which forces a wide encoding) if required, i.e., when building
the kernel in Thumb2 mode.
However, this failed to take into account that on Thumb2 kernels built
for CPUs that are also ARM capable, the entry point is entered in ARM
mode, and so the instructions emitted here will be ARM instructions
that only exist in a wide encoding to begin with, which is why the
assembler rejects the .w suffix here and aborts the build with the
following message:
head.S: Assembler messages:
head.S:132: Error: width suffixes are invalid in ARM mode -- `mov.w r0,r0'
So replace the W(mov) with separate ARM and Thumb2 instructions, where
the latter will only be used for THUMB2_ONLY builds.
Fixes: 06a4b6d009 ("ARM: 8677/1: boot/compressed: fix decompressor ...")
Reported-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
We've already got a few conflicts and upcoming work depends on some of the
changes that have gone into mainline as regression fixes for this series.
Pull in 4.12-rc5 to resolve these conflicts and make it easier on downstream
trees to continue working on 4.13 changes.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Rename a couple of the struct psw_bits members so it is more obvious
what they are for. Initially I thought using the single character
names from the PoP would be sufficient and obvious, but admittedly
that is not true.
The current implementation is not easy to use, since one has to look into
the source file to figure out which member represents the 'per' bit
(which is the 'r' member).
Therefore rename the members to sane names that are identical to the
uapi psw mask defines:
r -> per
i -> io
e -> ext
t -> dat
m -> mcheck
w -> wait
p -> pstate
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The address space enums that must be used when modifying the address
space part of a psw with the psw_bits() macro can easily be confused
with the psw defines that are used to mask and compare directly the
mask part of a psw.
We have e.g. PSW_AS_PRIMARY vs PSW_ASC_PRIMARY.
To avoid confusion rename the PSW_AS_* enums to PSW_BITS_AS_*.
In addition also rename the PSW_AMODE_* enums, so they also follow the
same naming scheme: PSW_BITS_AMODE_*.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Right now the kernel uses the primary address space until finally the
switch to the correct home address space will be done when the idle
PSW will be loaded within psw_idle().
Correct this and simply use the home address space when DAT is enabled
for the first time.
This doesn't really fix a bug, but fixes odd behavior.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This reverts the two commits
7afbeb6df2 ("s390/ipl: always use load normal for CCW-type re-IPL")
0f7451ff3a ("s390/ipl: use load normal for LPAR re-ipl")
The two commits did not take into account that behavior of standby
memory changes fundamentally if the re-IPL method is changed from
Load Clear to Load Normal.
In case of the old re-IPL clear method all memory that was initially
in standby state will be put into standby state again within the
re-IPL process. Or in other words: memory that was brought online
before a re-IPL will be offline again after a reboot.
Given that we use different re-IPL methods depending on the hypervisor
and CCW-type vs SCSI re-IPL it is not easy to tell in advance when and
why memory will stay online or will be offline after a re-IPL.
This does also have other side effects, since memory that is online
from the beginning will be in ZONE_NORMAL by default vs ZONE_MOVABLE
for memory that is offline.
Therefore, before the change, a user could online and offline memory
easily since standby memory was always in ZONE_NORMAL. After the
change, and a re-IPL, this depended on which memory parts were online
before the re-IPL.
From a usability point of view the current behavior is more than
suboptimal. Therefore revert these changes until we have a better
solution and get back to a consistent behavior. The bad thing about
this is that the time required for a re-IPL will be significantly
increased for configurations with several 100GB or 1TB of memory.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove raw stack dumps that are printed before call traces in case of
a warning, or the 'l' sysrq trigger (show a stack backtrace for all
active CPUs).
Besides the fact that a raw stack dump should not be shown for the 'l'
sysrq trigger, the value of the dump is close to zero. That's also why we
stopped printing it in case of a panic ages ago. That it is still printed
on warnings is just a leftover. So get rid of this completely.
The following won't be printed anymore with this change:
Stack:
00000000bbc4fbc8 00000000bbc4fc58 0000000000000003 0000000000000000
00000000bbc4fcf8 00000000bbc4fc70 00000000bbc4fc70 0000000000000020
000000007fe00098 00000000bfe8be00 00000000bbc4fe94 000000000000000a
000000000000000c 00000000bbc4fcc0 0000000000000000 0000000000000000
000000000095b930 0000000000113366 00000000bbc4fc58 00000000bbc4fca0
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Move the CONFIG_PCI device so that ioremap and iounmap are always
available. This looks safe as there's nothing PCI specific in the
implementation of these functions.
I have designs to use these functions in scatterlist.c where they'd likely
never be called without CONFIG_PCI set, but this is needed to compile
such changes.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Command 'perf list pmu' displays events which contain
an invalid string "(null)=xxx", where xxx is the pmu event
name, for example:
cpum_cf/AES_BLOCKED_CYCLES,(null)=AES_BLOCKED_CYCLES/
This is not correct, the invalid string should not be
displayed at all.
It is caused by an obsolete term in the
sysfs attribute file for each s390 CPUMF counter event.
Reading from the sysfs file also displays the event
name.
Fix this by omitting the event name. This patch makes
s390 CPUMF sysfs files consistent with other platforms.
This is an interface change between user and kernel
but does not break anything. Reading from a counter event
sysfs file should only list terms mentioned in the
/sys/bus/event_source/devices/<cpumf>/format directory.
Name is not listed.
Reported-by: Zvonko Kosic <zvonko.kosic@de.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>