1. also provide the floating point registers via sync regs
2. Separate out intruction vs. data accesses
3. Fix program interrupts in some cases
4. Documentation fixes
5. dirty log improvements for huge guests
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJWvJjcAAoJEBF7vIC1phx8VZ8P/3eX+dJU4yPTn44/A8i/hs7V
jQNmnnZfu26oc2bE9X5L4U5flTccP2BZwgJiiOvsM44fWUMbW/4d8RvwkvlN9Iwx
lQKHsjUUUdn3olxLuR4xWstHw/KpubFaeMl+b4VsSx6N2TRhodJBPpao+mrTQUq6
owuzAatrTzyBFnr8dadRjm30t5Lcd8rT+JKpnw+Bj7OJiCo4rnTS9aFF5chkMI+j
6Sri8bKBMw2+rjPF5kL8u3+iigfowcEWV4lout3NyFukCBBJY29nYjfVvl6AYHyX
97E4BBTInpGra3Wv4kmbvotR/epcXTqHVJgx/7MbWU3+G63dlx+Rt77e82n18sRf
/bV4hpfXeRTLfUYmzD5k7rVZrMKplHoCz5gNNh7pPf5Kyzy6orovUH2Sw8GgtTaP
28kiHRT8LR4GwPc2fZ6HOz+PRXnKdScWcUYha8rJpnxF8YJ2G6WEhREOLJZFqOkf
MBTJrbhO8immy4X7473NPmTbNO/1W0m56mD9HrVjvxHKYcot9LfxbVRZDv2+ledF
5e5jpiLQvsHC9e6HZ9ByAaK9KUuNeSMGs/EHb+3MTNDXiqQW3KVxgK/JNO/mG2X1
67x+cGyPWlk+cdWOtfg8W4J8ZCgRpcuMpgrO4TKd0VK13J/g6UqSqSFgJYyTqrgd
glrT5mABhJ9QjTSri05D
=xy2f
-----END PGP SIGNATURE-----
Merge tag 'kvm-s390-next-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: Fixes and features for kvm/next (4.6)
1. also provide the floating point registers via sync regs
2. Separate out intruction vs. data accesses
3. Fix program interrupts in some cases
4. Documentation fixes
5. dirty log improvements for huge guests
This adds real and virtual mode handlers for the H_PUT_TCE_INDIRECT and
H_STUFF_TCE hypercalls for user space emulated devices such as IBMVIO
devices or emulated PCI. These calls allow adding multiple entries
(up to 512) into the TCE table in one call which saves time on
transition between kernel and user space.
The current implementation of kvmppc_h_stuff_tce() allows it to be
executed in both real and virtual modes so there is one helper.
The kvmppc_rm_h_put_tce_indirect() needs to translate the guest address
to the host address and since the translation is different, there are
2 helpers - one for each mode.
This implements the KVM_CAP_PPC_MULTITCE capability. When present,
the kernel will try handling H_PUT_TCE_INDIRECT and H_STUFF_TCE if these
are enabled by the userspace via KVM_CAP_PPC_ENABLE_HCALL.
If they can not be handled by the kernel, they are passed on to
the user space. The user space still has to have an implementation
for these.
Both HV and PR-syle KVM are supported.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Upcoming multi-tce support (H_PUT_TCE_INDIRECT/H_STUFF_TCE hypercalls)
will validate TCE (not to have unexpected bits) and IO address
(to be within the DMA window boundaries).
This introduces helpers to validate TCE and IO address. The helpers are
exported as they compile into vmlinux (to work in realmode) and will be
used later by KVM kernel module in virtual mode.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
SPAPR_TCE_SHIFT is used in few places only and since IOMMU_PAGE_SHIFT_4K
can be easily used instead, remove SPAPR_TCE_SHIFT.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
At the moment pages used for TCE tables (in addition to pages addressed
by TCEs) are not counted in locked_vm counter so a malicious userspace
tool can call ioctl(KVM_CREATE_SPAPR_TCE) as many times as
RLIMIT_NOFILE and lock a lot of memory.
This adds counting for pages used for TCE tables.
This counts the number of pages required for a table plus pages for
the kvmppc_spapr_tce_table struct (TCE table descriptor) itself.
This changes release_spapr_tce_table() to store @npages on stack to
avoid calling kvmppc_stt_npages() in the loop (tiny optimization,
probably).
This does not change the amount of used memory.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
At the moment only spapr_tce_tables updates are protected against races
but not lookups. This fixes missing protection by using RCU for the list.
As lookups also happen in real mode, this uses
list_for_each_entry_lockless() (which is expected not to access any
vmalloc'd memory).
This converts release_spapr_tce_table() to a RCU scheduled handler.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This reworks the existing H_PUT_TCE/H_GET_TCE handlers to have following
patches applied nicer.
This moves the ioba boundaries check to a helper and adds a check for
least bits which have to be zeros.
The patch is pretty mechanical (only check for least ioba bits is added)
so no change in behaviour is expected.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This makes vmalloc_to_phys() public as there will be another user
(KVM in-kernel VFIO acceleration) for it soon. As this new user
can be compiled as a module, this exports the symbol.
As a little optimization, this changes the helper to call
vmalloc_to_pfn() instead of vmalloc_to_page() as the size of the
struct page may not be power-of-two aligned which will make gcc use
multiply instructions instead of shifts.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
ec53500f "kvm: Add VFIO device" added a special KVM pseudo-device which is
used to handle any necessary interactions between KVM and VFIO.
Currently that device is built on x86 and ARM, but not powerpc, although
powerpc does support both KVM and VFIO. This makes things awkward in
userspace
Currently qemu prints an alarming error message if you attempt to use VFIO
and it can't initialize the KVM VFIO device. We don't want to remove the
warning, because lack of the KVM VFIO device could mean coherency problems
on x86. On powerpc, however, the error is harmless but looks disturbing,
and a test based on host architecture in qemu would be ugly, and break if
we do need the KVM VFIO device for something important in future.
There's nothing preventing the KVM VFIO device from being built for
powerpc, so this patch turns it on. It won't actually do anything, since
we don't define any of the arch_*() hooks, but it will make qemu happy and
we can extend it in future if we need to.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
A KVM_GET_DIRTY_LOG ioctl might take a long time.
This can result in fatal signals seemingly being ignored.
Lets bail out during the dirty bit sync, if a fatal signal
is pending.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Dirty log query can take a long time for huge guests.
Holding the mmap_sem for very long times can cause some unwanted
latencies.
Turns out that we do not need to hold the mmap semaphore.
We hold the slots_lock for gfn->hva translation and walk the page
tables with that address, so no need to look at the VMAs. KVM also
holds a reference to the mm, which should prevent other things
going away. During the walk we take the necessary ptl locks.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The interface for adapter mappings was designed with code in mind
that maps each address only once; let's document this.
Otherwise, duplicate mappings are added to the list, which makes
the code ineffective and uses up the limited amount of mapping
needlessly.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Let's properly document KVM_S390_VM_CRYPTO and its attributes.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Let's properly document KVM_S390_VM_TOD and its attributes.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Since commit 9977e886cb ("s390/kernel: lazy restore fpu registers"),
vregs in struct sie_page is unsed. We can safely remove the field and
the definition.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
On instruction-fetch exceptions, we have to forward the PSW by any
valid ilc and correctly use that ilc when injecting the irq. Injection
will already take care of rewinding the PSW if we injected a nullifying
program irq, so we don't need special handling prior to injection.
Until now, autodetection would have guessed an ilc of 0.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
On SIE faults, the ilc cannot be detected automatically, as the icptcode
is 0. The ilc indicated in the program irq will always be 0. Therefore we
have to manually specify the ilc in order to tell the guest which ilen was
used when forwarding the PSW.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Program irq injection during program irq intercepts is the last candidates
that injects nullifying irqs and relies on delivery to do the right thing.
As we should not rely on the icptcode during any delivery (because that
value will not be migrated), let's add a flag, telling prog IRQ delivery
to not rewind the PSW in case of nullifying prog IRQs.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
__extract_prog_irq() is used only once for getting the program check data
in one place. Let's combine it with an injection function to avoid a memset
and to prevent misuse on injection by simplifying the interface to only
have the VCPU as parameter.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Let's use our fresh new function read_guest_instr() to access
guest storage via the correct addressing schema.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
When an instruction is to be fetched, special handling applies to
secondary-space mode and access-register mode. The instruction is to be
fetched from primary space.
We can easily support this by selecting the right asce for translation.
Access registers will never be used during translation, so don't
include them in the interface. As we only want to read from the current
PSW address for now, let's also hide that detail.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We will need special handling when fetching instructions, so let's
introduce new guest access modes GACC_FETCH and GACC_STORE instead
of a write flag. An additional patch will then introduce GACC_IFETCH.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We have to migrate the program irq ilc and someday we will have to
specify the ilc without KVM trying to autodetect the value.
Let's reuse one of the spare fields in our program irq that should
always be set to 0 by user space. Because we also want to make use
of 0 ilcs ("not available"), we need a validity indicator.
If no valid ilc is given, we try to autodetect the ilc via the current
icptcode and icptstatus + parameter and store the valid ilc in the
irq structure.
This has a nice effect: QEMU's making use of KVM_S390_IRQ /
KVM_S390_SET_IRQ_STATE / KVM_S390_GET_IRQ_STATE for migration will
directly migrate the ilc without any changes.
Please note that we use bit 0 as validity and bit 1,2 for the ilc, so
by applying the ilc mask we directly get the ilen which is usually what
we work with.
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
We have some confusion about ilc vs. ilen in our current code. So let's
correctly use the term ilen when dealing with (ilc << 1).
Program irq injection didn't take care of the correct ilc in case of
irqs triggered by EXECUTE functions, let's provide one function
kvm_s390_get_ilen() to take care of all that.
Also, manually specifying in intercept handlers the size of the
instruction (and sometimes overwriting that value for EXECUTE internally)
doesn't make too much sense. So also provide the functions:
- kvm_s390_retry_instr to retry the currently intercepted instruction
- kvm_s390_rewind_psw to rewind the PSW without internal overwrites
- kvm_s390_forward_psw to forward the PSW
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
As we already store the floating point registers in the vector save area
in floating point register format when we don't have MACHINE_HAS_VX, we can
directly expose them to user space using a new sync flag.
The floating point registers will be valid when KVM_SYNC_FPRS is set. The
fpc will also be valid when KVM_SYNC_FPRS is set.
Either KVM_SYNC_FPRS or KVM_SYNC_VRS will be enabled, never both.
Let's also change two positions where we access vrs, making the code easier
to read and one comment superfluous.
Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
If we have MACHINE_HAS_VX, the floating point registers are stored
in the vector register format, event if the guest isn't enabled for vector
registers. So we can allow KVM_SYNC_VRS as soon as MACHINE_HAS_VX is
available.
This can in return be used by user space to support floating point
registers via struct kvm_run when the machine has vector registers.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Different pieces of code checked for vcpu->arch.apic being (non-)NULL,
or used kvm_vcpu_has_lapic (more optimized) or lapic_in_kernel.
Replace everything with lapic_in_kernel's name and kvm_vcpu_has_lapic's
implementation.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do for kvm_cpu_has_pending_timer and kvm_inject_pending_timer_irqs
what the other irq.c routines have been doing.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Usually the in-kernel APIC's existence is checked in the caller. Do not
bother checking it again in lapic.c.
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add host irq information in trace event, so we can better understand
which irq is in posted mode.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use vector-hashing to deliver lowest-priority interrupts for
VT-d posted-interrupts. This patch extends kvm_intr_is_single_vcpu()
to support lowest-priority handling.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use vector-hashing to deliver lowest-priority interrupts, As an
example, modern Intel CPUs in server platform use this method to
handle lowest-priority interrupts.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the interrupt is not single destination any more, we need
to change back IRTE to remapped mode explicitly.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is similar to the existing div_frac function, but it returns the
remainder too. Unlike div_frac, it can be used to implement long
division, e.g. (a << 64) / b for 32-bit a and b.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 29bb45f25f (regmap-mmio: Use native endianness for read/write)
attempted to fix some long standing bugs in the MMIO implementation for
big endian systems caused by duplicate byte swapping in both regmap and
readl()/writel() which affected MIPS systems as when they are in big
endian mode they flip the endianness of all registers in the system, not
just the CPU. MIPS systems had worked around this by declaring regmap
using IPs as little endian which is inaccurate, unfortunately the issue
had not been reported.
Sadly the fix makes things worse rather than better. By changing the
behaviour to match the documentation it caused behaviour changes for
other IPs which broke them and by using the __raw I/O accessors to avoid
the endianness swapping in readl()/writel() it removed some memory
ordering guarantees and could potentially generate unvirtualisable
instructions on some architectures.
Unfortunately sorting out all this mess in any half way sensible fashion
was far too invasive to go in during an -rc cycle so instead let's go
back to the old broken behaviour for v4.5, the better fixes are already
queued for v4.6. This does mean that we keep the broken MIPS DTs for
another release but that seems the least bad way of handling the
situation.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWtIjbAAoJECTWi3JdVIfQs8QH/jNpfio4klDkdlH/KpPZXlrp
FzASbGePNtLqZXFL5WcG//ni3EYdbaiXZIdLBKDx9K4F2ca9FAF8aAnZAZ5uefGx
bnloYpV34DqQwS5f9FrrNsm+YVTTuUIt0dx4ZRGCEdMTzW7i3efs/9eVEITUixK6
U1obTJovAl33bihadsC9hzJVwfOq3H4aFFWc/EFZzbQaU2/so2eiA1dhPr/YErRJ
dMR8drWxpYXuBsrk5T647R0sUw7pA4Zw+WAF032TPQf/1Fy9Vk1/yXbTyccZzFzo
bfupRA/HpeLNZ9cN9z9y3Fa0je4UNbBZh0poB5B773af84NnhX7Ytenjo+peVxI=
=+Q6E
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v4.5-big-endian' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"A single revert back to v4.4 endianness handling.
Commit 29bb45f25f ("regmap-mmio: Use native endianness for
read/write") attempted to fix some long standing bugs in the MMIO
implementation for big endian systems caused by duplicate byte
swapping in both regmap and readl()/writel(). Sadly the fix makes
things worse rather than better, so revert it for now"
* tag 'regmap-fix-v4.5-big-endian' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: mmio: Revert to v4.4 endianness handling
Fix the doubled "started" and tidy up the following sentences.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A few random fixes, mostly coming from the PMU work by Shannon:
- fix for injecting faults coming from the guest's userspace
- cleanup for our CPTR_EL2 accessors (reserved bits)
- fix for a bug impacting perf (user/kernel discrimination)
- fix for a 32bit sysreg handling bug
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWqehPAAoJECPQ0LrRPXpDn6gP/2lrJ9lV5I3MxLzUytmRY8EM
Xl8WnNEQJ0e7oEdb1l6k4DR8D/HefzXpp/YWHY1WdDZSej0b2egro1xsFWdgaOr9
NVGJnoQBlCFqSIf2szml4ftpHXZZ/kMF/EvhtzEL6cpUdqeA/tkS6HoCMQknhCbx
3zOYnNKCGQUkFhTKJUSXB6NcZ/950uqkQxAdCPNUTGg1YzkNfbcgTewqKsmb25Cv
/sOUFmrq2AlnWkdH+QWP0BtNFUX9saOSXvxrABT6nfiXSpUeF6Rprcgi9gdoNhAD
mfE5IFw0dOEo2XThZTchKu3FBSMAkDadvC9yWFr88dr62E6EKFPzY3vHLCA8QoT/
zk5beGSjyWGe7FZZJ4CKdO4EWBZr/WSlSVzOfG4ZBVPUoh2AZcUEhzzrzTezzocO
71/5ZVpQ6O8+Pxwyy85Vd2drf7OZLagGNydNx46RHXrRxl+q0c5vFTVh4Txbd4YU
XNsd+kA62/OYyPHbtVzTzAPPKG7aM8hLzdy8dkTgvuDzWHmxFWhD/HgiMHfFrQqs
WCafvBhTc4375dvwYOupxaU2ncHKvt/zQJtBOw6bEwAIUa5c1IkIUr0i8XgRq6lr
x/YvhFIwiVyXVnrDt3ZSIx79Oajf541uJg7vLFyPBQkcnQlJ6T7oy7qJlqhM0567
Sr6G0/YXa1ccIfmKyeh4
=36kx
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-4.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
KVM/ARM fixes for v4.5-rc2
A few random fixes, mostly coming from the PMU work by Shannon:
- fix for injecting faults coming from the guest's userspace
- cleanup for our CPTR_EL2 accessors (reserved bits)
- fix for a bug impacting perf (user/kernel discrimination)
- fix for a 32bit sysreg handling bug
The first real batch of fixes for this release cycle, so there are a few more
than usual.
Most of these are fixes and tweaks to board support (DT bugfixes, etc). I've
also picked up a couple of small cleanups that seemed innocent enough that
there was little reason to wait (const/__initconst and Kconfig deps).
Quite a bit of the changes on OMAP were due to fixes to no longer write to
rodata from assembly when ARM_KERNMEM_PERMS was enabled, but there were also
other fixes.
Kirkwood had a bunch of gpio fixes for some boards. OMAP had RTC fixes
on OMAP5, and Nomadik had changes to MMC parameters in DT.
All in all, mostly the usual mix of various fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWt8K0AAoJEIwa5zzehBx3HxsQAJMqKkTCr/2hzHTw5V8sTgDf
zrVYEi5WF5IGLR4eON31rF31tbEmQd0bqVlsTLy/yK3hu1gTQwDyqBJqoEQBMQUW
lBShtVERP3mNUm0yICeupaWIhoRqaymlwFKKfq93f+YTn27pEDQ1ImEHuARlbAKa
3zCd91ClRRm3WxrBXj9srt/NyMX7BlcHLjcN1BurpVkR0aciW1B692Lb8LotEP4k
D1CLNZeQEwV+uOHcJsvjEdB/Uh42+dpsxbIAaBW2cFB0iuX3BsnmferoFe0cXmpC
wO5ffvzr0LCMsrUzUsbvn0RgRtMDi2RxrS1n0cXrAVPP6OEeOaMLwGdPUGvQ2EVI
cvCHpw3qXRz7CTERpy7bv0YugIY3vZPukJrne2ZEH7cpA/JLsuqlKm/cOmPRB7gJ
tC2mXlP5jHbbGRiq/Kk3QB7QsKIxHfIalCZMMiRe0ldWSDW6jDpvrv4Nsfzs3etN
LaB0iIm3f5DqOFjjZi+LVUJUGE3M8/3Fs2f70rCdPKDGq9fTqD3+2mK7l80ZaYXG
J3wPKM+9WXGISakS/biQzvYA9iDnbDZCTUxBIM6VlvcHmARJEH3TS5ZjR0eaIb7w
Sqx7e2ufm/2wpGINDoT1qms14cI8ayj7iq+8fDnI3R9XSXxeKk5J5jo9fKnbnDWP
4A4Ai+NYBv/rDWjkg19s
=1iBu
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"The first real batch of fixes for this release cycle, so there are a
few more than usual.
Most of these are fixes and tweaks to board support (DT bugfixes,
etc). I've also picked up a couple of small cleanups that seemed
innocent enough that there was little reason to wait (const/
__initconst and Kconfig deps).
Quite a bit of the changes on OMAP were due to fixes to no longer
write to rodata from assembly when ARM_KERNMEM_PERMS was enabled, but
there were also other fixes.
Kirkwood had a bunch of gpio fixes for some boards. OMAP had RTC
fixes on OMAP5, and Nomadik had changes to MMC parameters in DT.
All in all, mostly the usual mix of various fixes"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (46 commits)
ARM: multi_v7_defconfig: enable DW_WATCHDOG
ARM: nomadik: fix up SD/MMC DT settings
ARM64: tegra: Add chosen node for tegra132 norrin
ARM: realview: use "depends on" instead of "if" after prompt
ARM: tango: use "depends on" instead of "if" after prompt
ARM: tango: use const and __initconst for smp_operations
ARM: realview: use const and __initconst for smp_operations
bus: uniphier-system-bus: revive tristate prompt
arm64: dts: Add missing DMA Abort interrupt to Juno
bus: vexpress-config: Add missing of_node_put
ARM: dts: am57xx: sbc-am57x: correct Eth PHY settings
ARM: dts: am57xx: cl-som-am57x: fix CPSW EMAC pinmux
ARM: dts: am57xx: sbc-am57x: fix UART3 pinmux
ARM: dts: am57xx: cl-som-am57x: update SPI Flash frequency
ARM: dts: am57xx: cl-som-am57x: set HOST mode for USB2
ARM: dts: am57xx: sbc-am57x: fix SB-SOM EEPROM I2C address
ARM: dts: LogicPD Torpedo: Revert Duplicative Entries
ARM: dts: am437x: pixcir_tangoc: use correct flags for irq types
ARM: dts: am4372: fix irq type for arm twd and global timer
ARM: dts: at91: sama5d4 xplained: fix phy0 IRQ type
...
Pull mailbox fixes from Jassi Brar:
- fix getting element from the pcc-channels array by simply indexing
into it
- prevent building mailbox-test driver for archs that don't have IOMEM
* 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
mailbox: Fix dependencies for !HAS_IOMEM archs
mailbox: pcc: fix channel calculation in get_pcc_channel()
Here are some USB fixes for 4.5-rc3.
The usual, xhci fixes for reported issues, combined with some small
gadget driver fixes, and a MAINTAINERS file update. All have been in
linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAla205wACgkQMUfUDdst+ymVcACgksg9fcSsphvjo4ruxN79QrjW
GaEAn3VmLc2WyLf13on9cXmBi8KrS+WY
=sFPR
-----END PGP SIGNATURE-----
Merge tag 'usb-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some USB fixes for 4.5-rc3.
The usual, xhci fixes for reported issues, combined with some small
gadget driver fixes, and a MAINTAINERS file update. All have been in
linux-next with no reported issues"
* tag 'usb-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
xhci: harden xhci_find_next_ext_cap against device removal
xhci: Fix list corruption in urb dequeue at host removal
usb: host: xhci-plat: fix NULL pointer in probe for device tree case
usb: xhci-mtk: fix AHB bus hang up caused by roothubs polling
usb: xhci-mtk: fix bpkts value of LS/HS periodic eps not behind TT
usb: xhci: apply XHCI_PME_STUCK_QUIRK to Intel Broxton-M platforms
usb: xhci: set SSIC port unused only if xhci_suspend succeeds
usb: xhci: add a quirk bit for ssic port unused
usb: xhci: handle both SSIC ports in PME stuck quirk
usb: dwc3: gadget: set the OTG flag in dwc3 gadget driver.
Revert "xhci: don't finish a TD if we get a short-transfer event mid TD"
MAINTAINERS: fix my email address
usb: dwc2: Fix probe problem on bcm2835
Revert "usb: dwc2: Move reset into dwc2_get_hwparams()"
usb: musb: ux500: Fix NULL pointer dereference at system PM
usb: phy: mxs: declare variable with initialized value
usb: phy: msm: fix error handling in probe.
Here are some IIO and staging driver fixes for 4.5-rc3. All of them,
except one, are for IIO drivers, and one is for a speakup driver fix
caused by some earlier patches, to resolve a reported build failure.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAla21D8ACgkQMUfUDdst+ykHiQCgqYizz86HYi3ZR0LRgGwm3Gws
4sEAn2x230E41WXDhLTLT3N6fAaetYl9
=wa/Y
-----END PGP SIGNATURE-----
Merge tag 'staging-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging and IIO driver fixes from Greg KH:
"Here are some IIO and staging driver fixes for 4.5-rc3.
All of them, except one, are for IIO drivers, and one is for a speakup
driver fix caused by some earlier patches, to resolve a reported build
failure"
* tag 'staging-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
Staging: speakup: Fix allyesconfig build on mn10300
iio: dht11: Use boottime
iio: ade7753: avoid uninitialized data
iio: pressure: mpl115: fix temperature offset sign
iio: imu: Fix dependencies for !HAS_IOMEM archs
staging: iio: Fix dependencies for !HAS_IOMEM archs
iio: adc: Fix dependencies for !HAS_IOMEM archs
iio: inkern: fix a NULL dereference on error
iio:adc:ti_am335x_adc Fix buffered mode by identifying as software buffer.
iio: light: acpi-als: Report data as processed
iio: dac: mcp4725: set iio name property in sysfs
iio: add HAS_IOMEM dependency to VF610_ADC
iio: add IIO_TRIGGER dependency to STK8BA50
iio: proximity: lidar: correct return value
iio-light: Use a signed return type for ltr501_match_samp_freq()
Merge fixes from Andrew Morton:
"22 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT
radix-tree: fix oops after radix_tree_iter_retry
MAINTAINERS: trim the file triggers for ABI/API
dax: dirty inode only if required
thp: make deferred_split_scan() work again
mm: replace vma_lock_anon_vma with anon_vma_lock_read/write
ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup
um: asm/page.h: remove the pte_high member from struct pte_t
mm, hugetlb: don't require CMA for runtime gigantic pages
mm/hugetlb: fix gigantic page initialization/allocation
mm: downgrade VM_BUG in isolate_lru_page() to warning
mempolicy: do not try to queue pages from !vma_migratable()
mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress
vmstat: make vmstat_update deferrable
mm, vmstat: make quiet_vmstat lighter
mm/Kconfig: correct description of DEFERRED_STRUCT_PAGE_INIT
memblock: don't mark memblock_phys_mem_size() as __init
dump_stack: avoid potential deadlocks
mm: validate_mm browse_rb SMP race condition
m32r: fix build failure due to SMP and MMU
...
Pull Ceph fixes from Sage Weil:
"We have a few wire protocol compatibility fixes, ports of a few recent
CRUSH mapping changes, and a couple error path fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: MOSDOpReply v7 encoding
libceph: advertise support for TUNABLES5
crush: decode and initialize chooseleaf_stable
crush: add chooseleaf_stable tunable
crush: ensure take bucket value is valid
crush: ensure bucket id is valid before indexing buckets array
ceph: fix snap context leak in error path
ceph: checking for IS_ERR instead of NULL
Pull drm fixes from Dave Airlie:
"Fixes all over the place:
- amdkfd: two static checker fixes
- mst: a bunch of static checker and spec/hw interaction fixes
- amdgpu: fix Iceland hw properly, and some fiji bugs, along with
some write-combining fixes.
- exynos: some regression fixes
- adv7511: fix some EDID reading issues"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (38 commits)
drm/dp/mst: deallocate payload on port destruction
drm/dp/mst: Reverse order of MST enable and clearing VC payload table.
drm/dp/mst: move GUID storage from mgr, port to only mst branch
drm/dp/mst: change MST detection scheme
drm/dp/mst: Calculate MST PBN with 31.32 fixed point
drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil
drm/mst: Add range check for max_payloads during init
drm/mst: Don't ignore the MST PBN self-test result
drm: fix missing reference counting decrease
drm/amdgpu: disable uvd and vce clockgating on Fiji
drm/amdgpu: remove exp hardware support from iceland
drm/amdgpu: load MEC ucode manually on iceland
drm/amdgpu: don't load MEC2 on topaz
drm/amdgpu: drop topaz support from gmc8 module
drm/amdgpu: pull topaz gmc bits into gmc_v7
drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above
drm/amdgpu: iceland use CI based MC IP
drm/amdgpu: move gmc7 support out of CIK dependency
drm/amdgpu/gfx7: enable cp inst/reg error interrupts
drm/amdgpu/gfx8: enable cp inst/reg error interrupts
...
- PM core fix to avoid false-positive warnings generated when
the pm_domain field is cleared for a device that appears to
be bound to a driver (Rafael Wysocki).
- New MCH size workaround quirk for Intel Haswell-ULT (Josh Boyer).
- Fix for an "unused function" compiler warning in the generic
power domains framework (Ulf Hansson).
- Fixup for the ACPI driver for Intel SoCs (acpi-lpss) to set
the PM domain pointer of a device properly in one place that
was overlooked by a recent PM core update (Andy Shevchenko).
- Removal of a redundant function declaration in the ACPI CPPC
core code (Timur Tabi).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJWtTQKAAoJEILEb/54YlRxQboP+gOgld+AZxMTe/88umM92utO
M9PqzJhFGQ1kCUQliJhcgB2eggwXQwb8dKo+THv8Ql8DEzPLERcRMD/jffdRj2vQ
3LQBVkD48nGwu6HOMZ0+e/61AkSkORhq9W7W3wkLcNdUhmetHpoG9+By/5ekf9WR
BEDdvdEOyWHx044GWIgennaJC7wrmzrzxbjiVW1+aD0p/yRd2eFqNGSFcc2U4hoG
gylay5sGJ/XU7v6kYg3mr04w+UXOkR/TrbsKheg0BTw/RJzgO69rZnrpZkjymHb0
98XMNwGu+W2E1lRaKXBqctpuGD5NDOvuZIlqow5JBQXVLZbHE8gkGXJDET72cwRB
wBPwiiYmdMxZrFU7pwP8FeeX39EDlWCQw4ji4DPqk8VAFjKu6HnHRd0K9vgJ46Gh
YEGIQVvnuPB7DC3dPpa+FBaL1uIysZSri6/hTJQ9i7oz+WSoW1oNfatBtMtBFKvs
BZ31QUq2rOf+fjjRiggHmSu/gzHd0ydOKaI3awm1aNJ9wP/J4xj1wBmhrIx9Tbxw
NXu3wyf1LpRA4Z3s8TUIygWMAPhdoBW9TzgrIlzf26SB42CY+1ftub1xkLqTqTmU
G1lQcc9wFdYvwUQVAy1Gq1iuwUM/V/UEKZVcttG8yEe5w0qKDdjARNqPFTGcEF0G
LqLTypyGztWfPOxHs74m
=DFSF
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"These are: a fix for a recently introduced false-positive warnings
about PM domain pointers being changed inappropriately (harmless but
annoying), an MCH size workaround quirk for one more platform, a
compiler warning fix (generic power domains framework), an ACPI LPSS
(Intel SoCs) driver fixup and a cleanup of the ACPI CPPC core code.
Specifics:
- PM core fix to avoid false-positive warnings generated when the
pm_domain field is cleared for a device that appears to be bound to
a driver (Rafael Wysocki).
- New MCH size workaround quirk for Intel Haswell-ULT (Josh Boyer).
- Fix for an "unused function" compiler warning in the generic power
domains framework (Ulf Hansson).
- Fixup for the ACPI driver for Intel SoCs (acpi-lpss) to set the PM
domain pointer of a device properly in one place that was
overlooked by a recent PM core update (Andy Shevchenko).
- Removal of a redundant function declaration in the ACPI CPPC core
code (Timur Tabi)"
* tag 'pm+acpi-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: Avoid false-positive warnings in dev_pm_domain_set()
PM / Domains: Silence compiler warning for an unused function
ACPI / CPPC: remove redundant mbox_send_message() declaration
ACPI / LPSS: set PM domain via helper setter
PNP: Add Haswell-ULT to Intel MCH size workaround
In the current implementation of the EPOLLEXCLUSIVE flag (added for
4.5-rc1), if epoll waiters create different POLL* sets and register them
as exclusive against the same target fd, the current implementation will
stop waking any further waiters once it finds the first idle waiter.
This means that waiters could miss wakeups in certain cases.
For example, when we wake up a pipe for reading we do:
wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM); So if
one epoll set or epfd is added to pipe p with POLLIN and a second set
epfd2 is added to pipe p with POLLRDNORM, only epfd may receive the
wakeup since the current implementation will stop after it finds any
intersection of events with a waiter that is blocked in epoll_wait().
We could potentially address this by requiring all epoll waiters that
are added to p be required to pass the same set of POLL* events. IE the
first EPOLL_CTL_ADD that passes EPOLLEXCLUSIVE establishes the set POLL*
flags to be used by any other epfds that are added as EPOLLEXCLUSIVE.
However, I think it might be somewhat confusing interface as we would
have to reference count the number of users for that set, and so
userspace would have to keep track of that count, or we would need a
more involved interface. It also adds some shared state that we'd have
store somewhere. I don't think anybody will want to bloat
__wait_queue_head for this.
I think what we could do instead, is to simply restrict EPOLLEXCLUSIVE
such that it can only be specified with EPOLLIN and/or EPOLLOUT. So
that way if the wakeup includes 'POLLIN' and not 'POLLOUT', we can stop
once we hit the first idle waiter that specifies the EPOLLIN bit, since
any remaining waiters that only have 'POLLOUT' set wouldn't need to be
woken. Likewise, we can do the same thing if 'POLLOUT' is in the wakeup
bit set and not 'POLLIN'. If both 'POLLOUT' and 'POLLIN' are set in the
wake bit set (there is at least one example of this I saw in fs/pipe.c),
then we just wake the entire exclusive list. Having both 'POLLOUT' and
'POLLIN' both set should not be on any performance critical path, so I
think that's ok (in fs/pipe.c its in pipe_release()). We also continue
to include EPOLLERR and EPOLLHUP by default in any exclusive set. Thus,
the user can specify EPOLLERR and/or EPOLLHUP but is not required to do
so.
Since epoll waiters may be interested in other events as well besides
EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP, these can still be added by
doing a 'dup' call on the target fd and adding that as one normally
would with EPOLL_CTL_ADD. Since I think that the POLLIN and POLLOUT
events are what we are interest in balancing, I think that the 'dup'
thing could perhaps be added to only one of the waiter threads.
However, I think that EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP should be
sufficient for the majority of use-cases.
Since EPOLLEXCLUSIVE is intended to be used with a target fd shared
among multiple epfds, where between 1 and n of the epfds may receive an
event, it does not satisfy the semantics of EPOLLONESHOT where only 1
epfd would get an event. Thus, it is not allowed to be specified in
conjunction with EPOLLEXCLUSIVE.
EPOLL_CTL_MOD is also not allowed if the fd was previously added as
EPOLLEXCLUSIVE. It seems with the limited number of flags to not be as
interesting, but this could be relaxed at some further point.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Tested-by: Madars Vitolins <m@silodev.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@ftp.linux.org.uk>
Cc: Eric Wong <normalperson@yhbt.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Helper radix_tree_iter_retry() resets next_index to the current index.
In following radix_tree_next_slot current chunk size becomes zero. This
isn't checked and it tries to dereference null pointer in slot.
Tagged iterator is fine because retry happens only at slot 0 where tag
bitmask in iter->tags is filled with single bit.
Fixes: 46437f9a55 ("radix-tree: fix race in gang lookup")
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Jeremiah Mahler <jmmahler@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>