If the io_subchannel_driver is unbound from a subchannel it bluntly kills
all I/O on the subchannel and sets the ccw_device state to not operable
before deregistering the ccw_device. However, for online devices we should
set the device offline (disband path groups etc.) which does not happen if
the device is in not oper state.
Simply deregister the ccw device - ccw_device_remove is smart enough to set
the device offline properly. If everything fails call io_subchannel_quiesce
afterwards as a safeguard.
Reported-by: Shalini Chellathurai Saroja <shalini@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Acked-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Get rid of the confusing two-stage translation in a hot path, and only
handle CCQs that we anticipate for the respective command. Any
unexpected value (such as CCQ 97 (rc == 1) for SQBS) should be
considered a severe HW/driver bug, and traced as such.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Immediate retry of EQBS after CCQ 96 means that we potentially misreport
the state of buffers inspected during the first EQBS call.
This occurs when
1. the first EQBS finds all inspected buffers still in the initial state
set by the driver (ie INPUT EMPTY or OUTPUT PRIMED),
2. the EQBS terminates early with CCQ 96, and
3. by the time that the second EQBS comes around, the state of those
previously inspected buffers has changed.
If the state reported by the second EQBS is 'driver-owned', all we know
is that the previous buffers are driver-owned now as well. But we can't
tell if they all have the same state. So for instance
- the second EQBS reports OUTPUT EMPTY, but any number of the previous
buffers could be OUTPUT ERROR by now,
- the second EQBS reports OUTPUT ERROR, but any number of the previous
buffers could be OUTPUT EMPTY by now.
Effectively, this can result in both over- and underreporting of errors.
If the state reported by the second EQBS is 'HW-owned', that doesn't
guarantee that the previous buffers have not been switched to
driver-owned in the mean time. So for instance
- the second EQBS reports INPUT EMPTY, but any number of the previous
buffers could be INPUT PRIMED (or INPUT ERROR) by now.
This would result in failure to process pending work on the queue. If
it's the final check before yielding initiative, this can cause
a (temporary) queue stall due to IRQ avoidance.
Fixes: 25f269f173 ("[S390] qdio: EQBS retry after CCQ 96")
Cc: <stable@vger.kernel.org> #v3.2+
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Only attempt to merge PENDING into EMPTY buffers for devices where
the PENDING state is actually expected (ie. IQD with CQ).
This might speed up the hot path a little bit.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
On an Output queue, both EMPTY and PENDING buffer states imply that the
buffer is ready for completion-processing by the upper-layer drivers.
So for a non-QEBSM Output queue, get_buf_states() merges mixed
batches of PENDING and EMPTY buffers into one large batch of EMPTY
buffers. The upper-layer driver (ie. qeth) later distuingishes PENDING
from EMPTY by inspecting the slsb_state for
QDIO_OUTBUF_STATE_FLAG_PENDING.
But the merge logic in get_buf_states() contains a bug that causes us to
erronously also merge ERROR buffers into such a batch of EMPTY buffers
(ERROR is 0xaf, EMPTY is 0xa1; so ERROR & EMPTY == EMPTY).
Effectively, most outbound ERROR buffers are currently discarded
silently and processed as if they had succeeded.
Note that this affects _all_ non-QEBSM device types, not just IQD with CQ.
Fix it by explicitly spelling out the exact conditions for merging.
For extracting the "get initial state" part out of the loop, this relies
on the fact that get_buf_states() is never called with a count of 0. The
QEBSM path already strictly requires this, and the two callers with
variable 'count' make sure of it.
Fixes: 104ea556ee ("qdio: support asynchronous delivery of storage blocks")
Cc: <stable@vger.kernel.org> #v3.2+
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When determining the buffer count that get_buf_states() should
be queried for, 'count' is capped at 127 buffers.
So the check
q->first_to_check == (q->first_to_check + count) % 128
can be reduced to
count == 0
This helps to emphasize that get_buf_states() is really only
called with count > 0.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
(which we don't support).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEw9DWbcNiT/aowBjO3s9rk8bwL68FAlqdNd4SHGNvaHVja0By
ZWRoYXQuY29tAAoJEN7Pa5PG8C+v+UgP/2LAEaCqk+IH+9m3VzofJYXndawq0RNL
1z5sWsptlkFS6KkDNDty2utHpELLT3ohwLtkAJPcEf3QSRhuHWOcTswM0ZZi9f5k
VI/j1gbz4CrkzrY1wHNUghIxmAo2KuNPcRjxSA7F2fvDgic0jIjFE2u99EvV72F/
Q1UHll6UAoDGz4fGUeyo5aGArmUT9BdrsurH4fUNi4eX2O2wzaqUqKhZaJkjJmiS
M4Glqc5nNkChXE3tPpX2J1WXNXv0m1Q6ysa7pieuiHxUqrlXX1bDUR6ddEtCdCV4
6601TDhw7LKDmuG8Z8ou5J0YmII0VVCYyWxSeJ7gmNxiHqLdFF5s4zG48foMzV1A
qK1kYjkKkYvCMhYWOw3mEFqoma6OR97GWQ20OyEku8DFnK2lY8TAH28Ea1uud9IJ
eztmJSSgHHBjOkJH7cDAJTFLPSWDQSLthLM45dpSQri6TMAUkeqSBbSFI0Kw0GLs
hX5er1MItNzLogDrfWlsegib99ApwVa7lFN1aQz2xxBMXPRmjOEV1SSB4VPwMJWZ
8UXlBEwUCys68RxxuiT6tkMER0Fp3ly16EG7xkILh5zyQeTzh+sVsqMX9+yWDUiV
F94+wJmNDHA+wvgxMXpVPkEDhnsODHSVLv4L1PIMIyVxDJUMwRRCHh3wMQO2ZVVo
J+oDbTJEsT+f
=cJH0
-----END PGP SIGNATURE-----
Merge tag 'vfio-ccw-20180305' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into features
Pull vfio-ccw patches from Cornelia Huck:
A small documentation update, and reject transport mode requests
(which we don't support).
commit 8f50af49f5 ("s390/console: Make preferred console handling
more consistent") created a separate console state for the ascii
console. This has the side effect that we register no tty for the line
mode interface as soon as there an ascii interface as default console.
Under KVM this results in no getty program on the line mode tty if the
guest has both types of interfaces.
As we can have multiple ttys at the same time we do not want to disable
the tty on sclp_line0 under KVM. So instead of checking for the console
mode, we now check for the presence of the sclp line mode interface. As
z/VM multiplexes the line mode interface on the 32xx screen we continue
to disable the line mode tty for the z/VM case.
CC: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Fixes: 8f50af49f5 ("s390/console: Make preferred console handling more consistent")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The Linux Virtual Terminal (VT) layer provides a default keymap
which is compiled when VT layer is enabled. But at the same time
we are also compiling the EBCDIC keymap and this causes the linker
to complain.
So let's rename the EBCDIC keymap variables to prevent linker
conflict.
Signed-off-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <f670a2698d2372e1e990c48a29334ffe894804b1.1519315352.git.alifm@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
vfio-ccw only supports command mode for channel programs, not transport
mode. User space is supposed to already take care of that and pass us
command-mode ORBs only, but better make sure and return an error to
the caller instead of trying to process tcws as ccws.
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Set the XRC timestamps even if XRC is not supported by the storage server
to help debugging the storage server firmware.
Do not advertise valid time stamps if the system time could not be
obtained.
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reported by smatch that the usage of cqr->block is inconsistent.
The sanity check is not needed because _dasd_requeue_request already
checks for a valid cqr->block pointer and all referenced ERP requests
have a valid cqr->block pointer as well since it is copied during ERP
process.
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Change the size of the sclp mask to 64 bits.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Qemu before version 2.11 does not implement the architecture correctly,
and does not allow for a mask size of size different than 4.
This patch introduces a compatibility mode for such systems, forcing
the mask sizes to 4.
Since the mask size is currently still 4 anyway, this patch should have
no impact whatsoever by itself, but it will be needed when the mask size
is increased to 64 bits in the next patch.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Switch the layout of the event masks to be a generic buffer, and
implement accessors to retrieve the values of the masks.
This will be needed in the next patches, where we will eventually switch
the mask size to 64 bits.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Replace hardcoded instances where 32 or unsigned int (or long) is used
for SCLP event masks, and replace with sizeof(sccb_mask_t) and
sccb_mask_t respectively.
This improves readability and prepares for when we will increase
sccb_mask_t to 64 bits.
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Make the behavior in case of constant IFCC/CCC errors configurable.
Add a sysfs attribute to switch between path disabled after threshold
exceeded (default) and message only.
Reviewed-by: Jan Hoeppner <hoeppner@linux.vnet.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add functions to retrieve data associated with an SCLP Store Data
entity. Automatically retrieve data for the "config" entity during
boot and make that data available to user-space via sysfs:
/sys/firmware/sclp_sd/config/data
Reading from this file will return config data contents.
/sys/firmware/sclp_sd/config/reload
Writing to this file will cause the latest version of data
related to the config entity to be read from the SCLP interface.
Generate a KOBJ_CHANGE whenever new data is retrieved.
Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When we terminate driver I/O (because we need to stop using a certain
channel path) we also need to ensure that a timer (which may have been
set up using ccw_device_start_timeout) is cleared.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When a timeout occurs for users of ccw_device_start_timeout
we will stop the IO and call the drivers int handler with
the irb pointer set to ERR_PTR(-ETIMEDOUT). Sometimes
however we'd set the irb pointer to ERR_PTR(-EIO) which is
not intended. Just set the correct value in all codepaths.
Reported-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
There are cases a device driver can't start IO because the device is
currently in use by cio. In this case the device driver is notified
when the device is usable again.
Using ccw_device_start_timeout we would set the timeout (and change
an existing timeout) before we test for internal usage. Worst case
this could lead to an unexpected timer deletion.
Fix this by setting the timeout after we test for internal usage.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Internal DASD device driver I/O such as query host access count or
path verification is started using the _sleep_on() function.
To mark a request as started or ended the callback_data is set to either
DASD_SLEEPON_START_TAG or DASD_SLEEPON_END_TAG.
In cases where the request has to be stopped unconditionally the status is
set to DASD_SLEEPON_END_TAG as well which leads to immediate clearing of
the request.
But the request might still be on a device request queue for normal
operation which might lead to a panic because of a BUG() statement in
__dasd_device_process_final_queue() or a list corruption of the device
request queue.
Fix by removing the setting of DASD_SLEEPON_END_TAG in the
dasd_cancel_req() and dasd_generic_requeue_all_requests() functions and
ensure that the request is not deleted in the requeue function.
Trigger the device tasklet in the requeue function and let the normal
processing cleanup the request.
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This includes a bugfix for virtio 9p fs.
It also fixes hybernation for s390 guests with virtio devices.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJahJ2jAAoJECgfDbjSjVRpGbsIAKvK50iQK/5Qe0X78DCv/9pW
gVkW29bAKG8D8JcI/EViLFW3IgeDM1a2fcbCoSiOTzydAf6nMTI0vZqURopQbXKC
teJw7PwHjaZ9Y3IL/mzMODrhZvZrl9iI2yAQoZqoeCeaX76t5k8kYB35U4Uuiw7Y
gKWOpuOPEZx2mKrPCmIN2X0VrETJz122bNyb5DB+V4oLAx/9PolGGiGBmyu61pv/
Fx1PQ6at8/M+74tFeeFwKbuUf5GmdanqPVCZlJJPKa2acaRtBFhI01OhBMIxAUYj
9+1dzp5E4KjmGbz7Fd3dsleRLCV/q4E8gDWLbZVx4p2vVsp7edxk/29kcSVrhlE=
=c8q9
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
"This includes a bugfix for virtio 9p fs. It also fixes hybernation for
s390 guests with virtio devices"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio/s390: implement PM operations for virtio_ccw
9p/trans_virtio: discard zero-length reply
Suspend/Resume to/from disk currently fails. Let us wire
up the necessary callbacks. This is mostly just forwarding
the requests to the virtio drivers. The only thing that
has to be done in virtio_ccw itself is to re-set the
virtio revision.
Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20171207141102.70190-2-borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[CH: merged <20171218083706.223836-1-borntraeger@de.ibm.com> to fix
!CONFIG_PM configs]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:
for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
done
with de-mangling cleanups yet to come.
NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do. But they keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.
The next patch from Al will sort out the final differences, and we
should be all done.
Scripted-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ARM:
- Include icache invalidation optimizations, improving VM startup time
- Support for forwarded level-triggered interrupts, improving
performance for timers and passthrough platform devices
- A small fix for power-management notifiers, and some cosmetic changes
PPC:
- Add MMIO emulation for vector loads and stores
- Allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without
requiring the complex thread synchronization of older CPU versions
- Improve the handling of escalation interrupts with the XIVE interrupt
controller
- Support decrement register migration
- Various cleanups and bugfixes.
s390:
- Cornelia Huck passed maintainership to Janosch Frank
- Exitless interrupts for emulated devices
- Cleanup of cpuflag handling
- kvm_stat counter improvements
- VSIE improvements
- mm cleanup
x86:
- Hypervisor part of SEV
- UMIP, RDPID, and MSR_SMI_COUNT emulation
- Paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit
- Allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more AVX512
features
- Show vcpu id in its anonymous inode name
- Many fixes and cleanups
- Per-VCPU MSR bitmaps (already merged through x86/pti branch)
- Stable KVM clock when nesting on Hyper-V (merged through x86/hyperv)
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJafvMtAAoJEED/6hsPKofo6YcH/Rzf2RmshrWaC3q82yfIV0Qz
Z8N8yJHSaSdc3Jo6cmiVj0zelwAxdQcyjwlT7vxt5SL2yML+/Q0st9Hc3EgGGXPm
Il99eJEl+2MYpZgYZqV8ff3mHS5s5Jms+7BITAeh6Rgt+DyNbykEAvzt+MCHK9cP
xtsIZQlvRF7HIrpOlaRzOPp3sK2/MDZJ1RBE7wYItK3CUAmsHim/LVYKzZkRTij3
/9b4LP1yMMbziG+Yxt1o682EwJB5YIat6fmDG9uFeEVI5rWWN7WFubqs8gCjYy/p
FX+BjpOdgTRnX+1m9GIj0Jlc/HKMXryDfSZS07Zy4FbGEwSiI5SfKECub4mDhuE=
=C/uD
-----END PGP SIGNATURE-----
Merge tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Radim Krčmář:
"ARM:
- icache invalidation optimizations, improving VM startup time
- support for forwarded level-triggered interrupts, improving
performance for timers and passthrough platform devices
- a small fix for power-management notifiers, and some cosmetic
changes
PPC:
- add MMIO emulation for vector loads and stores
- allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without
requiring the complex thread synchronization of older CPU versions
- improve the handling of escalation interrupts with the XIVE
interrupt controller
- support decrement register migration
- various cleanups and bugfixes.
s390:
- Cornelia Huck passed maintainership to Janosch Frank
- exitless interrupts for emulated devices
- cleanup of cpuflag handling
- kvm_stat counter improvements
- VSIE improvements
- mm cleanup
x86:
- hypervisor part of SEV
- UMIP, RDPID, and MSR_SMI_COUNT emulation
- paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit
- allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more
AVX512 features
- show vcpu id in its anonymous inode name
- many fixes and cleanups
- per-VCPU MSR bitmaps (already merged through x86/pti branch)
- stable KVM clock when nesting on Hyper-V (merged through
x86/hyperv)"
* tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (197 commits)
KVM: PPC: Book3S: Add MMIO emulation for VMX instructions
KVM: PPC: Book3S HV: Branch inside feature section
KVM: PPC: Book3S HV: Make HPT resizing work on POWER9
KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code
KVM: PPC: Book3S PR: Fix broken select due to misspelling
KVM: x86: don't forget vcpu_put() in kvm_arch_vcpu_ioctl_set_sregs()
KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled
KVM: PPC: Book3S HV: Drop locks before reading guest memory
kvm: x86: remove efer_reload entry in kvm_vcpu_stat
KVM: x86: AMD Processor Topology Information
x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested
kvm: embed vcpu id to dentry of vcpu anon inode
kvm: Map PFN-type memory regions as writable (if possible)
x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n
KVM: arm/arm64: Fixup userspace irqchip static key optimization
KVM: arm/arm64: Fix userspace_irqchip_in_use counting
KVM: arm/arm64: Fix incorrect timer_is_pending logic
MAINTAINERS: update KVM/s390 maintainers
MAINTAINERS: add Halil as additional vfio-ccw maintainer
MAINTAINERS: add David as a reviewer for KVM/s390
...
Pull networking fixes from David Miller:
1) Make allocations less aggressive in x_tables, from Minchal Hocko.
2) Fix netfilter flowtable Kconfig deps, from Pablo Neira Ayuso.
3) Fix connection loss problems in rtlwifi, from Larry Finger.
4) Correct DRAM dump length for some chips in ath10k driver, from Yu
Wang.
5) Fix ABORT handling in rxrpc, from David Howells.
6) Add SPDX tags to Sun networking drivers, from Shannon Nelson.
7) Some ipv6 onlink handling fixes, from David Ahern.
8) Netem packet scheduler interval calcualtion fix from Md. Islam.
9) Don't put crypto buffers on-stack in rxrpc, from David Howells.
10) Fix handling of error non-delivery status in netlink multicast
delivery over multiple namespaces, from Nicolas Dichtel.
11) Missing xdp flush in tuntap driver, from Jason Wang.
12) Synchonize RDS protocol netns/module teardown with rds object
management, from Sowini Varadhan.
13) Add nospec annotations to mpls, from Dan Williams.
14) Fix SKB truesize handling in TIPC, from Hoang Le.
15) Interrupt masking fixes in stammc from Niklas Cassel.
16) Don't allow ptr_ring objects to be sized outside of kmalloc's
limits, from Jason Wang.
17) Don't allow SCTP chunks to be built which will have a length
exceeding the chunk header's 16-bit length field, from Alexey
Kodanev.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (82 commits)
ibmvnic: Remove skb->protocol checks in ibmvnic_xmit
bpf: fix rlimit in reuseport net selftest
sctp: verify size of a new chunk in _sctp_make_chunk()
s390/qeth: fix SETIP command handling
s390/qeth: fix underestimated count of buffer elements
ptr_ring: try vmalloc() when kmalloc() fails
ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE
net: stmmac: remove redundant enable of PMT irq
net: stmmac: rename GMAC_INT_DEFAULT_MASK for dwmac4
net: stmmac: discard disabled flags in interrupt status register
ibmvnic: Reset long term map ID counter
tools/libbpf: handle issues with bpf ELF objects containing .eh_frames
selftests/bpf: add selftest that use test_libbpf_open
selftests/bpf: add test program for loading BPF ELF files
tools/libbpf: improve the pr_debug statements to contain section numbers
bpf: Sync kernel ABI header with tooling header for bpf_common.h
net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT
net: thunder: change q_len's type to handle max ring size
tipc: fix skb truesize/datasize ratio control
net/sched: cls_u32: fix cls_u32 on filter replace
...
send_control_data() applies some special handling to SETIP v4 IPA
commands. But current code parses *all* command types for the SETIP
command code. Limit the command code check to IPA commands.
Fixes: 5b54e16f1a ("qeth: do not spin for SETIP ip assist command")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For a memory range/skb where the last byte falls onto a page boundary
(ie. 'end' is of the form xxx...xxx001), the PFN_UP() part of the
calculation currently doesn't round up to the next PFN due to an
off-by-one error.
Thus qeth believes that the skb occupies one page less than it
actually does, and may select a IO buffer that doesn't have enough spare
buffer elements to fit all of the skb's data.
HW detects this as a malformed buffer descriptor, and raises an
exception which then triggers device recovery.
Fixes: 2863c61334 ("qeth: refactor calculation of SBALE count")
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull s390 updates from Heiko Carstens:
"The main thing in this merge is the defense for the Spectre
vulnerabilities. But there are other updates as well, the changes in
more detail:
- An s390 specific implementation of the array_index_mask_nospec
function to the defense against spectre v1
- Two patches to utilize the new PPA-12/PPA-13 instructions to run
the kernel and/or user space with reduced branch predicton.
- The s390 variant of the 'retpoline' spectre v2 defense called
'expoline'. There is no return instruction for s390, instead an
indirect branch is used for function return
The s390 defense mechanism for indirect branches works by using an
execute-type instruction with the indirect branch as the target of
the execute. In effect that turns off the prediction for the
indirect branch.
- Scrub registers in entry.S that contain user controlled values to
prevent the speculative use of these values.
- Re-add the second parameter for the s390 specific runtime
instrumentation system call and move the header file to uapi. The
second parameter will continue to do nothing but older kernel
versions only accepted valid real-time signal numbers. The details
will be documented in the man-page for the system call.
- Corrections and improvements for the s390 specific documentation
- Add a line to /proc/sysinfo to display the CPU model dependent
license-internal-code identifier
- A header file include fix for eadm.
- An error message fix in the kprobes code.
- The removal of an outdated ARCH_xxx select statement"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/kconfig: Remove ARCH_WANTS_PROT_NUMA_PROT_NONE select
s390: introduce execute-trampolines for branches
s390: run user space and KVM guests with modified branch prediction
s390: add options to change branch prediction behaviour for the kernel
s390/alternative: use a copy of the facility bit mask
s390: add optimized array_index_mask_nospec
s390: scrub registers on kernel entry and KVM exit
s390/cio: fix kernel-doc usage
s390/runtime_instrumentation: re-add signum system call parameter
s390/cpum_cf: correct counter number of LAST_HOST_TRANSLATIONS
s390/kprobes: Fix %p uses in error messages
s390/runtime instrumentation: provide uapi header file
s390/sysinfo: add and display licensed internal code identifier
s390/docs: reword airq section
s390/docs: mention subchannel types
s390/cmf: fix kerneldoc
s390/eadm: fix CONFIG_BLOCK include dependency
Add CONFIG_EXPOLINE to enable the use of the new -mindirect-branch= and
-mfunction_return= compiler options to create a kernel fortified against
the specte v2 attack.
With CONFIG_EXPOLINE=y all indirect branches will be issued with an
execute type instruction. For z10 or newer the EXRL instruction will
be used, for older machines the EX instruction. The typical indirect
call
basr %r14,%r1
is replaced with a PC relative call to a new thunk
brasl %r14,__s390x_indirect_jump_r1
The thunk contains the EXRL/EX instruction to the indirect branch
__s390x_indirect_jump_r1:
exrl 0,0f
j .
0: br %r1
The detour via the execute type instruction has a performance impact.
To get rid of the detour the new kernel parameter "nospectre_v2" and
"spectre_v2=[on,off,auto]" can be used. If the parameter is specified
the kernel and module code will be patched at runtime.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* Require struct page by default for filesystem DAX to remove a number of
surprising failure cases. This includes failures with direct I/O, gdb and
fork(2).
* Add support for the new Platform Capabilities Structure added to the NFIT in
ACPI 6.2a. This new table tells us whether the platform supports flushing
of CPU and memory controller caches on unexpected power loss events.
* Revamp vmem_altmap and dev_pagemap handling to clean up code and better
support future future PCI P2P uses.
* Deprecate the ND_IOCTL_SMART_THRESHOLD command whose payload has become
out-of-sync with recent versions of the NVDIMM_FAMILY_INTEL spec, and
instead rely on the generic ND_CMD_CALL approach used by the two other IOCTL
families, NVDIMM_FAMILY_{HPE,MSFT}.
* Enhance nfit_test so we can test some of the new things added in version 1.6
of the DSM specification. This includes testing firmware download and
simulating the Last Shutdown State (LSS) status.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJaeOg0AAoJEJ/BjXdf9fLBAFoQAI/IgcgJ2h9lfEpgjBRTC44t
2p8dxwT1Ofw3Y1aR/tI8nYRXjRtAGuP4UIeRVnb1CL/N7PagJyoMGU+6hmzg+ptY
c7cEDvw6nZOhrFwXx/xn7R53sYG8zH+UE6+jTR/PP/G4mQJfFCg4iF9R72Y7z0n7
aurf82Kz137NPUy6dNr4V9bmPMJWAaOci9WOj5SKddR5ZSNbjoxylTwQRvre5y4r
7HQTScEkirABOdSf1JoXTSUXCH/RC9UFFXR03ScHstGb1HjCj3KdcicVc50Q++Ub
qsEudhE6i44PEW1Hh4Qkg6hjHMEa8qHP+ShBuRuVaUmlghYTQn66niJAYLZilwdz
EVjE7vR+toHA5g3YCalEmYVutUEhIDkh/xfpd7vM6ZorUGJy95a2elEJs2fHBffC
gEhnCip7FROPcK5RDNUM8hBgnG/q5wwWPQMKY+6rKDZQx3mXssCrKp2Vlx7kBwMG
rpblkEpYjPonbLEHxsSU8yTg9Uq55ciIWgnOToffcjZvjbihi8WUVlHcwHUMPf/o
DWElg+4qmG0Sdd4S2NeAGwTl1Ewrf2RrtUGMjHtH4OUFs1wo6ZmfrxFzzMfoZ1Od
ko/s65v4uwtTzECh2o+XQaNsReR5YETXxmA40N/Jpo7/7twABIoZ/ASvj/3ZBYj+
sie+u2rTod8/gQWSfHpJ
=MIMX
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Ross Zwisler:
- Require struct page by default for filesystem DAX to remove a number
of surprising failure cases. This includes failures with direct I/O,
gdb and fork(2).
- Add support for the new Platform Capabilities Structure added to the
NFIT in ACPI 6.2a. This new table tells us whether the platform
supports flushing of CPU and memory controller caches on unexpected
power loss events.
- Revamp vmem_altmap and dev_pagemap handling to clean up code and
better support future future PCI P2P uses.
- Deprecate the ND_IOCTL_SMART_THRESHOLD command whose payload has
become out-of-sync with recent versions of the NVDIMM_FAMILY_INTEL
spec, and instead rely on the generic ND_CMD_CALL approach used by
the two other IOCTL families, NVDIMM_FAMILY_{HPE,MSFT}.
- Enhance nfit_test so we can test some of the new things added in
version 1.6 of the DSM specification. This includes testing firmware
download and simulating the Last Shutdown State (LSS) status.
* tag 'libnvdimm-for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (37 commits)
libnvdimm, namespace: remove redundant initialization of 'nd_mapping'
acpi, nfit: fix register dimm error handling
libnvdimm, namespace: make min namespace size 4K
tools/testing/nvdimm: force nfit_test to depend on instrumented modules
libnvdimm/nfit_test: adding support for unit testing enable LSS status
libnvdimm/nfit_test: add firmware download emulation
nfit-test: Add platform cap support from ACPI 6.2a to test
libnvdimm: expose platform persistence attribute for nd_region
acpi: nfit: add persistent memory control flag for nd_region
acpi: nfit: Add support for detect platform CPU cache flush on power loss
device-dax: Fix trailing semicolon
libnvdimm, btt: fix uninitialized err_lock
dax: require 'struct page' by default for filesystem dax
ext2: auto disable dax instead of failing mount
ext4: auto disable dax instead of failing mount
mm, dax: introduce pfn_t_special()
mm: Fix devm_memremap_pages() collision handling
mm: Fix memory size alignment in devm_memremap_pages_release()
memremap: merge find_dev_pagemap into get_dev_pagemap
memremap: change devm_memremap_pages interface to use struct dev_pagemap
...
Fix the kernel-doc usage in cio to get rid of (W=1) build warnings like:
drivers/s390/cio/cio.c:1068: warning: No description found for parameter 'sch'
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Make sure we use proper Return sections, and make the output
for cmf_enable() less odd.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Here is the set of "big" driver core patches for 4.16-rc1.
The majority of the work here is in the firmware subsystem, with reworks
to try to attempt to make the code easier to handle in the long run, but
no functional change. There's also some tree-wide sysfs attribute
fixups with lots of acks from the various subsystem maintainers, as well
as a handful of other normal fixes and changes.
And finally, some license cleanups for the driver core and sysfs code.
All have been in linux-next for a while with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWnLvPw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynNzACgkzjPoBytJWbpWFt6SR6L33/u4kEAnRFvVCGL
s6ygQPQhZIjKk2Lxa2hC
=Zihy
-----END PGP SIGNATURE-----
Merge tag 'driver-core-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of "big" driver core patches for 4.16-rc1.
The majority of the work here is in the firmware subsystem, with
reworks to try to attempt to make the code easier to handle in the
long run, but no functional change. There's also some tree-wide sysfs
attribute fixups with lots of acks from the various subsystem
maintainers, as well as a handful of other normal fixes and changes.
And finally, some license cleanups for the driver core and sysfs code.
All have been in linux-next for a while with no reported issues"
* tag 'driver-core-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (48 commits)
device property: Define type of PROPERTY_ENRTY_*() macros
device property: Reuse property_entry_free_data()
device property: Move property_entry_free_data() upper
firmware: Fix up docs referring to FIRMWARE_IN_KERNEL
firmware: Drop FIRMWARE_IN_KERNEL Kconfig option
USB: serial: keyspan: Drop firmware Kconfig options
sysfs: remove DEBUG defines
sysfs: use SPDX identifiers
drivers: base: add coredump driver ops
sysfs: add attribute specification for /sysfs/devices/.../coredump
test_firmware: fix missing unlock on error in config_num_requests_store()
test_firmware: make local symbol test_fw_config static
sysfs: turn WARN() into pr_warn()
firmware: Fix a typo in fallback-mechanisms.rst
treewide: Use DEVICE_ATTR_WO
treewide: Use DEVICE_ATTR_RO
treewide: Use DEVICE_ATTR_RW
sysfs.h: Use octal permissions
component: add debugfs support
bus: simple-pm-bus: convert bool SIMPLE_PM_BUS to tristate
...
Pull s390 updates from Martin Schwidefsky:
"Bug fixes, small improvements and one notable change: the system call
table and the unistd.h header are now generated automatically with a
shell script from a text file"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/decompressor: discard __ksymtab and .eh_frame sections
s390: fix handling of -1 in set{,fs}[gu]id16 syscalls
s390/tools: generate header files in arch/s390/include/generated/
s390/syscalls: use generated syscall_table.h and unistd.h header files
s390/syscalls: add Makefile to generate system call header files
s390/syscalls: add syscalltbl script
s390/syscalls: add system call table
s390/decompressor: swap .text and .rodata.compressed sections
s390/sclp: fix .data section specification
s390/ipl: avoid usage of __section(.data)
s390/head: replace hard coded values with constants
s390/disassembler: add generated gen_opcode_table tool to .gitignore
s390: remove bogus system call table entries
s390/kprobes: remove duplicate includes
s390/dasd: Remove dead return code checks
s390/dasd: Simplify code
s390/vdso: revise CFI annotations of vDSO functions
s390/kernel: emit CFI data in .debug_frame and discard .eh_frame sections
Pull networking updates from David Miller:
1) Significantly shrink the core networking routing structures. Result
of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf
2) Add netdevsim driver for testing various offloads, from Jakub
Kicinski.
3) Support cross-chip FDB operations in DSA, from Vivien Didelot.
4) Add a 2nd listener hash table for TCP, similar to what was done for
UDP. From Martin KaFai Lau.
5) Add eBPF based queue selection to tun, from Jason Wang.
6) Lockless qdisc support, from John Fastabend.
7) SCTP stream interleave support, from Xin Long.
8) Smoother TCP receive autotuning, from Eric Dumazet.
9) Lots of erspan tunneling enhancements, from William Tu.
10) Add true function call support to BPF, from Alexei Starovoitov.
11) Add explicit support for GRO HW offloading, from Michael Chan.
12) Support extack generation in more netlink subsystems. From Alexander
Aring, Quentin Monnet, and Jakub Kicinski.
13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From
Russell King.
14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso.
15) Many improvements and simplifications to the NFP driver bpf JIT,
from Jakub Kicinski.
16) Support for ipv6 non-equal cost multipath routing, from Ido
Schimmel.
17) Add resource abstration to devlink, from Arkadi Sharshevsky.
18) Packet scheduler classifier shared filter block support, from Jiri
Pirko.
19) Avoid locking in act_csum, from Davide Caratti.
20) devinet_ioctl() simplifications from Al viro.
21) More TCP bpf improvements from Lawrence Brakmo.
22) Add support for onlink ipv6 route flag, similar to ipv4, from David
Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits)
tls: Add support for encryption using async offload accelerator
ip6mr: fix stale iterator
net/sched: kconfig: Remove blank help texts
openvswitch: meter: Use 64-bit arithmetic instead of 32-bit
tcp_nv: fix potential integer overflow in tcpnv_acked
r8169: fix RTL8168EP take too long to complete driver initialization.
qmi_wwan: Add support for Quectel EP06
rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK
ipmr: Fix ptrdiff_t print formatting
ibmvnic: Wait for device response when changing MAC
qlcnic: fix deadlock bug
tcp: release sk_frag.page in tcp_disconnect
ipv4: Get the address of interface correctly.
net_sched: gen_estimator: fix lockdep splat
net: macb: Handle HRESP error
net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring
ipv6: addrconf: break critical section in addrconf_verify_rtnl()
ipv6: change route cache aging logic
i40e/i40evf: Update DESC_NEEDED value to reflect larger value
bnxt_en: cleanup DIM work on device shutdown
...
Pull poll annotations from Al Viro:
"This introduces a __bitwise type for POLL### bitmap, and propagates
the annotations through the tree. Most of that stuff is as simple as
'make ->poll() instances return __poll_t and do the same to local
variables used to hold the future return value'.
Some of the obvious brainos found in process are fixed (e.g. POLLIN
misspelled as POLL_IN). At that point the amount of sparse warnings is
low and most of them are for genuine bugs - e.g. ->poll() instance
deciding to return -EINVAL instead of a bitmap. I hadn't touched those
in this series - it's large enough as it is.
Another problem it has caught was eventpoll() ABI mess; select.c and
eventpoll.c assumed that corresponding POLL### and EPOLL### were
equal. That's true for some, but not all of them - EPOLL### are
arch-independent, but POLL### are not.
The last commit in this series separates userland POLL### values from
the (now arch-independent) kernel-side ones, converting between them
in the few places where they are copied to/from userland. AFAICS, this
is the least disruptive fix preserving poll(2) ABI and making epoll()
work on all architectures.
As it is, it's simply broken on sparc - try to give it EPOLLWRNORM and
it will trigger only on what would've triggered EPOLLWRBAND on other
architectures. EPOLLWRBAND and EPOLLRDHUP, OTOH, are never triggered
at all on sparc. With this patch they should work consistently on all
architectures"
* 'misc.poll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
make kernel-side POLL... arch-independent
eventpoll: no need to mask the result of epi_item_poll() again
eventpoll: constify struct epoll_event pointers
debugging printk in sg_poll() uses %x to print POLL... bitmap
annotate poll(2) guts
9p: untangle ->poll() mess
->si_band gets POLL... bitmap stored into a user-visible long field
ring_buffer_poll_wait() return value used as return value of ->poll()
the rest of drivers/*: annotate ->poll() instances
media: annotate ->poll() instances
fs: annotate ->poll() instances
ipc, kernel, mm: annotate ->poll() instances
net: annotate ->poll() instances
apparmor: annotate ->poll() instances
tomoyo: annotate ->poll() instances
sound: annotate ->poll() instances
acpi: annotate ->poll() instances
crypto: annotate ->poll() instances
block: annotate ->poll() instances
x86: annotate ->poll() instances
...
The GISA format facility is required by the host to be able to process
a format-1 GISA. If not available, the used GISA format will be format-0.
All format-1 related extension will not be available in this case.
Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
"__section(data)" has to be "__section(.data)". __section(data)
produces extra "data" section in addition to ".data" section.
Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In dasd_term_IO() ccw_device_clear() is called and the return code is
checked afterwards. Though, the return codes -EIO and -EBUSY will never
be returned and can therefore be removed from the check.
In dasd_start_IO() the return code of either ccw_device_tm_start() or
ccw_device_start() is checked. However, neither of them returns
-ETIMEDOUT. Remove that check as well.
Signed-off-by: Jan Höppner <hoeppner@linux.vnet.ibm.com>
Acked-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Use 'seq_printf(m, "...%*phN...")' instead of duplicating its
implementation.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If a dax buffer from a device that does not map pages is passed to
read(2) or write(2) as a target for direct-I/O it triggers SIGBUS. If
gdb attempts to examine the contents of a dax buffer from a device that
does not map pages it triggers SIGBUS. If fork(2) is called on a process
with a dax mapping from a device that does not map pages it triggers
SIGBUS. 'struct page' is required otherwise several kernel code paths
break in surprising ways. Disable filesystem-dax on devices that do not
map pages.
In addition to needing pfn_to_page() to be valid we also require devmap
pages. We need this to detect dax pages in the get_user_pages_fast()
path and so that we can stop managing the VM_MIXEDMAP flag. For DAX
drivers that have not supported get_user_pages() to date we allow them
to opt-in to supporting DAX with the CONFIG_FS_DAX_LIMITED configuration
option which requires ->direct_access() to return pfn_t_special() pfns.
This leaves DAX support in brd disabled and scheduled for removal.
Note that when the initial dax support was being merged a few years back
there was concern that struct page was unsuitable for use with next
generation persistent memory devices. The theoretical concern was that
struct page access, being such a hotly used data structure in the
kernel, would lead to media wear out. While that was a reasonable
conservative starting position it has not held true in practice. We have
long since committed to using devm_memremap_pages() to support higher
order kernel functionality that needs get_user_pages() and
pfn_to_page().
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In support of removing the VM_MIXEDMAP indication from DAX VMAs,
introduce pfn_t_special() for drivers to indicate that _PAGE_SPECIAL
should be used for DAX ptes. This also helps identify drivers like
dccssblk that only want to use DAX in a read-only fashion without
get_user_pages() support.
Ideally we could delete axonram and dcssblk DAX support, but if we need
to keep it better make it explicit that axonram and dcssblk only support
a sub-set of DAX due to missing _PAGE_DEVMAP support.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>