Commit Graph

855325 Commits

Author SHA1 Message Date
Mauro Carvalho Chehab
9d42afbe6b docs: pdf: add all Documentation/*/index.rst to PDF output
Currently, all index files should be manually added to the
latex_documents array at conf.py.

While this allows fine-tuning some LaTeX specific things, like
the name of the output file and the name of the document, it
is not uncommon to forget adding new documents there.

So, add a logic that will seek for all Documentation/*/index.rst.
If the index is not yet at latex_documents, it includes using
a reasonable default.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-17 06:57:51 -03:00
Mauro Carvalho Chehab
bff9e34c67 docs: fix broken doc references due to renames
Some files got renamed but probably due to some merge conflicts,
a few references still point to the old locations.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-17 06:57:51 -03:00
Mauro Carvalho Chehab
08a69058c0 docs: power: add it to to the main documentation index
The power docs are orphaned at the documentation body.

While it could likely be moved to be inside some guide, I'm opting to just
adding it to the main index.rst, removing the :orphan: and adding the SPDX
header.

The reason is similar to what it was done for other driver-specific
subsystems: the docs there contain a mix of Kernelspace, uAPI and
admin-guide. So, better to keep them on its own directory,
while the docs there are not properly classified.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
2019-07-17 06:57:51 -03:00
Mauro Carvalho Chehab
4d2e26a38f docs: powerpc: convert docs to ReST and rename to *.rst
Convert docs to ReST and add them to the arch-specific
book.

The conversion here was trivial, as almost every file there
was already using an elegant format close to ReST standard.

The changes were mostly to mark literal blocks and add a few
missing section title identifiers.

One note with regards to "--": on Sphinx, this can't be used
to identify a list, as it will format it badly. This can be
used, however, to identify a long hyphen - and "---" is an
even longer one.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> # cxl
2019-07-17 06:57:51 -03:00
Christoph Hellwig
a5008b59cd dma-direct: only limit the mapping size if swiotlb could be used
Don't just check for a swiotlb buffer, but also if buffering might
be required for this particular device.

Fixes: 133d624b1c ("dma: Introduce dma_max_mapping_size()")
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2019-07-17 08:25:45 +02:00
Christoph Hellwig
b866455423 dma-mapping: add a dma_addressing_limited helper
This helper returns if the device has issues addressing all present
memory in the system.

Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-07-17 08:25:40 +02:00
Zhenzhong Duan
b23e5844df xen/pv: Fix a boot up hang revealed by int3 self test
Commit 7457c0da02 ("x86/alternatives: Add int3_emulate_call()
selftest") is used to ensure there is a gap setup in int3 exception stack
which could be used for inserting call return address.

This gap is missed in XEN PV int3 exception entry path, then below panic
triggered:

[    0.772876] general protection fault: 0000 [#1] SMP NOPTI
[    0.772886] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #11
[    0.772893] RIP: e030:int3_magic+0x0/0x7
[    0.772905] RSP: 3507:ffffffff82203e98 EFLAGS: 00000246
[    0.773334] Call Trace:
[    0.773334]  alternative_instructions+0x3d/0x12e
[    0.773334]  check_bugs+0x7c9/0x887
[    0.773334]  ? __get_locked_pte+0x178/0x1f0
[    0.773334]  start_kernel+0x4ff/0x535
[    0.773334]  ? set_init_arg+0x55/0x55
[    0.773334]  xen_start_kernel+0x571/0x57a

For 64bit PV guests, Xen's ABI enters the kernel with using SYSRET, with
%rcx/%r11 on the stack. To convert back to "normal" looking exceptions,
the xen thunks do 'xen_*: pop %rcx; pop %r11; jmp *'.

E.g. Extracting 'xen_pv_trap xenint3' we have:
xen_xenint3:
 pop %rcx;
 pop %r11;
 jmp xenint3

As xenint3 and int3 entry code are same except xenint3 doesn't generate
a gap, we can fix it by using int3 and drop useless xenint3.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:59 +02:00
Zhenzhong Duan
bef6e0ae74 x86/xen: Add "nopv" support for HVM guest
PVH guest needs PV extentions to work, so "nopv" parameter should be
ignored for PVH but not for HVM guest.

If PVH guest boots up via the Xen-PVH boot entry, xen_pvh is set early,
we know it's PVH guest and ignore "nopv" parameter directly.

If PVH guest boots up via the normal boot entry same as HVM guest, it's
hard to distinguish PVH and HVM guest at that time. In this case, we
have to panic early if PVH is detected and nopv is enabled to avoid a
worse situation later.

Remove static from bool_x86_init_noop/x86_op_int_noop so they could be
used globally. Move xen_platform_hvm() after xen_hvm_guest_late_init()
to avoid compile error.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:59 +02:00
Zhenzhong Duan
cc8f3b4dd2 x86/paravirt: Remove const mark from x86_hyper_xen_hvm variable
.. as "nopv" support needs it to be changeable at boot up stage.

Checkpatch reports warning, so move variable declarations from
hypervisor.c to hypervisor.h

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:59 +02:00
Zhenzhong Duan
b39b049749 xen: Map "xen_nopv" parameter to "nopv" and mark it obsolete
Clean up unnecessory code after that operation.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:58 +02:00
Zhenzhong Duan
3097834637 x86: Add "nopv" parameter to disable PV extensions
In virtualization environment, PV extensions (drivers, interrupts,
timers, etc) are enabled in the majority of use cases which is the
best option.

However, in some cases (kexec not fully working, benchmarking)
we want to disable PV extensions. We have "xen_nopv" for that purpose
but only for XEN. For a consistent admin experience a common command
line parameter "nopv" set across all PV guest implementations is a
better choice.

There are guest types which just won't work without PV extensions,
like Xen PV, Xen PVH and jailhouse. add a "ignore_nopv" member to
struct hypervisor_x86 set to true for those guest types and call
the detect functions only if nopv is false or ignore_nopv is true.

Suggested-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:58 +02:00
Zhenzhong Duan
1b37683cda x86/xen: Mark xen_hvm_need_lapic() and xen_x2apic_para_available() as __init
.. as they are only called at early bootup stage. In fact, other
functions in x86_hyper_xen_hvm.init.* are all marked as __init.

Unexport xen_hvm_need_lapic as it's never used outside.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:58 +02:00
Juergen Gross
814bbf49dc xen: remove tmem driver
The Xen tmem (transcendent memory) driver can be removed, as the
related Xen hypervisor feature never made it past the "experimental"
state and will be removed in future Xen versions (>= 4.13).

The xen-selfballoon driver depends on tmem, so it can be removed, too.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:58 +02:00
Zhenzhong Duan
090d54bcbc Revert "x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized"
This reverts commit ca5d376e17.

Commit 8990cac6e5 ("x86/jump_label: Initialize static branching
early") adds jump_label_init() call in setup_arch() to make static
keys initialized early, so we could use the original simpler code
again.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:57 +02:00
Juergen Gross
bce5963bcb xen/events: fix binding user event channels to cpus
When binding an interdomain event channel to a vcpu via
IOCTL_EVTCHN_BIND_INTERDOMAIN not only the event channel needs to be
bound, but the affinity of the associated IRQi must be changed, too.
Otherwise the IRQ and the event channel won't be moved to another vcpu
in case the original vcpu they were bound to is going offline.

Cc: <stable@vger.kernel.org> # 4.13
Fixes: c48f64ab47 ("xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online VCPU")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17 08:09:57 +02:00
Christoph Hellwig
07d9aa1434 scsi: megaraid_sas: set an unlimited max_segment_size
When using a virt_boundary_mask, as done for NVMe devices attached to
megaraid_sas controllers, we require an unlimited max_segment_size as the
virt boundary merging code assumes that.  But we also need to propagate
that to the DMA mapping layer to make dma-debug happy.  The SCSI layer
takes care of that when using the per-host virt_boundary setting, but
given that megaraid_sas only wants to set the virt_boundary for actual
NVMe devices, we can't rely on that.  The DMA layer maximum segment is
global to the HBA however, so we have to set it explicitly.  This patch
assumes that megaraid_sas does not have a segment size limitation, which
seems true based on the SGL format, but will need to be verified.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16 23:02:11 -04:00
Christoph Hellwig
ce0ad85310 scsi: mpt3sas: set an unlimited max_segment_size for SAS 3.0 HBAs
When using a virt_boundary_mask, as done for NVMe devices attached to
mpt3sas controllers, we require an unlimited max_segment_size as the virt
boundary merging code assumes that.  But we also need to propagate that to
the DMA mapping layer to make dma-debug happy.  The SCSI layer takes care
of that when using the per-host virt_boundary setting, but given that
mpt3sas only wants to set the virt_boundary for actual NVMe devices, we
can't rely on that.  The DMA layer maximum segment is global to the HBA
however, so we have to set it explicitly.  This patch assumes that mpt3sas
does not have a segment size limitation, which seems true based on the SGL
format, but will need to be verified.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16 23:02:11 -04:00
Christoph Hellwig
8c175d3131 scsi: IB/srp: set virt_boundary_mask in the scsi host
This ensures all proper DMA layer handling is taken care of by the SCSI
midlayer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16 23:02:11 -04:00
Christoph Hellwig
09a4460ba4 scsi: IB/iser: set virt_boundary_mask in the scsi host
This ensures all proper DMA layer handling is taken care of by the SCSI
midlayer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16 23:02:10 -04:00
Christoph Hellwig
83eed4592f scsi: storvsc: set virt_boundary_mask in the scsi host template
This ensures all proper DMA layer handling is taken care of by the SCSI
midlayer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16 23:02:10 -04:00
Christoph Hellwig
552a990ca1 scsi: ufshcd: set max_segment_size in the scsi host template
We need to also mirror the value to the device to ensure IOMMU merging
doesn't undo it, and the SCSI host level parameter will ensure that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16 23:01:49 -04:00
Christoph Hellwig
bdd17bdef7 scsi: core: take the DMA max mapping size into account
We need to limit the device's max_sectors to what the DMA mapping
implementation can support.  If not, we risk running out of swiotlb
buffers easily.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16 23:01:49 -04:00
Christoph Hellwig
7ad388d8e4 scsi: core: add a host / host template field for the virt boundary
This allows drivers setting it up easily instead of branching out to block
layer calls in slave_alloc, and ensures the upgraded max_segment_size
setting gets picked up by the DMA layer.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Kashyap Desai < kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16 23:01:49 -04:00
Al Viro
56cbb429d9 switch the remnants of releasing the mountpoint away from fs_pin
We used to need rather convoluted ordering trickery to guarantee
that dput() of ex-mountpoints happens before the final mntput()
of the same.  Since we don't need that anymore, there's no point
playing with fs_pin for that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-07-16 22:52:37 -04:00
Al Viro
2763d11912 get rid of detach_mnt()
Lift getting the original mount (dentry is actually not needed at all)
of the mountpoint into the callers - to do_move_mount() and pivot_root()
level.  That simplifies the cleanup in those and allows to get saner
arguments for attach_mnt_recursive().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-07-16 22:50:11 -04:00
Pankaj Gupta
8c2e408e73 virtio_pmem: fix sparse warning
This patch fixes below sparse warning related to __virtio
type in virtio pmem driver. This is reported by Intel test
bot on linux-next tree.

nd_virtio.c:56:28: warning: incorrect type in assignment
                                (different base types)
nd_virtio.c:56:28:    expected unsigned int [unsigned] [usertype] type
nd_virtio.c:56:28:    got restricted __virtio32
nd_virtio.c:93:59: warning: incorrect type in argument 2
                                (different base types)
nd_virtio.c:93:59:    expected restricted __virtio32 [usertype] val
nd_virtio.c:93:59:    got unsigned int [unsigned] [usertype] ret

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-07-16 19:44:26 -07:00
Al Viro
4edbe133f8 make struct mountpoint bear the dentry reference to mountpoint, not struct mount
Using dput_to_list() to shift the contributing reference from ->mnt_mountpoint
to ->mnt_mp->m_dentry.  Dentries are dropped (with dput_to_list()) as soon
as struct mountpoint is destroyed; in cases where we are under namespace_sem
we use the global list, shrinking it in namespace_unlock().  In case of
detaching stuck MNT_LOCKed children at final mntput_no_expire() we use a local
list and shrink it ourselves.  ->mnt_ex_mountpoint crap is gone.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-07-16 22:43:40 -04:00
Ming Lei
f9b0530fa0 scsi: core: Fix race on creating sense cache
When scsi_init_sense_cache(host) is called concurrently from different
hosts, each code path may find that no cache has been created and
allocate a new one. The lack of locking can lead to potentially
overriding a cache allocated by a different host.

Fix the issue by moving 'mutex_lock(&scsi_sense_cache_mutex)' before
scsi_select_sense_cache().

Fixes: 0a6ac4ee7c ("scsi: respect unchecked_isa_dma for blk-mq")
Cc: Stable <stable@vger.kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16 22:39:24 -04:00
Damien Le Moal
0cdc58580b scsi: sd_zbc: Fix compilation warning
kbuild test robot gets the following compilation warning using gcc 7.4
cross compilation for c6x (GCC_VERSION=7.4.0 make.cross ARCH=c6x).

   In file included from include/asm-generic/bug.h:18:0,
                    from arch/c6x/include/asm/bug.h:12,
                    from include/linux/bug.h:5,
                    from include/linux/thread_info.h:12,
                    from include/asm-generic/current.h:5,
                    from ./arch/c6x/include/generated/asm/current.h:1,
                    from include/linux/sched.h:12,
                    from include/linux/blkdev.h:5,
                    from drivers//scsi/sd_zbc.c:11:
   drivers//scsi/sd_zbc.c: In function 'sd_zbc_read_zones':
>> include/linux/kernel.h:62:48: warning: 'zone_blocks' may be used
   uninitialized in this function [-Wmaybe-uninitialized]
    #define __round_mask(x, y) ((__typeof__(x))((y)-1))
                                                   ^
   drivers//scsi/sd_zbc.c:464:6: note: 'zone_blocks' was declared here
     u32 zone_blocks;
         ^~~~~~~~~~~

This is a false-positive report. The variable zone_blocks is always
initialized in sd_zbc_check_zones() before use. It is not initialized
only and only if sd_zbc_check_zones() fails.

Avoid this warning by initializing the zone_blocks variable to 0.

Fixes: 5f832a3958 ("scsi: sd_zbc: Fix sd_zbc_check_zones() error checks")
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16 22:37:22 -04:00
Colin Ian King
41a6bf6529 scsi: libfc: fix null pointer dereference on a null lport
Currently if lport is null then the null lport pointer is dereference when
printing out debug via the FC_LPORT_DB macro. Fix this by using the more
generic FC_LIBFC_DBG debug macro instead that does not use lport.

Addresses-Coverity: ("Dereference after null check")
Fixes: 7414705ea4 ("libfc: Add runtime debugging with debug_logging module parameter")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16 22:32:24 -04:00
Matthew Wilcox (Oracle)
23c84eb783 dax: Fix missed wakeup with PMD faults
RocksDB can hang indefinitely when using a DAX file.  This is due to
a bug in the XArray conversion when handling a PMD fault and finding a
PTE entry.  We use the wrong index in the hash and end up waiting on
the wrong waitqueue.

There's actually no need to wait; if we find a PTE entry while looking
for a PMD entry, we can return immediately as we know we should fall
back to a PTE fault (which may not conflict with the lock held).

We reuse the XA_RETRY_ENTRY to signal a conflicting entry was found.
This value can never be found in an XArray while holding its lock, so
it does not create an ambiguity.

Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/CAPcyv4hwHpX-MkUEqxwdTj7wCCZCN4RV-L4jsnuwLGyL_UEG4A@mail.gmail.com
Fixes: b15cd80068 ("dax: Convert page fault handlers to XArray")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Robert Barror <robert.barror@intel.com>
Reported-by: Seema Pandit <seema.pandit@intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-07-16 19:30:59 -07:00
Gustavo A. R. Silva
43e11fa2d1 fs/select.c: use struct_size() in kmalloc()
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array.  For example:

  struct foo {
       int stuff;
       struct boo entry[];
  };

  size = sizeof(struct foo) + count * sizeof(struct boo);
  instance = kmalloc(size, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can now
use the new struct_size() helper:

  instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL);

Also, notice that variable size is unnecessary, hence it is removed.

This code was detected with the help of Coccinelle.

Link: http://lkml.kernel.org/r/20190604164226.GA13823@embeddedor
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:25 -07:00
Daniel Jordan
79eb597cba mm: add account_locked_vm utility function
locked_vm accounting is done roughly the same way in five places, so
unify them in a helper.

Include the helper's caller in the debug print to distinguish between
callsites.

Error codes stay the same, so user-visible behavior does too.  The one
exception is that the -EPERM case in tce_account_locked_vm is removed
because Alexey has never seen it triggered.

[daniel.m.jordan@oracle.com: v3]
  Link: http://lkml.kernel.org/r/20190529205019.20927-1-daniel.m.jordan@oracle.com
[sfr@canb.auug.org.au: fix mm/util.c]
Link: http://lkml.kernel.org/r/20190524175045.26897-1-daniel.m.jordan@oracle.com
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Alan Tull <atull@kernel.org>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Moritz Fischer <mdf@kernel.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Steve Sistare <steven.sistare@oracle.com>
Cc: Wu Hao <hao.wu@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:25 -07:00
Robin Murphy
73b20c84d4 arm64: mm: implement pte_devmap support
In order for things like get_user_pages() to work on ZONE_DEVICE memory,
we need a software PTE bit to identify device-backed PFNs.  Hook this up
along with the relevant helpers to join in with ARCH_HAS_PTE_DEVMAP.

[robin.murphy@arm.com: build fixes]
  Link: http://lkml.kernel.org/r/13026c4e64abc17133bbfa07d7731ec6691c0bcd.1559050949.git.robin.murphy@arm.com
Link: http://lkml.kernel.org/r/817d92886fc3b33bcbf6e105ee83a74babb3a5aa.1558547956.git.robin.murphy@arm.com
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:25 -07:00
Robin Murphy
175967318c mm: introduce ARCH_HAS_PTE_DEVMAP
ARCH_HAS_ZONE_DEVICE is somewhat meaningless in itself, and combined
with the long-out-of-date comment can lead to the impression than an
architecture may just enable it (since __add_pages() now "comprehends
device memory" for itself) and expect things to work.

In practice, however, ZONE_DEVICE users have little chance of
functioning correctly without __HAVE_ARCH_PTE_DEVMAP, so let's clean
that up the same way as ARCH_HAS_PTE_SPECIAL and make it the proper
dependency so the real situation is clearer.

Link: http://lkml.kernel.org/r/87554aa78478a02a63f2c4cf60a847279ae3eb3b.1558547956.git.robin.murphy@arm.com
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:25 -07:00
Robin Murphy
7588adf8df mm: clean up is_device_*_page() definitions
Refactor is_device_{public,private}_page() with is_pci_p2pdma_page() to
make them all consistent in depending on their respective config options
even when CONFIG_DEV_PAGEMAP_OPS is enabled for other reasons.  This
allows a little more compile-time optimisation as well as the conceptual
and cosmetic cleanup.

Link: http://lkml.kernel.org/r/187c2ab27dea70635d375a61b2f2076d26c032b0.1558547956.git.robin.murphy@arm.com
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Suggested-by: Jerome Glisse <jglisse@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oliver O'Halloran <oohall@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:25 -07:00
Aneesh Kumar K.V
8aa3c927ec mm/mmap: move common defines to mman-common.h
Two architecture that use arch specific MMAP flags are powerpc and
sparc.  We still have few flag values common across them and other
architectures.  Consolidate this in mman-common.h.

Also update the comment to indicate where to find HugeTLB specific
reserved values

Link: http://lkml.kernel.org/r/20190604090950.31417-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:25 -07:00
Aneesh Kumar K.V
22fcea6f85 mm: move MAP_SYNC to asm-generic/mman-common.h
This enables support for synchronous DAX fault on powerpc

The generic changes are added as part of b6fb293f24 ("mm: Define
MAP_SYNC and VM_SYNC flags")

Without this, mmap returns EOPNOTSUPP for MAP_SYNC with
MAP_SHARED_VALIDATE

Instead of adding MAP_SYNC with same value to
arch/powerpc/include/uapi/asm/mman.h, I am moving the #define to
asm-generic/mman-common.h.  Two architectures using mman-common.h
directly are sparc and powerpc.  We should be able to consloidate more
#defines to mman-common.h.  That can be done as a separate patch.

Link: http://lkml.kernel.org/r/20190528091120.13322-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:25 -07:00
Pavel Tatashin
9f960da72b device-dax: "Hotremove" persistent memory that is used like normal RAM
It is now allowed to use persistent memory like a regular RAM, but
currently there is no way to remove this memory until machine is
rebooted.

This work expands the functionality to also allows hotremoving
previously hotplugged persistent memory, and recover the device for use
for other purposes.

To hotremove persistent memory, the management software must first
offline all memory blocks of dax region, and than unbind it from
device-dax/kmem driver.  So, operations should look like this:

  echo offline > /sys/devices/system/memory/memoryN/state
  ...
  echo dax0.0 > /sys/bus/dax/drivers/kmem/unbind

Note: if unbind is done without offlining memory beforehand, it won't be
possible to do dax0.0 hotremove, and dax's memory is going to be part of
System RAM until reboot.

Link: http://lkml.kernel.org/r/20190517215438.6487-4-pasha.tatashin@soleen.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:24 -07:00
Pavel Tatashin
eca499ab37 mm/hotplug: make remove_memory() interface usable
Presently the remove_memory() interface is inherently broken.  It tries
to remove memory but panics if some memory is not offline.  The problem
is that it is impossible to ensure that all memory blocks are offline as
this function also takes lock_device_hotplug that is required to change
memory state via sysfs.

So, between calling this function and offlining all memory blocks there
is always a window when lock_device_hotplug is released, and therefore,
there is always a chance for a panic during this window.

Make this interface to return an error if memory removal fails.  This
way it is safe to call this function without panicking machine, and also
makes it symmetric to add_memory() which already returns an error.

Link: http://lkml.kernel.org/r/20190517215438.6487-3-pasha.tatashin@soleen.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:24 -07:00
Pavel Tatashin
31e4ca92a7 device-dax: fix memory and resource leak if hotplug fails
Patch series ""Hotremove" persistent memory", v6.

Recently, adding a persistent memory to be used like a regular RAM was
added to Linux.  This work extends this functionality to also allow hot
removing persistent memory.

We (Microsoft) have an important use case for this functionality.

The requirement is for physical machines with small amount of RAM (~8G)
to be able to reboot in a very short period of time (<1s).  Yet, there
is a userland state that is expensive to recreate (~2G).

The solution is to boot machines with 2G preserved for persistent
memory.

Copy the state, and hotadd the persistent memory so machine still has
all 8G available for runtime.  Before reboot, offline and hotremove
device-dax 2G, copy the memory that is needed to be preserved to pmem0
device, and reboot.

The series of operations look like this:

1. After boot restore /dev/pmem0 to ramdisk to be consumed by apps.
   and free ramdisk.
2. Convert raw pmem0 to devdax
   ndctl create-namespace --mode devdax --map mem -e namespace0.0 -f
3. Hotadd to System RAM
   echo dax0.0 > /sys/bus/dax/drivers/device_dax/unbind
   echo dax0.0 > /sys/bus/dax/drivers/kmem/new_id
   echo online_movable > /sys/devices/system/memoryXXX/state
4. Before reboot hotremove device-dax memory from System RAM
   echo offline > /sys/devices/system/memoryXXX/state
   echo dax0.0 > /sys/bus/dax/drivers/kmem/unbind
5. Create raw pmem0 device
   ndctl create-namespace --mode raw  -e namespace0.0 -f
6. Copy the state that was stored by apps to ramdisk to pmem device
7. Do kexec reboot or reboot through firmware if firmware does not
   zero memory in pmem0 region (These machines have only regular
   volatile memory). So to have pmem0 device either memmap kernel
   parameter is used, or devices nodes in dtb are specified.

This patch (of 3):

When add_memory() fails, the resource and the memory should be freed.

Link: http://lkml.kernel.org/r/20190517215438.6487-2-pasha.tatashin@soleen.com
Fixes: c221c0b030 ("device-dax: "Hotplug" persistent memory for use like normal RAM")
Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Ross Zwisler <zwisler@kernel.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:24 -07:00
Tom Levy
97a0efea65 include/linux/lz4.h: fix spelling and copy-paste errors in documentation
Fix a few spelling and grammar errors, and two places where fast/safe in
the documentation did not match the function.

Link: http://lkml.kernel.org/r/20190321014452.13297-1-tomlevy93@gmail.com
Signed-off-by: Tom Levy <tomlevy93@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Kosina <trivial@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:24 -07:00
Kees Cook
a318f12ed8 ipc/mqueue.c: only perform resource calculation if user valid
Andreas Christoforou reported:

  UBSAN: Undefined behaviour in ipc/mqueue.c:414:49 signed integer overflow:
  9 * 2305843009213693951 cannot be represented in type 'long int'
  ...
  Call Trace:
    mqueue_evict_inode+0x8e7/0xa10 ipc/mqueue.c:414
    evict+0x472/0x8c0 fs/inode.c:558
    iput_final fs/inode.c:1547 [inline]
    iput+0x51d/0x8c0 fs/inode.c:1573
    mqueue_get_inode+0x8eb/0x1070 ipc/mqueue.c:320
    mqueue_create_attr+0x198/0x440 ipc/mqueue.c:459
    vfs_mkobj+0x39e/0x580 fs/namei.c:2892
    prepare_open ipc/mqueue.c:731 [inline]
    do_mq_open+0x6da/0x8e0 ipc/mqueue.c:771

Which could be triggered by:

        struct mq_attr attr = {
                .mq_flags = 0,
                .mq_maxmsg = 9,
                .mq_msgsize = 0x1fffffffffffffff,
                .mq_curmsgs = 0,
        };

        if (mq_open("/testing", 0x40, 3, &attr) == (mqd_t) -1)
                perror("mq_open");

mqueue_get_inode() was correctly rejecting the giant mq_msgsize, and
preparing to return -EINVAL.  During the cleanup, it calls
mqueue_evict_inode() which performed resource usage tracking math for
updating "user", before checking if there was a valid "user" at all
(which would indicate that the calculations would be sane).  Instead,
delay this check to after seeing a valid "user".

The overflow was real, but the results went unused, so while the flaw is
harmless, it's noisy for kernel fuzzers, so just fix it by moving the
calculation under the non-NULL "user" where it actually gets used.

Link: http://lkml.kernel.org/r/201906072207.ECB65450@keescook
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Andreas Christoforou <andreaschristofo@gmail.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:24 -07:00
Drew Davenport
6b15f678fb include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures
For architectures using __WARN_TAINT, the WARN_ON macro did not print
out the "cut here" string.  The other WARN_XXX macros would print "cut
here" inside __warn_printk, which is not called for WARN_ON since it
doesn't have a message to print.

Link: http://lkml.kernel.org/r/20190624154831.163888-1-ddavenport@chromium.org
Fixes: a7bed27af1 ("bug: fix "cut here" location for __WARN_TAINT architectures")
Signed-off-by: Drew Davenport <ddavenport@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:24 -07:00
Leonard Crestez
778c1f5ccb scripts/gdb: add helpers to find and list devices
Add helper commands and functions for finding pointers to struct device
by enumerating linux device bus/class infrastructure.  This can be used
to fetch subsystem and driver-specific structs:

  (gdb) p *$container_of($lx_device_find_by_class_name("net", "eth0"), "struct net_device", "dev")
  (gdb) p *$container_of($lx_device_find_by_bus_name("i2c", "0-004b"), "struct i2c_client", "dev")
  (gdb) p *(struct imx_port*)$lx_device_find_by_class_name("tty", "ttymxc1")->parent->driver_data

Several generic "lx-device-list" functions are included to enumerate
devices by bus and class:

  (gdb) lx-device-list-bus usb
  (gdb) lx-device-list-class
  (gdb) lx-device-list-tree &platform_bus

Similar information is available in /sys but pointer values are
deliberately hidden.

Link: http://lkml.kernel.org/r/c948628041311cbf1b9b4cff3dda7d2073cb3eaa.1561492937.git.leonard.crestez@nxp.com
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:24 -07:00
Leonard Crestez
8207d4a88e scripts/gdb: add lx-genpd-summary command
This is like /sys/kernel/debug/pm/pm_genpd_summary except it's
accessible through a debugger.

This can be useful if the target crashes or hangs because power domains
were not properly enabled.

Link: http://lkml.kernel.org/r/f9ee627a0d4f94b894aa202fee8a98444049bed8.1561492937.git.leonard.crestez@nxp.com
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Cc: Kieran Bingham <kbingham@kernel.org>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:24 -07:00
Miroslav Lichvar
5515e9a627 drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl
The PPS assert/clear offset corrections are set by the PPS_SETPARAMS
ioctl in the pps_ktime structs, which also contain flags.  The flags are
not initialized by applications (using the timepps.h header) and they
are not used by the kernel for anything except returning them back in
the PPS_GETPARAMS ioctl.

Set the flags to zero to make it clear they are unused and avoid leaking
uninitialized data of the PPS_SETPARAMS caller to other applications
that have a read access to the PPS device.

Link: http://lkml.kernel.org/r/20190702092251.24303-1-mlichvar@redhat.com
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rodolfo Giometti <giometti@enneenne.com>
Cc: Greg KH <greg@kroah.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:24 -07:00
Joel Fernandes (Google)
f57e515a1b kernel/pid.c: convert struct pid count to refcount_t
struct pid's count is an atomic_t field used as a refcount.  Use
refcount_t for it which is basically atomic_t but does additional
checking to prevent use-after-free bugs.

For memory ordering, the only change is with the following:

 -	if ((atomic_read(&pid->count) == 1) ||
 -	     atomic_dec_and_test(&pid->count)) {
 +	if (refcount_dec_and_test(&pid->count)) {
 		kmem_cache_free(ns->pid_cachep, pid);

Here the change is from: Fully ordered --> RELEASE + ACQUIRE (as per
refcount-vs-atomic.rst) This ACQUIRE should take care of making sure the
free happens after the refcount_dec_and_test().

The above hunk also removes atomic_read() since it is not needed for the
code to work and it is unclear how beneficial it is.  The removal lets
refcount_dec_and_test() check for cases where get_pid() happened before
the object was freed.

Link: http://lkml.kernel.org/r/20190701183826.191936-1-joel@joelfernandes.org
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Jann Horn <jannh@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: KJ Tsanaktsidis <ktsanaktsidis@zendesk.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:24 -07:00
Dan Carpenter
156e0b1a81 drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings
The dev_info.name[] array has space for RIO_MAX_DEVNAME_SZ + 1
characters.  But the problem here is that we don't ensure that the user
put a NUL terminator on the end of the string.  It could lead to an out
of bounds read.

Link: http://lkml.kernel.org/r/20190529110601.GB19119@mwanda
Fixes: e8de370188 ("rapidio: add mport char device driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:24 -07:00
Oleg Nesterov
ac30102062 select: shift restore_saved_sigmask_unless() into poll_select_copy_remaining()
Now that restore_saved_sigmask_unless() is always called with the same
argument right before poll_select_copy_remaining() we can move it into
poll_select_copy_remaining() and make it the only caller of restore() in
fs/select.c.

The patch also renames poll_select_copy_remaining(),
poll_select_finish() looks better after this change.

kern_select() doesn't use set_user_sigmask(), so in this case
poll_select_finish() does restore_saved_sigmask_unless() "for no
reason".  But this won't hurt, and WARN_ON(!TIF_SIGPENDING) is still
valid.

Link: http://lkml.kernel.org/r/20190606140915.GC13440@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Laight <David.Laight@aculab.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Eric Wong <e@80x24.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-16 19:23:24 -07:00