Commit Graph

948892 Commits

Author SHA1 Message Date
Christopher Lameter
aa456c7aeb slub: remove kmalloc under list_lock from list_slab_objects() V2
list_slab_objects() is called when a slab is destroyed and there are
objects still left to list the objects in the syslog.  This is a pretty
rare event.

And there it seems we take the list_lock and call kmalloc while holding
that lock.

Perform the allocation in free_partial() before the list_lock is taken.

Fixes: bbd7d57bfe ("slub: Potential stack overflow")
Signed-off-by: Christopher Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Yu Zhao <yuzhao@google.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.21.2002031721250.1668@www.lameter.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:06 -07:00
Christoph Lameter
d7660ce591 slub: Remove userspace notifier for cache add/remove
I came across some unnecessary uevents once again which reminded me
this.  The patch seems to be lost in the leaves of the original
discussion [1], so resending.

[1] https://lore.kernel.org/r/alpine.DEB.2.21.2001281813130.745@www.lameter.com

Kmem caches are internal kernel structures so it is strange that
userspace notifiers would be needed.  And I am not aware of any use of
these notifiers.  These notifiers may just exist because in the initial
slub release the sysfs code was copied from another subsystem.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Koutný <mkoutny@suse.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Link: http://lkml.kernel.org/r/20200423115721.19821-1-mkoutny@suse.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:06 -07:00
Dongli Zhang
52f2347808 mm/slub.c: fix corrupted freechain in deactivate_slab()
The slub_debug is able to fix the corrupted slab freelist/page.
However, alloc_debug_processing() only checks the validity of current
and next freepointer during allocation path.  As a result, once some
objects have their freepointers corrupted, deactivate_slab() may lead to
page fault.

Below is from a test kernel module when 'slub_debug=PUF,kmalloc-128
slub_nomerge'.  The test kernel corrupts the freepointer of one free
object on purpose.  Unfortunately, deactivate_slab() does not detect it
when iterating the freechain.

  BUG: unable to handle page fault for address: 00000000123456f8
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  ... ...
  RIP: 0010:deactivate_slab.isra.92+0xed/0x490
  ... ...
  Call Trace:
   ___slab_alloc+0x536/0x570
   __slab_alloc+0x17/0x30
   __kmalloc+0x1d9/0x200
   ext4_htree_store_dirent+0x30/0xf0
   htree_dirblock_to_tree+0xcb/0x1c0
   ext4_htree_fill_tree+0x1bc/0x2d0
   ext4_readdir+0x54f/0x920
   iterate_dir+0x88/0x190
   __x64_sys_getdents+0xa6/0x140
   do_syscall_64+0x49/0x170
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

Therefore, this patch adds extra consistency check in deactivate_slab().
Once an object's freepointer is corrupted, all following objects
starting at this object are isolated.

[akpm@linux-foundation.org: fix build with CONFIG_SLAB_DEBUG=n]
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Jin <joe.jin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Link: http://lkml.kernel.org/r/20200331031450.12182-1-dongli.zhang@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:06 -07:00
Vlastimil Babka
49f2d2419d usercopy: mark dma-kmalloc caches as usercopy caches
We have seen a "usercopy: Kernel memory overwrite attempt detected to
SLUB object 'dma-kmalloc-1 k' (offset 0, size 11)!" error on s390x, as
IUCV uses kmalloc() with __GFP_DMA because of memory address
restrictions.  The issue has been discussed [2] and it has been noted
that if all the kmalloc caches are marked as usercopy, there's little
reason not to mark dma-kmalloc caches too.  The 'dma' part merely means
that __GFP_DMA is used to restrict memory address range.

As Jann Horn put it [3]:
 "I think dma-kmalloc slabs should be handled the same way as normal
  kmalloc slabs. When a dma-kmalloc allocation is freshly created, it is
  just normal kernel memory - even if it might later be used for DMA -,
  and it should be perfectly fine to copy_from_user() into such
  allocations at that point, and to copy_to_user() out of them at the
  end. If you look at the places where such allocations are created, you
  can see things like kmemdup(), memcpy() and so on - all normal
  operations that shouldn't conceptually be different from usercopy in
  any relevant way."

Thus this patch marks the dma-kmalloc-* caches as usercopy.

[1] https://bugzilla.suse.com/show_bug.cgi?id=1156053
[2] https://lore.kernel.org/kernel-hardening/bfca96db-bbd0-d958-7732-76e36c667c68@suse.cz/
[3] https://lore.kernel.org/kernel-hardening/CAG48ez1a4waGk9kB0WLaSbs4muSoK0AYAVk8=XYaKj4_+6e6Hg@mail.gmail.com/

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Jiri Slaby <jslaby@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christopher Lameter <cl@linux.com>
Cc: Julian Wiedmann <jwi@linux.ibm.com>
Cc: Ursula Braun <ubraun@linux.ibm.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: David Windsor <dave@nullcore.net>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Dave Kleikamp <dave.kleikamp@oracle.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Luis de Bethencourt <luisbg@kernel.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Matthew Garrett <mjg59@google.com>
Cc: Michal Kubecek <mkubecek@suse.cz>
Link: http://lkml.kernel.org/r/7d810f6d-8085-ea2f-7805-47ba3842dc50@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:06 -07:00
Jeff Layton
485e9605c0 fs/buffer.c: record blockdev write errors in super_block that it backs
When syncing out a block device (a'la __sync_blockdev), any error
encountered will only be recorded in the bd_inode's mapping.  When the
blockdev contains a filesystem however, we'd like to also record the
error in the super_block that's stored there.

Make mark_buffer_write_io_error also record the error in the
corresponding super_block when a writeback error occurs and the block
device contains a mounted superblock.

Since superblocks are RCU freed, hold the rcu_read_lock to ensure that
the superblock doesn't go away while we're marking it.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andres Freund <andres@anarazel.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Link: http://lkml.kernel.org/r/20200428135155.19223-3-jlayton@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:05 -07:00
Jeff Layton
735e4ae5ba vfs: track per-sb writeback errors and report them to syncfs
Patch series "vfs: have syncfs() return error when there are writeback
errors", v6.

Currently, syncfs does not return errors when one of the inodes fails to
be written back.  It will return errors based on the legacy AS_EIO and
AS_ENOSPC flags when syncing out the block device fails, but that's not
particularly helpful for filesystems that aren't backed by a blockdev.
It's also possible for a stray sync to lose those errors.

The basic idea in this set is to track writeback errors at the
superblock level, so that we can quickly and easily check whether
something bad happened without having to fsync each file individually.
syncfs is then changed to reliably report writeback errors after they
occur, much in the same fashion as fsync does now.

This patch (of 2):

Usually we suggest that applications call fsync when they want to ensure
that all data written to the file has made it to the backing store, but
that can be inefficient when there are a lot of open files.

Calling syncfs on the filesystem can be more efficient in some
situations, but the error reporting doesn't currently work the way most
people expect.  If a single inode on a filesystem reports a writeback
error, syncfs won't necessarily return an error.  syncfs only returns an
error if __sync_blockdev fails, and on some filesystems that's a no-op.

It would be better if syncfs reported an error if there were any
writeback failures.  Then applications could call syncfs to see if there
are any errors on any open files, and could then call fsync on all of
the other descriptors to figure out which one failed.

This patch adds a new errseq_t to struct super_block, and has
mapping_set_error also record writeback errors there.

To report those errors, we also need to keep an errseq_t in struct file
to act as a cursor.  This patch adds a dedicated field for that purpose,
which slots nicely into 4 bytes of padding at the end of struct file on
x86_64.

An earlier version of this patch used an O_PATH file descriptor to cue
the kernel that the open file should track the superblock error and not
the inode's writeback error.

I think that API is just too weird though.  This is simpler and should
make syncfs error reporting "just work" even if someone is multiplexing
fsync and syncfs on the same fds.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Andres Freund <andres@anarazel.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: David Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/r/20200428135155.19223-1-jlayton@kernel.org
Link: http://lkml.kernel.org/r/20200428135155.19223-2-jlayton@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:05 -07:00
Andrew Morton
78128fabd0 arch/parisc/include/asm/pgtable.h: remove unused `old_pte'
parisc's set_pte_at() macro has set-but-not-used variable:

  include/linux/pgtable.h: In function 'pte_clear_not_present_full':
  arch/parisc/include/asm/pgtable.h:96:9: warning: variable 'old_pte' set but not used [-Wunused-but-set-variable]

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:05 -07:00
Gang He
912f655d78 ocfs2: mount shared volume without ha stack
Usually we create and use a ocfs2 shared volume on the top of ha stack.
For pcmk based ha stack, which includes DLM, corosync and pacemaker
services.

The customers complained they could not mount existent ocfs2 volume in
the single node without ha stack, e.g.  single node backup/restore
scenario.

Like this case, the customers just want to access the data from the
existent ocfs2 volume quickly, but do not want to restart or setup ha
stack.

Then, I'd like to add a mount option "nocluster", if the users use this
option to mount a ocfs2 shared volume, the whole mount will not depend
on the ha related services.  the command will mount the existent ocfs2
volume directly (like local mount), for avoiding setup the ha stack.

Signed-off-by: Gang He <ghe@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Link: http://lkml.kernel.org/r/20200423053300.22661-1-ghe@suse.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:05 -07:00
Jules Irenge
8f745e62a1 ocfs2: add missing annotation for dlm_empty_lockres()
Sparse reports a warning at dlm_empty_lockres()

  warning: context imbalance in dlm_purge_lockres() - unexpected unlock

The root cause is the missing annotation at dlm_purge_lockres()

Add the missing __must_hold(&dlm->spinlock)

Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Link: http://lkml.kernel.org/r/20200403160505.2832-4-jbi.octave@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:05 -07:00
Philippe Liard
93e72b3c61 squashfs: migrate from ll_rw_block usage to BIO
ll_rw_block() function has been deprecated in favor of BIO which appears
to come with large performance improvements.

This patch decreases boot time by close to 40% when using squashfs for
the root file-system.  This is observed at least in the context of
starting an Android VM on Chrome OS using crosvm.  The patch was tested
on 4.19 as well as master.

This patch is largely based on Adrien Schildknecht's patch that was
originally sent as https://lkml.org/lkml/2017/9/22/814 though with some
significant changes and simplifications while also taking Phillip
Lougher's feedback into account, around preserving support for
FILE_CACHE in particular.

[akpm@linux-foundation.org: fix build error reported by Randy]
  Link: http://lkml.kernel.org/r/319997c2-5fc8-f889-2ea3-d913308a7c1f@infradead.org
Signed-off-by: Philippe Liard <pliard@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Adrien Schildknecht <adrien+dev@schischi.me>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Daniel Rosenberg <drosen@google.com>
Link: https://chromium.googlesource.com/chromiumos/platform/crosvm
Link: http://lkml.kernel.org/r/20191106074238.186023-1-pliard@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:59:05 -07:00
Bob Peterson
ea22eee4e6 gfs2: Allow lock_nolock mount to specify jid=X
Before this patch, a simple typo accidentally added \n to the jid=
string for lock_nolock mounts. This made it impossible to mount a
gfs2 file system with a journal other than journal0. Thus:

mount -tgfs2 -o hostdata="jid=1" <device> <mount pt>

Resulted in:
mount: wrong fs type, bad option, bad superblock on <device>

In most cases this is not a problem. However, for debugging and
testing purposes we sometimes want to test the integrity of other
journals. This patch removes the unnecessary \n and thus allows
lock_nolock users to specify an alternate journal.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2020-06-02 19:45:05 +02:00
Bob Peterson
bbae10fac2 gfs2: Don't ignore inode write errors during inode_go_sync
Before for this patch, function inode_go_sync ignored io errors
during inode_go_sync, overwriting them with metadata write errors:

		error = filemap_fdatawait(mapping);
		mapping_set_error(mapping, error);
	}
	error = filemap_fdatawait(metamapping);
	...
	return error;

So any errors returned by the inode write would be forgotten if the
metadata write succeeded. This patch still does both writes, but
only sets error if it's still zero. That way, any errors will be
reported by to the caller, do_xmote, which will take appropriate
action and report the error.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2020-06-02 19:45:05 +02:00
Mauro Carvalho Chehab
3700bec332 docs: filesystems: convert gfs2-glocks.txt to ReST
- Add a SPDX header;
- Adjust document and section titles;
- Some whitespace fixes and new line breaks;
- Add table markups;
- Use notes markups;
- Add it to filesystems/index.rst.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2020-06-02 19:45:05 +02:00
Linus Torvalds
17839856fd gup: document and work around "COW can break either way" issue
Doing a "get_user_pages()" on a copy-on-write page for reading can be
ambiguous: the page can be COW'ed at any time afterwards, and the
direction of a COW event isn't defined.

Yes, whoever writes to it will generally do the COW, but if the thread
that did the get_user_pages() unmapped the page before the write (and
that could happen due to memory pressure in addition to any outright
action), the writer could also just take over the old page instead.

End result: the get_user_pages() call might result in a page pointer
that is no longer associated with the original VM, and is associated
with - and controlled by - another VM having taken it over instead.

So when doing a get_user_pages() on a COW mapping, the only really safe
thing to do would be to break the COW when getting the page, even when
only getting it for reading.

At the same time, some users simply don't even care.

For example, the perf code wants to look up the page not because it
cares about the page, but because the code simply wants to look up the
physical address of the access for informational purposes, and doesn't
really care about races when a page might be unmapped and remapped
elsewhere.

This adds logic to force a COW event by setting FOLL_WRITE on any
copy-on-write mapping when FOLL_GET (or FOLL_PIN) is used to get a page
pointer as a result.

The current semantics end up being:

 - __get_user_pages_fast(): no change. If you don't ask for a write,
   you won't break COW. You'd better know what you're doing.

 - get_user_pages_fast(): the fast-case "look it up in the page tables
   without anything getting mmap_sem" now refuses to follow a read-only
   page, since it might need COW breaking.  Which happens in the slow
   path - the fast path doesn't know if the memory might be COW or not.

 - get_user_pages() (including the slow-path fallback for gup_fast()):
   for a COW mapping, turn on FOLL_WRITE for FOLL_GET/FOLL_PIN, with
   very similar semantics to FOLL_FORCE.

If it turns out that we want finer granularity (ie "only break COW when
it might actually matter" - things like the zero page are special and
don't need to be broken) we might need to push these semantics deeper
into the lookup fault path.  So if people care enough, it's possible
that we might end up adding a new internal FOLL_BREAK_COW flag to go
with the internal FOLL_COW flag we already have for tracking "I had a
COW".

Alternatively, if it turns out that different callers might want to
explicitly control the forced COW break behavior, we might even want to
make such a flag visible to the users of get_user_pages() instead of
using the above default semantics.

But for now, this is mostly commentary on the issue (this commit message
being a lot bigger than the patch, and that patch in turn is almost all
comments), with that minimal "enable COW breaking early" logic using the
existing FOLL_WRITE behavior.

[ It might be worth noting that we've always had this ambiguity, and it
  could arguably be seen as a user-space issue.

  You only get private COW mappings that could break either way in
  situations where user space is doing cooperative things (ie fork()
  before an execve() etc), but it _is_ surprising and very subtle, and
  fork() is supposed to give you independent address spaces.

  So let's treat this as a kernel issue and make the semantics of
  get_user_pages() easier to understand. Note that obviously a true
  shared mapping will still get a page that can change under us, so this
  does _not_ mean that get_user_pages() somehow returns any "stable"
  page ]

Reported-by: Jann Horn <jannh@google.com>
Tested-by: Christoph Hellwig <hch@lst.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kirill Shutemov <kirill@shutemov.name>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-02 10:19:17 -07:00
Ashok Raj
3247bd10a4 PCI: Add ACS quirk for Intel Root Complex Integrated Endpoints
All Intel platforms guarantee that all root complex implementations must
send transactions up to IOMMU for address translations. Hence for Intel
RCiEP devices, we can assume some ACS-type isolation even without an ACS
capability.

From the Intel VT-d spec, r3.1, sec 3.16 ("Root-Complex Peer to Peer
Considerations"):

  When DMA remapping is enabled, peer-to-peer requests through the
  Root-Complex must be handled as follows:

  - The input address in the request is translated (through first-level,
    second-level or nested translation) to a host physical address (HPA).
    The address decoding for peer addresses must be done only on the
    translated HPA. Hardware implementations are free to further limit
    peer-to-peer accesses to specific host physical address regions (or
    to completely disallow peer-forwarding of translated requests).

  - Since address translation changes the contents (address field) of
    the PCI Express Transaction Layer Packet (TLP), for PCI Express
    peer-to-peer requests with ECRC, the Root-Complex hardware must use
    the new ECRC (re-computed with the translated address) if it
    decides to forward the TLP as a peer request.

  - Root-ports, and multi-function root-complex integrated endpoints, may
    support additional peer-to-peer control features by supporting PCI
    Express Access Control Services (ACS) capability. Refer to ACS
    capability in PCI Express specifications for details.

Since Linux didn't give special treatment to allow this exception, certain
RCiEP MFD devices were grouped in a single IOMMU group. This doesn't permit
a single device to be assigned to a guest for instance.

In one vendor system: Device 14.x were grouped in a single IOMMU group.

  /sys/kernel/iommu_groups/5/devices/0000:00:14.0
  /sys/kernel/iommu_groups/5/devices/0000:00:14.2
  /sys/kernel/iommu_groups/5/devices/0000:00:14.3

After this patch:

  /sys/kernel/iommu_groups/5/devices/0000:00:14.0
  /sys/kernel/iommu_groups/5/devices/0000:00:14.2
  /sys/kernel/iommu_groups/6/devices/0000:00:14.3 <<< new group

14.0 and 14.2 are integrated devices, but legacy end points, whereas 14.3
was a PCIe-compliant RCiEP.

  00:14.3 Network controller: Intel Corporation Device 9df0 (rev 30)
    Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00

This permits assigning this device to a guest VM.

[bhelgaas: drop "Fixes" tag since this doesn't fix a bug in that commit]
Link: https://lore.kernel.org/r/1590699462-7131-1-git-send-email-ashok.raj@intel.com
Tested-by: Darrel Goeddel <dgoeddel@forcepoint.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Mark Scott <mscott@forcepoint.com>,
Cc: Romil Sharma <rsharma@forcepoint.com>
2020-06-02 12:15:04 -05:00
Arnd Bergmann
d2353bad2c ARM: omap2: fix omap5_realtime_timer_init definition
There is one more regression introduced by the last build fix:

arch/arm/mach-omap2/timer.c:170:6: error: attribute declaration must precede definition [-Werror,-Wignored-attributes]
void __init omap5_realtime_timer_init(void)
     ^
arch/arm/mach-omap2/common.h:118:20: note: previous definition is here
static inline void omap5_realtime_timer_init(void)
                   ^
arch/arm/mach-omap2/timer.c:170:13: error: redefinition of 'omap5_realtime_timer_init'
void __init omap5_realtime_timer_init(void)
            ^
arch/arm/mach-omap2/common.h:118:20: note: previous definition is here
static inline void omap5_realtime_timer_init(void)

Address this by removing the now obsolete #ifdefs in that file and
just building the entire file based on the flag that controls the
omap5_realtime_timer_init function declaration.

Link: https://lore.kernel.org/r/20200529201701.521933-1-arnd@arndb.de
Fixes: d86ad463d6 ("ARM: OMAP2+: Fix regression for using local timer on non-SMP SoCs")
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-06-02 19:14:21 +02:00
Arnd Bergmann
603986a7a4 Merge tag 'keystone_dts_for_5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone into arm/dt
ARM: Keystone DTS updates for 5.7

Add display support for K2G EVM Board

* tag 'keystone_dts_for_5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone:
  ARM: dts: keystone-k2g-evm: add HDMI video support
  ARM: dts: keystone-k2g: Add DSS node

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2020-06-02 19:12:35 +02:00
Masami Hiramatsu
382561a0f1 selftests/sysctl: Make sysctl test driver as a module
Fix config file to require CONFIG_TEST_SYSCTL=m instead of y
because this driver introduces a test sysctl interfaces which
are normally not used, and only used for the selftest.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-06-02 10:27:02 -06:00
Masami Hiramatsu
eee470e073 selftests/sysctl: Fix to load test_sysctl module
Fix to load test_sysctl.ko module correctly.

sysctl.sh checks whether the test module is embedded (or loaded
already) or not at first, and if not, it returns skip error
instead of trying modprobe. Thus, there is no chance to load the
test_sysctl test module.

Instead, this removes that module embedded check and returns
skip error only if it ensures that there is no embedded test
module *and* no loadable test module.

This also avoid referring config file since that is not
installed.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-06-02 10:26:46 -06:00
Tony Lindgren
0df12a01f4 ARM: dts: omap4-droid4: Fix spi configuration and increase rate
We can currently sometimes get "RXS timed out" errors and "EOT timed out"
errors with spi transfers.

These errors can be made easy to reproduce by reading the cpcap iio
values in a loop while keeping the CPUs busy by also reading /dev/urandom.

The "RXS timed out" errors we can fix by adding spi-cpol and spi-cpha
in addition to the spi-cs-high property we already have.

The "EOT timed out" errors we can fix by increasing the spi clock rate
to 9.6 MHz. Looks similar MC13783 PMIC says it works at spi clock rates
up to 20 MHz, so let's assume we can pick any rate up to 20 MHz also
for cpcap.

Cc: maemo-leste@lists.dyne.org
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2020-06-02 09:26:25 -07:00
Masami Hiramatsu
2f56f84511 lib: Make test_sysctl initialized as module
test_sysctl.c is expected to be used as a module, but since
it does not use module_init(), it never be registered as
a module and not appeared under /sys/module/.
In the result, the selftests/sysctl/sysctl.sh always fails
to find the test module and is skipped.

This makes test_sysctl.c initialized as a module by module_init()
and allow sysctl.sh to find the test module is loaded.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-06-02 10:26:14 -06:00
Masami Hiramatsu
d0676871fd lib: Make prime number generator independently selectable
Make prime number generator independently selectable from
kconfig. This allows us to enable CONFIG_PRIME_NUMBERS=m
and run the tools/testing/selftests/lib/prime_numbers.sh
without other DRM selftest modules.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-06-02 10:25:20 -06:00
David Howells
b6f61c3146 keys: Implement update for the big_key type
Implement the ->update op for the big_key type.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
2020-06-02 17:22:31 +01:00
Jason A. Donenfeld
521fd61c84 security/keys: rewrite big_key crypto to use library interface
A while back, I noticed that the crypto and crypto API usage in big_keys
were entirely broken in multiple ways, so I rewrote it. Now, I'm
rewriting it again, but this time using the simpler ChaCha20Poly1305
library function. This makes the file considerably more simple; the
diffstat alone should justify this commit. It also should be faster,
since it no longer requires a mutex around the "aead api object" (nor
allocations), allowing us to encrypt multiple items in parallel. We also
benefit from being able to pass any type of pointer, so we can get rid
of the ridiculously complex custom page allocator that big_key really
doesn't need.

[DH: Change the select CRYPTO_LIB_CHACHA20POLY1305 to a depends on as
 select doesn't propagate and the build can end up with an =y dependending
 on some =m pieces.

 The depends on CRYPTO also had to be removed otherwise the configurator
 complains about a recursive dependency.]

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David Howells <dhowells@redhat.com>
2020-06-02 17:22:31 +01:00
Gustavo A. R. Silva
2ce113fa52 KEYS: Replace zero-length array with flexible-array
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2020-06-02 17:22:31 +01:00
Ben Boeckel
352780b61e Documentation: security: core.rst: add missing argument
This argument was just never documented in the first place.

Signed-off-by: Ben Boeckel <mathstuf@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2020-06-02 17:22:31 +01:00
yangerkun
5ef1596813 locks: add locks_move_blocks in posix_lock_inode
We forget to call locks_move_blocks in posix_lock_inode when try to
process same owner and different types.

Signed-off-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
2020-06-02 12:08:25 -04:00
Kefeng Wang
b3e2d20973 rcuperf: Fix printk format warning
Using "%zu" to fix following warning,
kernel/rcu/rcuperf.c: In function ‘kfree_perf_init’:
include/linux/kern_levels.h:5:18: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘unsigned int’ [-Wformat=]

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-06-02 08:41:37 -07:00
Vivek Kasireddy
d161306161 drm/i915/dsi: Dont forget to clean up the connector on error (v2)
If an error is encountered during the DSI initialization setup, the
drm connector object also needs to be cleaned up along with the encoder.
The error can happen due to a missing mode in the VBT or for other
reasons.

v2: Rephrase the commit message to make it more clear.

Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200522202630.7604-1-vivek.kasireddy@intel.com
2020-06-02 08:34:57 -07:00
Sebastian Reichel
972eabb97a Revert "power: supply: sbs-battery: simplify read_read_string_data"
The commit is a nice cleanup, but breaks booting on exynos5 based
chromebooks. It's seems to come down to exynos5's i2c driver not
implementing I2C_FUNC_SMBUS_READ_BLOCK_DATA. It's not yet clear
why that breaks boot / massively slows it down when userspace
starts, so revert the problematic patch.

This reverts commit c4b12a2f3f.

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2020-06-02 17:08:33 +02:00
Sebastian Reichel
cf1eb321d1 Revert "power: supply: sbs-battery: add PEC support"
This depends on the simplification of sbs_read_string_data, which
breaks booting exynos5 based chromebooks. More investigation is
required, so this patch and the simplification patch are reverted
for this merge window.

Note, that this is only a partial revert, since sbs_update_presence()
has not been removed. It is also required for the charger broadcast
disabling.

This reverts commit 79bcd5a4a6.

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2020-06-02 16:59:08 +02:00
Aurelien Aptel
5f68ea4aa9 cifs: multichannel: move channel selection in function
This commit moves channel picking code in separate function.

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-06-02 09:58:41 -05:00
Steve French
bbbf9eafbf cifs: fix minor typos in comments and log messages
Fix four minor typos in comments and log messages

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
2020-06-02 09:58:30 -05:00
Steve French
3563a6f468 smb3: minor update to compression header definitions
MS-SMB2 specification was updated in March.  Make minor additions
and corrections to compression related definitions in smb2pdu.h

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
2020-06-02 09:58:17 -05:00
Wei Li
c893de12e1 kdb: Remove the misfeature 'KDBFLAGS'
Currently, 'KDBFLAGS' is an internal variable of kdb, it is combined
by 'KDBDEBUG' and state flags. It will be shown only when 'KDBDEBUG'
is set, and the user can define an environment variable named 'KDBFLAGS'
too. These are puzzling indeed.

After communication with Daniel, it seems that 'KDBFLAGS' is a misfeature.
So let's replace 'KDBFLAGS' with 'KDBDEBUG' to just show the value we
wrote into. After this modification, we can use `md4c1 kdb_flags` instead,
to observe the state flags.

Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Wei Li <liwei391@huawei.com>
Link: https://lore.kernel.org/r/20200521072125.21103-1-liwei391@huawei.com
[daniel.thompson@linaro.org: Make kdb_flags unsigned to avoid arithmetic
right shift]
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-06-02 15:15:46 +01:00
Douglas Anderson
1b310030bb kdb: Cleanup math with KDB_CMD_HISTORY_COUNT
From code inspection the math in handle_ctrl_cmd() looks super sketchy
because it subjects -1 from cmdptr and then does a "%
KDB_CMD_HISTORY_COUNT".  It turns out that this code works because
"cmdptr" is unsigned and KDB_CMD_HISTORY_COUNT is a nice power of 2.
Let's make this a little less sketchy.

This patch should be a no-op.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200507161125.1.I2cce9ac66e141230c3644b8174b6c15d4e769232@changeid
Reviewed-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-06-02 15:15:46 +01:00
Sumit Garg
195867ffea serial: amba-pl011: Support kgdboc_earlycon
Implement the read() function in the early console driver. With
recently added kgdboc_earlycon feature, this allows you to use kgdb
to debug fairly early into the system boot.

We only bother implementing this if polling is enabled since kgdb can't
be enabled without that.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200507130644.v4.12.I8ee0811f0e0816dd8bfe7f2f5540b3dba074fae8@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-06-02 15:15:46 +01:00
Douglas Anderson
c5e7467d92 serial: 8250_early: Support kgdboc_earlycon
Implement the read() function in the early console driver.  With
recent kgdb patches this allows you to use kgdb to debug fairly early
into the system boot.

We only bother implementing this if polling is enabled since kgdb
can't be enabled without that.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200507130644.v4.11.I8f668556c244776523320a95b09373a86eda11b7@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-06-02 15:15:46 +01:00
Douglas Anderson
205b5bdda2 serial: qcom_geni_serial: Support kgdboc_earlycon
Implement the read() function in the early console driver.  With
recent kgdb patches this allows you to use kgdb to debug fairly early
into the system boot.

We only bother implementing this if polling is enabled since kgdb
can't be enabled without that.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20200507130644.v4.10.If2deff9679a62c1ce1b8f2558a8635dc837adf8c@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-06-02 15:15:46 +01:00
Daniel Thompson
a4912303ac serial: kgdboc: Allow earlycon initialization to be deferred
Currently there is no guarantee that an earlycon will be initialized
before kgdboc tries to adopt it. Almost the opposite: on systems
with ACPI then if earlycon has no arguments then it is guaranteed that
earlycon will not be initialized.

This patch mitigates the problem by giving kgdboc_earlycon a second
chance during console_init(). This isn't quite as good as stopping during
early parameter parsing but it is still early in the kernel boot.

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200430161741.1832050-1-daniel.thompson@linaro.org
Reviewed-by: Douglas Anderson <dianders@chromium.org>
2020-06-02 15:15:45 +01:00
Douglas Anderson
f71fc3bc7b Documentation: kgdboc: Document new kgdboc_earlycon parameter
The recent patch ("kgdboc: Add kgdboc_earlycon to support early kgdb
using boot consoles") adds a new kernel command line parameter.
Document it.

Note that the patch adding the feature does some comparing/contrasting
of "kgdboc_earlycon" vs. the existing "ekgdboc".  See that patch for
more details, but briefly "ekgdboc" can be used _instead_ of "kgdboc"
and just makes "kgdboc" do its normal initialization early (only works
if your tty driver is already ready).  The new "kgdboc_earlycon" works
in combination with "kgdboc" and is backed by boot consoles.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200507130644.v4.9.I7d5eb42c6180c831d47aef1af44d0b8be3fac559@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-06-02 15:15:45 +01:00
Douglas Anderson
b1350132fe kgdb: Don't call the deinit under spinlock
When I combined kgdboc_earlycon with an inflight patch titled ("soc:
qcom-geni-se: Add interconnect support to fix earlycon crash") [1]
things went boom.  Specifically I got a crash during the transition
between kgdboc_earlycon and the main kgdboc that looked like this:

Call trace:
 __schedule_bug+0x68/0x6c
 __schedule+0x75c/0x924
 schedule+0x8c/0xbc
 schedule_timeout+0x9c/0xfc
 do_wait_for_common+0xd0/0x160
 wait_for_completion_timeout+0x54/0x74
 rpmh_write_batch+0x1fc/0x23c
 qcom_icc_bcm_voter_commit+0x1b4/0x388
 qcom_icc_set+0x2c/0x3c
 apply_constraints+0x5c/0x98
 icc_set_bw+0x204/0x3bc
 icc_put+0x30/0xf8
 geni_remove_earlycon_icc_vote+0x6c/0x9c
 qcom_geni_serial_earlycon_exit+0x10/0x1c
 kgdboc_earlycon_deinit+0x38/0x58
 kgdb_register_io_module+0x11c/0x194
 configure_kgdboc+0x108/0x174
 kgdboc_probe+0x38/0x60
 platform_drv_probe+0x90/0xb0
 really_probe+0x130/0x2fc
 ...

The problem was that we were holding the "kgdb_registration_lock"
while calling into code that didn't expect to be called in spinlock
context.

Let's slightly defer when we call the deinit code so that it's not
done under spinlock.

NOTE: this does mean that the "deinit" call of the old kgdb IO module
is now made _after_ the init of the new IO module, but presumably
that's OK.

[1] https://lkml.kernel.org/r/1588919619-21355-3-git-send-email-akashast@codeaurora.org

Fixes: 220995622d ("kgdboc: Add kgdboc_earlycon to support early kgdb using boot consoles")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200526142001.1.I523dc33f96589cb9956f5679976d402c8cda36fa@changeid
[daniel.thompson@linaro.org: Resolved merge issues by hand]
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-06-02 15:15:45 +01:00
Douglas Anderson
1feb48baf2 kgdboc: Disable all the early code when kgdboc is a module
When kgdboc is compiled as a module all of the "ekgdboc" and
"kgdb_earlycon" code isn't useful and, in fact, breaks compilation.
This is because early_param() isn't defined for modules and that's how
this code gets configured.

It turns out that this was broken by commit eae3e19ca9 ("kgdboc:
Remove useless #ifdef CONFIG_KGDB_SERIAL_CONSOLE in kgdboc") and then
made worse by commit 220995622d ("kgdboc: Add kgdboc_earlycon to
support early kgdb using boot consoles").  I guess the #ifdef wasn't
so useless, even if it wasn't obvious why it was useful.  When kgdboc
was compiled as a module only "CONFIG_KGDB_SERIAL_CONSOLE_MODULE" was
defined, not "CONFIG_KGDB_SERIAL_CONSOLE".  That meant that the old
module.

Let's basically do the same thing that the old code (pre-removal of
the #ifdef) did but use "IS_BUILTIN(CONFIG_KGDB_SERIAL_CONSOLE)" to
make it more obvious what the point of the check is.  We'll fix
kgdboc_earlycon in a similar way.

Fixes: 220995622d ("kgdboc: Add kgdboc_earlycon to support early kgdb using boot consoles")
Fixes: eae3e19ca9 ("kgdboc: Remove useless #ifdef CONFIG_KGDB_SERIAL_CONSOLE in kgdboc")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200519084345.1.I91670accc8a5ddabab227eb63bb4ad3e2e9d2b58@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2020-06-02 15:15:45 +01:00
Tiezhu Yang
3e9b26dc22 perf tools: Remove some duplicated includes
There exists some duplicated includes in tools/perf, remove them.

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: xuefeng li <lixuefeng@loongson.cn>
Link: http://lore.kernel.org/lkml/1591071304-19338-2-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-02 11:09:41 -03:00
Adrian Hunter
0affd0e526 perf symbols: Fix kernel maps for kcore and eBPF
Adjust 'map->pgoff' also when moving a map's start address.

Example with v5.4.34 based kernel:

  Before:

    $ sudo tools/perf/perf record -a --kcore -e intel_pt//k sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 1.958 MB perf.data ]
    $ sudo tools/perf/perf script --itrace=e >/dev/null
    Warning:
    961 instruction trace errors

  After:

    $ sudo tools/perf/perf script --itrace=e >/dev/null
    $

Committer testing:

  # uname -a
  Linux seventh 5.6.10-100.fc30.x86_64 #1 SMP Mon May 4 15:36:44 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  #

Before:

  # perf record -a --kcore -e intel_pt//k sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.923 MB perf.data ]
  # perf script --itrace=e >/dev/null
  Warning:
  295 instruction trace errors
  #

After:

  # perf record -a --kcore -e intel_pt//k sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.919 MB perf.data ]
  # perf script --itrace=e >/dev/null
  #

Fixes: fb5a88d413 ("perf tools: Preserve eBPF maps when loading kcore")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20200602112505.1406-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-02 11:05:37 -03:00
Arnaldo Carvalho de Melo
3b1f47d6e7 tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  5cde265384 ("perf/x86/rapl: Add AMD Fam17h RAPL support")

Addressing this tools/perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

With this one will be able to use these new AMD MSRs in filters, by
name, e.g.:

   # perf trace -e msr:* --filter="msr==AMD_PKG_ENERGY_STATUS || msr==AMD_RAPL_POWER_UNIT"

Just like it is now possible with other MSRs:

  [root@five ~]# uname -a
  Linux five 5.5.17-200.fc31.x86_64 #1 SMP Mon Apr 13 15:29:42 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  [root@five ~]# grep 'model name' -m1 /proc/cpuinfo
  model name	: AMD Ryzen 5 3600X 6-Core Processor
  [root@five ~]#
  [root@five ~]# perf trace -e msr:*/max-stack=16/ --filter="msr==AMD_PERF_CTL" --max-events=2
       0.000 kworker/1:1-ev/2327824 msr:write_msr(msr: AMD_PERF_CTL, val: 2)
                                         do_trace_write_msr ([kernel.kallsyms])
                                         do_trace_write_msr ([kernel.kallsyms])
                                         [0xffffffffc01d71c3] ([acpi_cpufreq])
                                         [0] ([unknown])
                                         __cpufreq_driver_target ([kernel.kallsyms])
                                         od_dbs_update ([kernel.kallsyms])
                                         dbs_work_handler ([kernel.kallsyms])
                                         process_one_work ([kernel.kallsyms])
                                         worker_thread ([kernel.kallsyms])
                                         kthread ([kernel.kallsyms])
                                         ret_from_fork ([kernel.kallsyms])
       8.597 kworker/2:2-ev/2338099 msr:write_msr(msr: AMD_PERF_CTL, val: 2)
                                         do_trace_write_msr ([kernel.kallsyms])
                                         do_trace_write_msr ([kernel.kallsyms])
                                         [0] ([unknown])
                                         [0] ([unknown])
                                         __cpufreq_driver_target ([kernel.kallsyms])
                                         od_dbs_update ([kernel.kallsyms])
                                         dbs_work_handler ([kernel.kallsyms])
                                         process_one_work ([kernel.kallsyms])
                                         worker_thread ([kernel.kallsyms])
                                         kthread ([kernel.kallsyms])
                                         ret_from_fork ([kernel.kallsyms])
  [root@five ~]#

Longer explanation with what happens in the perf build process,
automatically after this is made in synch with the kernel sources:

  $ make -C tools/perf O=/tmp/build/perf install-bin
  <SNIP>
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
  <SNIP>
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $
  $ diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
  --- tools/arch/x86/include/asm/msr-index.h	2020-06-02 10:46:36.217782288 -0300
  +++ arch/x86/include/asm/msr-index.h	2020-05-28 10:41:23.313794627 -0300
  @@ -301,6 +301,9 @@
   #define MSR_PP1_ENERGY_STATUS		0x00000641
   #define MSR_PP1_POLICY			0x00000642

  +#define MSR_AMD_PKG_ENERGY_STATUS	0xc001029b
  +#define MSR_AMD_RAPL_POWER_UNIT		0xc0010299
  +
   /* Config TDP MSRs */
   #define MSR_CONFIG_TDP_NOMINAL		0x00000648
   #define MSR_CONFIG_TDP_LEVEL_1		0x00000649
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $
  $ make -C tools/perf O=/tmp/build/perf install-bin
  <SNIP>
    CC       /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
    LD       /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
    LD       /tmp/build/perf/trace/beauty/perf-in.o
    LD       /tmp/build/perf/perf-in.o
    LINK     /tmp/build/perf/perf
  <SNIP>
  make: Leaving directory '/home/acme/git/perf/tools/perf'
  $
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2020-06-02 10:47:08.486334348 -0300
  +++ after	2020-06-02 10:47:33.075008948 -0300
  @@ -286,6 +286,8 @@
   	[0xc0010240 - x86_AMD_V_KVM_MSRs_offset] = "F15H_NB_PERF_CTL",
   	[0xc0010241 - x86_AMD_V_KVM_MSRs_offset] = "F15H_NB_PERF_CTR",
   	[0xc0010280 - x86_AMD_V_KVM_MSRs_offset] = "F15H_PTSC",
  +	[0xc0010299 - x86_AMD_V_KVM_MSRs_offset] = "AMD_RAPL_POWER_UNIT",
  +	[0xc001029b - x86_AMD_V_KVM_MSRs_offset] = "AMD_PKG_ENERGY_STATUS",
   	[0xc00102f0 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN_CTL",
   	[0xc00102f1 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN",
   };
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-06-02 10:57:59 -03:00
Thierry Reding
d741cb045c MAINTAINERS: Add Lee Jones as reviewer for the PWM subsystem
Lee has kindly offered his help in sharing the patch review workload for
the PWM subsystem. If this works out well between Lee, Uwe and myself it
may be a good idea to maintain the subsystem as a group.

Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2020-06-02 15:50:53 +02:00
Uwe Kleine-König
aef1a3799b pwm: imx27: Fix rounding behavior
To not trigger the warnings provided by CONFIG_PWM_DEBUG

 - use up-rounding in .get_state()
 - don't divide by the result of a division
 - don't use the rounded counter value for the period length to calculate
   the counter value for the duty cycle

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2020-06-02 15:50:52 +02:00
Rasmus Villemoes
cad0f29606 pwm: rockchip: Simplify rockchip_pwm_get_state()
The way state->enabled is computed is rather convoluted and hard to
read - both branches of the if() actually do the exact same thing. So
remove the if(), and further simplify "<boolean condition> ? true :
false" to "<boolean condition>".

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2020-06-02 15:50:52 +02:00
Navid Emamdoost
ca162ce981 pwm: img: Call pm_runtime_put() in pm_runtime_get_sync() failed case
Even in failed case of pm_runtime_get_sync(), the usage_count is
incremented. In order to keep the usage_count with correct value call
appropriate pm_runtime_put().

Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2020-06-02 15:50:51 +02:00