Commit Graph

948892 Commits

Author SHA1 Message Date
Guo Ren
18c07d23da csky: Fixup calltrace panic
The implementation of show_stack will panic with wrong fp:

addr    = *fp++;

because the fp isn't checked properly.

The current implementations of show_stack, wchan and stack_trace
haven't been designed properly, so just deprecate them.

This patch is a reference to riscv's way, all codes are modified from
arm's. The patch is passed with:

 - cat /proc/<pid>/stack
 - cat /proc/<pid>/wchan
 - echo c > /proc/sysrq-trigger

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-05-13 17:55:06 +08:00
Mao Han
229a0ddee1 csky: Fixup perf callchain unwind
[ 5221.974084] Unable to handle kernel paging request at virtual address 0xfffff000, pc: 0x8002c18e
 [ 5221.985929] Oops: 00000000
 [ 5221.989488]
 [ 5221.989488] CURRENT PROCESS:
 [ 5221.989488]
 [ 5221.992877] COMM=callchain_test PID=11962
 [ 5221.995213] TEXT=00008000-000087e0 DATA=00009f1c-0000a018 BSS=0000a018-0000b000
 [ 5221.999037] USER-STACK=7fc18e20  KERNEL-STACK=be204680
 [ 5221.999037]
 [ 5222.003292] PC: 0x8002c18e (perf_callchain_kernel+0x3e/0xd4)
 [ 5222.007957] LR: 0x8002c198 (perf_callchain_kernel+0x48/0xd4)
 [ 5222.074873] Call Trace:
 [ 5222.074873] [<800a248e>] get_perf_callchain+0x20a/0x29c
 [ 5222.074873] [<8009d964>] perf_callchain+0x64/0x80
 [ 5222.074873] [<8009dc1c>] perf_prepare_sample+0x29c/0x4b8
 [ 5222.074873] [<8009de6e>] perf_event_output_forward+0x36/0x98
 [ 5222.074873] [<800497e0>] search_exception_tables+0x20/0x44
 [ 5222.074873] [<8002cbb6>] do_page_fault+0x92/0x378
 [ 5222.074873] [<80098608>] __perf_event_overflow+0x54/0xdc
 [ 5222.074873] [<80098778>] perf_swevent_hrtimer+0xe8/0x164
 [ 5222.074873] [<8002ddd0>] update_mmu_cache+0x0/0xd8
 [ 5222.074873] [<8002c014>] user_backtrace+0x58/0xc4
 [ 5222.074873] [<8002c0b4>] perf_callchain_user+0x34/0xd0
 [ 5222.074873] [<800a2442>] get_perf_callchain+0x1be/0x29c
 [ 5222.074873] [<8009d964>] perf_callchain+0x64/0x80
 [ 5222.074873] [<8009d834>] perf_output_sample+0x78c/0x858
 [ 5222.074873] [<8009dc1c>] perf_prepare_sample+0x29c/0x4b8
 [ 5222.074873] [<8009de94>] perf_event_output_forward+0x5c/0x98
 [ 5222.097846]
 [ 5222.097846] [<800a0300>] perf_event_exit_task+0x58/0x43c
 [ 5222.097846] [<8006c874>] hrtimer_interrupt+0x104/0x2ec
 [ 5222.097846] [<800a0300>] perf_event_exit_task+0x58/0x43c
 [ 5222.097846] [<80437bb6>] dw_apb_clockevent_irq+0x2a/0x4c
 [ 5222.097846] [<8006c770>] hrtimer_interrupt+0x0/0x2ec
 [ 5222.097846] [<8005f2e4>] __handle_irq_event_percpu+0xac/0x19c
 [ 5222.097846] [<80437bb6>] dw_apb_clockevent_irq+0x2a/0x4c
 [ 5222.097846] [<8005f408>] handle_irq_event_percpu+0x34/0x88
 [ 5222.097846] [<8005f480>] handle_irq_event+0x24/0x64
 [ 5222.097846] [<8006218c>] handle_level_irq+0x68/0xdc
 [ 5222.097846] [<8005ec76>] __handle_domain_irq+0x56/0xa8
 [ 5222.097846] [<80450e90>] ck_irq_handler+0xac/0xe4
 [ 5222.097846] [<80029012>] csky_do_IRQ+0x12/0x24
 [ 5222.097846] [<8002a3a0>] csky_irq+0x70/0x80
 [ 5222.097846] [<800ca612>] alloc_set_pte+0xd2/0x238
 [ 5222.097846] [<8002ddd0>] update_mmu_cache+0x0/0xd8
 [ 5222.097846] [<800a0340>] perf_event_exit_task+0x98/0x43c

The original fp check doesn't base on the real kernal stack region.
Invalid fp address may cause kernel panic.

Signed-off-by: Mao Han <han_mao@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-05-13 17:55:05 +08:00
Liu Yibin
165f2d2858 csky: Fixup msa highest 3 bits mask
Just as comment mentioned, the msa format:

 cr<30/31, 15> MSA register format:
 31 - 29 | 28 - 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
   BA     Reserved  SH  WA  B   SO SEC  C   D   V

So we should shift 29 bits not 28 bits for mask

Signed-off-by: Liu Yibin <jiulong@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-05-13 17:55:05 +08:00
Guo Ren
c2e59d1f4d csky: Fixup perf probe -x hungup
case:
 # perf probe -x /lib/libc-2.28.9000.so memcpy
 # perf record -e probe_libc:memcpy -aR sleep 1

System hangup and cpu get in trap_c loop, because our hardware
singlestep state could still get interrupt signal. When we get in
uprobe_xol singlestep slot, we should disable irq in pt_regs->psr.

And is_swbp_insn() need a csky arch implementation with a low 16bit
mask.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-05-13 17:55:05 +08:00
Guo Ren
bd11aabd35 csky: Fixup compile error for abiv1 entry.S
This bug is from uprobe signal definition in thread_info.h. The
instruction (andi) of abiv1 immediate is smaller than abiv2, then
it will cause:

  AS      arch/csky/kernel/entry.o
 arch/csky/kernel/entry.S: Assembler messages:
 arch/csky/kernel/entry.S:224: Error: Operand 2 immediate is overflow.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-05-13 17:55:05 +08:00
Guo Ren
a13d5887ff csky/ftrace: Fixup error when disable CONFIG_DYNAMIC_FTRACE
When CONFIG_DYNAMIC_FTRACE is enabled, static ftrace will fail to
boot up and compile. It's a carelessness when developing "dynamic
ftrace" and "ftrace with regs".

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
2020-05-13 17:55:05 +08:00
Nicolas Saenz Julienne
c65822fef4 USB: pci-quirks: Add Raspberry Pi 4 quirk
On the Raspberry Pi 4, after a PCI reset, VL805's firmware may either be
loaded directly from an EEPROM or, if not present, by the SoC's
VideoCore. Inform VideoCore that VL805 was just reset.

Also, as this creates a dependency between USB_PCI and VideoCore's
firmware interface, and since USB_PCI can't be set as a module neither
this can. Reflect that on the firmware interface Kconfg.

Link: https://lore.kernel.org/r/20200505161318.26200-5-nsaenzjulienne@suse.de
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
2020-05-13 10:53:23 +01:00
Nicolas Saenz Julienne
44331189f9 PCI: brcmstb: Wait for Raspberry Pi's firmware when present
xHCI's PCI fixup, run at the end of pcie-brcmstb's probe, depends on
RPi4's VideoCore firmware interface to be up and running. It's possible
for both initializations to race, so make sure it's available prior to
starting.

Link: https://lore.kernel.org/r/20200505161318.26200-4-nsaenzjulienne@suse.de
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
2020-05-13 10:53:23 +01:00
Nicolas Saenz Julienne
fbbc5ff3f7 firmware: raspberrypi: Introduce vl805 init routine
The Raspberry Pi 4 gets its USB functionality from VL805, a PCIe chip
that implements xHCI. After a PCI reset, VL805's firmware may either be
loaded directly from an EEPROM or, if not present, by the SoC's
co-processor, VideoCore. RPi4's VideoCore OS contains both the non public
firmware load logic and the VL805 firmware blob. The function this patch
introduces triggers the aforementioned process.

Link: https://lore.kernel.org/r/20200505161318.26200-3-nsaenzjulienne@suse.de
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Rob Herring <robh@kernel.org>
2020-05-13 10:53:23 +01:00
Nicolas Saenz Julienne
ca91ddef2e soc: bcm2835: Add notify xHCI reset property
The property is needed in order to trigger VL805's firmware load. Note
that gap between the property introduced and the previous one is due to
the properties not being defined.

Link: https://lore.kernel.org/r/20200505161318.26200-2-nsaenzjulienne@suse.de
Signed-off-by: Nicolas Saenz Julienne <nsaenzjulienne@suse.de>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
2020-05-13 10:53:23 +01:00
Christian Brauner
2b40c5db73 selftests/pidfd: add pidfd setns tests
This is basically a test-suite for setns() and as of now contains:
- test that we can't pass garbage flags
- test that we can't attach to the namespaces of  task that has already exited
- test that we can incrementally setns into all namespaces of a target task
  using a pidfd
- test that we can setns atomically into all namespaces of a target task
- test that we can't cross setns into a user namespace outside of our user
  namespace hierarchy
- test that we can't setns into namespaces owned by user namespaces over which
  we are not privileged

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Link: https://lore.kernel.org/r/20200505140432.181565-4-christian.brauner@ubuntu.com
2020-05-13 11:41:22 +02:00
Christian Brauner
303cc571d1 nsproxy: attach to namespaces via pidfds
For quite a while we have been thinking about using pidfds to attach to
namespaces. This patchset has existed for about a year already but we've
wanted to wait to see how the general api would be received and adopted.
Now that more and more programs in userspace have started using pidfds
for process management it's time to send this one out.

This patch makes it possible to use pidfds to attach to the namespaces
of another process, i.e. they can be passed as the first argument to the
setns() syscall. When only a single namespace type is specified the
semantics are equivalent to passing an nsfd. That means
setns(nsfd, CLONE_NEWNET) equals setns(pidfd, CLONE_NEWNET). However,
when a pidfd is passed, multiple namespace flags can be specified in the
second setns() argument and setns() will attach the caller to all the
specified namespaces all at once or to none of them. Specifying 0 is not
valid together with a pidfd.

Here are just two obvious examples:
setns(pidfd, CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET);
setns(pidfd, CLONE_NEWUSER);
Allowing to also attach subsets of namespaces supports various use-cases
where callers setns to a subset of namespaces to retain privilege, perform
an action and then re-attach another subset of namespaces.

If the need arises, as Eric suggested, we can extend this patchset to
assume even more context than just attaching all namespaces. His suggestion
specifically was about assuming the process' root directory when
setns(pidfd, 0) or setns(pidfd, SETNS_PIDFD) is specified. For now, just
keep it flexible in terms of supporting subsets of namespaces but let's
wait until we have users asking for even more context to be assumed. At
that point we can add an extension.

The obvious example where this is useful is a standard container
manager interacting with a running container: pushing and pulling files
or directories, injecting mounts, attaching/execing any kind of process,
managing network devices all these operations require attaching to all
or at least multiple namespaces at the same time. Given that nowadays
most containers are spawned with all namespaces enabled we're currently
looking at at least 14 syscalls, 7 to open the /proc/<pid>/ns/<ns>
nsfds, another 7 to actually perform the namespace switch. With time
namespaces we're looking at about 16 syscalls.
(We could amortize the first 7 or 8 syscalls for opening the nsfds by
 stashing them in each container's monitor process but that would mean
 we need to send around those file descriptors through unix sockets
 everytime we want to interact with the container or keep on-disk
 state. Even in scenarios where a caller wants to join a particular
 namespace in a particular order callers still profit from batching
 other namespaces. That mostly applies to the user namespace but
 all container runtimes I found join the user namespace first no matter
 if it privileges or deprivileges the container similar to how unshare
 behaves.)
With pidfds this becomes a single syscall no matter how many namespaces
are supposed to be attached to.

A decently designed, large-scale container manager usually isn't the
parent of any of the containers it spawns so the containers don't die
when it crashes or needs to update or reinitialize. This means that
for the manager to interact with containers through pids is inherently
racy especially on systems where the maximum pid number is not
significicantly bumped. This is even more problematic since we often spawn
and manage thousands or ten-thousands of containers. Interacting with a
container through a pid thus can become risky quite quickly. Especially
since we allow for an administrator to enable advanced features such as
syscall interception where we're performing syscalls in lieu of the
container. In all of those cases we use pidfds if they are available and
we pass them around as stable references. Using them to setns() to the
target process' namespaces is as reliable as using nsfds. Either the
target process is already dead and we get ESRCH or we manage to attach
to its namespaces but we can't accidently attach to another process'
namespaces. So pidfds lend themselves to be used with this api.
The other main advantage is that with this change the pidfd becomes the
only relevant token for most container interactions and it's the only
token we need to create and send around.

Apart from significiantly reducing the number of syscalls from double
digit to single digit which is a decent reason post-spectre/meltdown
this also allows to switch to a set of namespaces atomically, i.e.
either attaching to all the specified namespaces succeeds or we fail. If
we fail we haven't changed a single namespace. There are currently three
namespaces that can fail (other than for ENOMEM which really is not
very interesting since we then have other problems anyway) for
non-trivial reasons, user, mount, and pid namespaces. We can fail to
attach to a pid namespace if it is not our current active pid namespace
or a descendant of it. We can fail to attach to a user namespace because
we are multi-threaded or because our current mount namespace shares
filesystem state with other tasks, or because we're trying to setns()
to the same user namespace, i.e. the target task has the same user
namespace as we do. We can fail to attach to a mount namespace because
it shares filesystem state with other tasks or because we fail to lookup
the new root for the new mount namespace. In most non-pathological
scenarios these issues can be somewhat mitigated. But there are cases where
we're half-attached to some namespace and failing to attach to another one.
I've talked about some of these problem during the hallway track (something
only the pre-COVID-19 generation will remember) of Plumbers in Los Angeles
in 2018(?). Even if all these issues could be avoided with super careful
userspace coding it would be nicer to have this done in-kernel. Pidfds seem
to lend themselves nicely for this.

The other neat thing about this is that setns() becomes an actual
counterpart to the namespace bits of unshare().

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Jann Horn <jannh@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Link: https://lore.kernel.org/r/20200505140432.181565-3-christian.brauner@ubuntu.com
2020-05-13 11:41:22 +02:00
Raul E Rangel
ea90228c7b iommu/amd: Fix get_acpihid_device_id()
acpi_dev_hid_uid_match() expects a null pointer for UID if it doesn't
exist. The acpihid_map_entry contains a char buffer for holding the
UID. If no UID was provided in the IVRS table, this buffer will be
zeroed. If we pass in a null string, acpi_dev_hid_uid_match() will
return false because it will try and match an empty string to the ACPI
UID of the device.

Fixes: ae5e6c6439 ("iommu/amd: Switch to use acpi_dev_hid_uid_match()")
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Raul E Rangel <rrangel@chromium.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20200511103229.v2.1.I6f1b6f973ee6c8af1348611370c73a0ec0ea53f1@changeid
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-13 11:14:41 +02:00
Alexander Monakov
e461b8c991 iommu/amd: Fix over-read of ACPI UID from IVRS table
IVRS parsing code always tries to read 255 bytes from memory when
retrieving ACPI device path, and makes an assumption that firmware
provides a zero-terminated string. Both of those are bugs: the entry
is likely to be shorter than 255 bytes, and zero-termination is not
guaranteed.

With Acer SF314-42 firmware these issues manifest visibly in dmesg:

AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR0\xf0\xa5, rdevid:160
AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR1\xf0\xa5, rdevid:160
AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR2\xf0\xa5, rdevid:160
AMD-Vi: ivrs, add hid:AMDI0020, uid:\_SB.FUR3>\x83e\x8d\x9a\xd1...

The first three lines show how the code over-reads adjacent table
entries into the UID, and in the last line it even reads garbage data
beyond the end of the IVRS table itself.

Since each entry has the length of the UID (uidl member of ivhd_entry
struct), use that for memcpy, and manually add a zero terminator.

Avoid zero-filling hid and uid arrays up front, and instead ensure
the uid array is always zero-terminated. No change needed for the hid
array, as it was already properly zero-terminated.

Fixes: 2a0cb4e2d4 ("iommu/amd: Add new map for storing IVHD dev entry type HID")

Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Link: https://lore.kernel.org/r/20200511102352.1831-1-amonakov@ispras.ru
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-13 11:13:26 +02:00
Lubos Dolezel
144da23bea ovl: return required buffer size for file handles
Overlayfs doesn't work well with the fanotify mechanism.

Fanotify first probes for the required buffer size for the file handle,
but overlayfs currently bails out without passing the size back.

That results in errors in the kernel log, such as:

[527944.485384] overlayfs: failed to encode file handle (/, err=-75, buflen=0, len=29, type=1)
[527944.485386] fanotify: failed to encode fid (fsid=ae521e68.a434d95f, type=255, bytes=0, err=-2)

Signed-off-by: Lubos Dolezel <lubos@dolezel.info>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-05-13 11:11:24 +02:00
Chengguang Xu
399c109d35 ovl: sync dirty data when remounting to ro mode
sync_filesystem() does not sync dirty data for readonly filesystem during
umount, so before changing to readonly filesystem we should sync dirty data
for data integrity.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-05-13 11:11:24 +02:00
Chengguang Xu
c21c839b84 ovl: whiteout inode sharing
Share inode with different whiteout files for saving inode and speeding up
delete operation.

If EMLINK is encountered when linking a shared whiteout, create a new one.
In case of any other error, disable sharing for this super block.

Note: ofs->whiteout is protected by inode lock on workdir.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-05-13 11:11:24 +02:00
Jeffle Xu
654255fa20 ovl: inherit SB_NOSEC flag from upperdir
Since the stacking of regular file operations [1], the overlayfs edition of
write_iter() is called when writing regular files.

Since then, xattr lookup is needed on every write since file_remove_privs()
is called from ovl_write_iter(), which would become the performance
bottleneck when writing small chunks of data. In my test case,
file_remove_privs() would consume ~15% CPU when running fstime of unixbench
(the workload is repeadly writing 1 KB to the same file) [2].

Inherit the SB_NOSEC flag from upperdir. Since then xattr lookup would be
done only once on the first write. Unixbench fstime gets a ~20% performance
gain with this patch.

[1] https://lore.kernel.org/lkml/20180606150905.GC9426@magnolia/T/
[2] https://www.spinics.net/lists/linux-unionfs/msg07153.html

Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-05-13 11:11:24 +02:00
Konstantin Khlebnikov
32b1924b21 ovl: skip overlayfs superblocks at global sync
Stacked filesystems like overlayfs has no own writeback, but they have to
forward syncfs() requests to backend for keeping data integrity.

During global sync() each overlayfs instance calls method ->sync_fs() for
backend although it itself is in global list of superblocks too.  As a
result one syscall sync() could write one superblock several times and send
multiple disk barriers.

This patch adds flag SB_I_SKIP_SYNC into sb->sb_iflags to avoid that.

Reported-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-05-13 11:11:24 +02:00
Amir Goldstein
62a8a85be8 ovl: index dir act as work dir
With index=on, let index dir act as the work dir for copy up and cleanups.
This will help implementing whiteout inode sharing.

We still create the "work" dir on mount regardless of index=on and it is
used to test the features supported by upper fs.  One reason is that before
the feature tests, we do not know if index could be enabled or not.

The reason we do not use "index" directory also as workdir with index=off
is because the existence of the "index" directory acts as a simple
persistent signal that index was enabled on this filesystem and tools may
want to use that signal.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-05-13 11:11:24 +02:00
Amir Goldstein
773cb4c56b ovl: prepare to copy up without workdir
With index=on, we copy up lower hardlinks to work dir and move them into
index dir. Fix locking to allow work dir and index dir to be the same
directory.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-05-13 11:11:24 +02:00
Amir Goldstein
3011645b5b ovl: cleanup non-empty directories in ovl_indexdir_cleanup()
Teach ovl_indexdir_cleanup() to remove temp directories containing
whiteouts to prepare for using index dir instead of work dir for removing
merge directories.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-05-13 11:11:24 +02:00
Amir Goldstein
b0def88d80 ovl: resolve more conflicting mount options
Similar to the way that a conflict between metacopy=on,redirect_dir=off is
resolved, also resolve conflicts between nfs_export=on,index=off and
nfs_export=on,metacopy=on.

An explicit mount option wins over a default config value.  Both explicit
mount options result in an error.

Without this change the xfstests group overlay/exportfs are skipped if
metacopy is enabled by default.

Reported-by: Chengguang Xu <cgxu519@mykernel.net>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-05-13 11:11:24 +02:00
Dan Carpenter
9aafc1b018 ovl: potential crash in ovl_fid_to_fh()
The "buflen" value comes from the user and there is a potential that it
could be zero.  In do_handle_to_path() we know that "handle->handle_bytes"
is non-zero and we do:

	handle_dwords = handle->handle_bytes >> 2;

So values 1-3 become zero.  Then in ovl_fh_to_dentry() we do:

	int len = fh_len << 2;

So now len is in the "0,4-128" range and a multiple of 4.  But if
"buflen" is zero it will try to copy negative bytes when we do the
memcpy in ovl_fid_to_fh().

	memcpy(&fh->fb, fid, buflen - OVL_FH_WIRE_OFFSET);

And that will lead to a crash.  Thanks to Amir Goldstein for his help
with this patch.

Fixes: cbe7fba8ed ("ovl: make sure that real fid is 32bit aligned in memory")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Cc: <stable@vger.kernel.org> # v5.5
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-05-13 11:10:57 +02:00
Fabio Estevam
e98ad55989 arm64: dts: imx8qxp-mek: Do not use underscore in node name
Underscores are not recommended to be used in node names, so change
the pinctrl IO expander node name.

This change also makes the pinctrl node names to follow the convention
used by other pinctrl group names.

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Reviewed-by: Abel Vesa <abel.vesa@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2020-05-13 16:56:45 +08:00
Andy Shevchenko
3a0ce12e3b iommu/iova: Unify format of the printed messages
Unify format of the printed messages, i.e. replace printk(LEVEL ... )
with pr_level(...).

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20200507161804.13275-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-13 10:54:51 +02:00
Arnd Bergmann
2ba20b5a5b iommu/renesas: Fix unused-function warning
gcc warns because the only reference to ipmmu_find_group
is inside of an #ifdef:

drivers/iommu/ipmmu-vmsa.c:878:28: error: 'ipmmu_find_group' defined but not used [-Werror=unused-function]

Change the #ifdef to an equivalent IS_ENABLED().

Fixes: 6580c8a784 ("iommu/renesas: Convert to probe/release_device() call-backs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Simon Horman <horms@verge.net.au>
Link: https://lore.kernel.org/r/20200508220224.688985-1-arnd@arndb.de
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-13 10:46:56 +02:00
Thierry Reding
f38338cf06 iommu: Do not probe devices on IOMMU-less busses
The host1x bus implemented on Tegra SoCs is primarily an abstraction to
create logical device from multiple platform devices. Since the devices
in such a setup are typically hierarchical, DMA setup still needs to be
done so that DMA masks can be properly inherited, but we don't actually
want to attach the host1x logical devices to any IOMMU. The platform
devices that make up the logical device are responsible for memory bus
transactions, so it is them that will need to be attached to the IOMMU.

Add a check to __iommu_probe_device() that aborts IOMMU setup early for
busses that don't have the IOMMU operations pointer set since they will
cause a crash otherwise.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20200511161000.3853342-1-thierry.reding@gmail.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-13 10:42:14 +02:00
Qian Cai
cfcccbe887 iommu/amd: Fix variable "iommu" set but not used
The commit dce8d6964e ("iommu/amd: Convert to probe/release_device()
call-backs") introduced an unused variable,

drivers/iommu/amd_iommu.c: In function 'amd_iommu_uninit_device':
drivers/iommu/amd_iommu.c:422:20: warning: variable 'iommu' set but not
used [-Wunused-but-set-variable]
  struct amd_iommu *iommu;
                    ^~~~~

Signed-off-by: Qian Cai <cai@lca.pw>
Link: https://lore.kernel.org/r/20200509015645.3236-1-cai@lca.pw
Fixes: dce8d6964e ("iommu/amd: Convert to probe/release_device() call-backs")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-05-13 10:39:43 +02:00
Yu-cheng Yu
524bb73bc1 x86/fpu/xstate: Separate user and supervisor xfeatures mask
Before the introduction of XSAVES supervisor states, 'xfeatures_mask' is
used at various places to determine XSAVE buffer components and XCR0 bits.
It contains only user xstates.  To support supervisor xstates, it is
necessary to separate user and supervisor xstates:

- First, change 'xfeatures_mask' to 'xfeatures_mask_all', which represents
  the full set of bits that should ever be set in a kernel XSAVE buffer.
- Introduce xfeatures_mask_supervisor() and xfeatures_mask_user() to
  extract relevant xfeatures from xfeatures_mask_all.

Co-developed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200512145444.15483-4-yu-cheng.yu@intel.com
2020-05-13 10:31:07 +02:00
Eelco Chaudron
fd9eef1a13 libbpf: Fix probe code to return EPERM if encountered
When the probe code was failing for any reason ENOTSUP was returned, even
if this was due to not having enough lock space. This patch fixes this by
returning EPERM to the user application, so it can respond and increase
the RLIMIT_MEMLOCK size.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/158927424896.2342.10402475603585742943.stgit@ebuild
2020-05-13 10:29:54 +02:00
Yauheni Kaliuta
309b81f0fd selftests/bpf: Install generated test progs
Before commit 74b5a5968f ("selftests/bpf: Replace test_progs and
test_maps w/ general rule") selftests/bpf used generic install
target from selftests/lib.mk to install generated bpf test progs
by mentioning them in TEST_GEN_FILES variable.

Take that functionality back.

Fixes: 74b5a5968f ("selftests/bpf: Replace test_progs and test_maps w/ general rule")
Signed-off-by: Yauheni Kaliuta <yauheni.kaliuta@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200513021722.7787-1-yauheni.kaliuta@redhat.com
2020-05-13 10:25:41 +02:00
Dong Aisheng
88d93afd77 dt-bindings: firmware: imx: Add more system controls and PM clock types
Add more system controls and PM clock types for usage.

Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2020-05-13 16:19:39 +08:00
Dong Aisheng
755a739794 dt-bindings: firmware: imx: Move system control into dt-binding headfile
i.MX8 SoCs DTS file needs system control macro definitions, so move them
into dt-binding headfile, then include/linux/firmware/imx/types.h can be
removed and those drivers using it should be changed accordingly.

Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Jacky Bai <ping.bai@nxp.com>
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2020-05-13 16:19:23 +08:00
Tim Harvey
957743b79b ARM: dts: imx6qdl-gw552x: add USB OTG support
The GW552x-B board revision adds USB OTG support.

Enable the device-tree node and configure the OTG_ID pin.

Signed-off-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2020-05-13 16:09:19 +08:00
Archie Pusaka
5b440676c1 Bluetooth: L2CAP: add support for waiting disconnection resp
Whenever we disconnect a L2CAP connection, we would immediately
report a disconnection event (EPOLLHUP) to the upper layer, without
waiting for the response of the other device.

This patch offers an option to wait until we receive a disconnection
response before reporting disconnection event, by using the "how"
parameter in l2cap_sock_shutdown(). Therefore, upper layer can opt
to wait for disconnection response by shutdown(sock, SHUT_WR).

This can be used to enforce proper disconnection order in HID,
where the disconnection of the interrupt channel must be complete
before attempting to disconnect the control channel.

Signed-off-by: Archie Pusaka <apusaka@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-05-13 10:03:51 +02:00
Sonny Sasaka
adf1d69264 Bluetooth: Handle Inquiry Cancel error after Inquiry Complete
After sending Inquiry Cancel command to the controller, it is possible
that Inquiry Complete event comes before Inquiry Cancel command complete
event. In this case the Inquiry Cancel command will have status of
Command Disallowed since there is no Inquiry session to be cancelled.
This case should not be treated as error, otherwise we can reach an
inconsistent state.

Example of a btmon trace when this happened:

< HCI Command: Inquiry Cancel (0x01|0x0002) plen 0
> HCI Event: Inquiry Complete (0x01) plen 1
        Status: Success (0x00)
> HCI Event: Command Complete (0x0e) plen 4
      Inquiry Cancel (0x01|0x0002) ncmd 1
        Status: Command Disallowed (0x0c)

Signed-off-by: Sonny Sasaka <sonnysasaka@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-05-13 09:35:17 +02:00
Rikard Falkeborn
608c39f430 Bluetooth: serdev: Constify serdev_device_ops
serdev_device_ops is not modified and can be const. Also, remove the
unneeded declaration of it.

Output from the file command before and after:

Before:
   text    data     bss     dec     hex filename
   7192    2408     192    9792    2640 drivers/bluetooth/hci_serdev.o

After:
   text    data     bss     dec     hex filename
   7256    2344     192    9792    2640 drivers/bluetooth/hci_serdev.o

Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-05-13 09:27:36 +02:00
Raghuram Hegde
875e167590 Bluetooth: btusb: Add support for Intel Bluetooth Device Typhoon Peak (8087:0032)
Device from /sys/kernel/debug/usb/devices:

T:  Bus=01 Lev=01 Prnt=01 Port=13 Cnt=02 Dev#=  3 Spd=12   MxCh= 0
D:  Ver= 2.01 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=8087 ProdID=0032 Rev= 0.00
C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=81(I) Atr=03(Int.) MxPS=  64 Ivl=1ms
E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
I:  If#= 1 Alt= 6 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
E:  Ad=03(O) Atr=01(Isoc) MxPS=  63 Ivl=1ms
E:  Ad=83(I) Atr=01(Isoc) MxPS=  63 Ivl=1ms

Signed-off-by: Raghuram Hegde <raghuram.hegde@intel.com>
Signed-off-by: Chethan T N <chethan.tumkur.narayan@intel.com>
Signed-off-by: Amit K Bag <amit.k.bag@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-05-13 09:14:29 +02:00
Abhishek Pandit-Subedi
b7d0bf11a7 Bluetooth: btusb: Implement hdev->prevent_wake
Implement the prevent_wake hook by checking device_may_wakeup on the usb
interface. This prevents the Bluetooth core from enabling scanning when
the device isn't expected to wake from suspend.

Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Reviewed-by: Alain Michaud <alainm@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-05-13 09:12:04 +02:00
Abhishek Pandit-Subedi
81dafad53c Bluetooth: Add hook for driver to prevent wake from suspend
Let drivers have a hook to disable configuring scanning during suspend.
Drivers should use the device_may_wakeup function call to determine
whether hci should be configured for wakeup.

For example, an implementation for btusb may look like the following:

  bool btusb_prevent_wake(struct hci_dev *hdev)
  {
        struct btusb_data *data = hci_get_drvdata(hdev);
        return !device_may_wakeup(&data->udev->dev);
  }

Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Reviewed-by: Alain Michaud <alainm@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-05-13 09:12:04 +02:00
Abhishek Pandit-Subedi
0d2c9825e4 Bluetooth: Rename BT_SUSPEND_COMPLETE
Renamed BT_SUSPEND_COMPLETE to BT_SUSPEND_CONFIGURE_WAKE since it sets
up the event filter and whitelist for wake-up.

Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Reviewed-by: Alain Michaud <alainm@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-05-13 09:12:04 +02:00
Abhishek Pandit-Subedi
91779665c1 Bluetooth: Modify LE window and interval for suspend
When a device is suspended, it doesn't need to be as responsive to
connection events. Increase the interval to 640ms (creating a duty cycle
of roughly 1.75%) so that passive scanning uses much less power (vs
previous duty cycle of 18.75%). The new window + interval combination
has been tested to work with HID devices (which are currently the only
devices capable of wake up).

Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-05-13 08:50:56 +02:00
Abhishek Pandit-Subedi
aaebf8e608 Bluetooth: Fix incorrect type for window and interval
The types for window and interval should be uint16, not uint8.

Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-05-13 08:50:56 +02:00
Saravana Kannan
6c591eec67 OPP: Add helpers for reading the binding properties
The opp-hz DT property is not mandatory and we may use another property
as a key in the OPP table. Add helper functions to simplify the reading
and comparing the keys.

Signed-off-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Reviewed-by: Sibi Sankar <sibis@codeaurora.org>
[ Viresh: Removed an unnecessary comment ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-05-13 11:48:13 +05:30
Saravana Kannan
45a41875fa dt-bindings: opp: Introduce opp-peak-kBps and opp-avg-kBps bindings
Interconnects often quantify their performance points in terms of
bandwidth. So, add opp-peak-kBps (required) and opp-avg-kBps (optional) to
allow specifying Bandwidth OPP tables in DT.

opp-peak-kBps is a required property that replaces opp-hz for Bandwidth OPP
tables.

opp-avg-kBps is an optional property that can be used in Bandwidth OPP
tables.

Signed-off-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
Reviewed-by: Sibi Sankar <sibis@codeaurora.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-05-13 11:46:11 +05:30
Stephen Boyd
bc8c945e0a clk: Move HAVE_CLK config out of architecture layer
The implementation of 'struct clk' is not really an architectual detail
anymore now that most architectures have migrated to the common clk
framework. To sway new architecture ports away from trying to implement
their own 'struct clk', move the config next to the common clk framework
config.

Cc: Russell King <linux@armlinux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lkml.kernel.org/r/20200409064416.83340-11-sboyd@kernel.org
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
2020-05-12 20:28:03 -07:00
Stephen Boyd
c7725c9b74 MIPS: Loongson64: Drop asm/clock.h include
This include isn't used by this file, so just remove it.

Acked-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Paul Burton <paulburton@kernel.org>
Cc: <linux-mips@vger.kernel.org>
Cc: <chenhc@lemote.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lkml.kernel.org/r/20200409064416.83340-10-sboyd@kernel.org
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
2020-05-12 20:28:03 -07:00
Stephen Boyd
3819ad4402 ARM: mmp: Remove legacy clk code
Remove all the legacy clk code that supports a non-common clk framework
implementation of 'struct clk' in mach-mmp. This code doesn't look to be
compiled anymore given that the MMP is fully supported in the
multi-platform config via ARCH_MULTIPLATFORM as of commit 377524dc4d
("ARM: mmp: move into ARCH_MULTIPLATFORM"). The ARCH_MULTIPLATFORM
config selects COMMON_CLK and therefore the Makefile rule can never
actually compile the code in these files.

Cc: Lubomir Rintel <lkundrak@v3.sk>
Cc: Russell King <linux@armlinux.org.uk>
Cc: <linux-arm-kernel@lists.infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lkml.kernel.org/r/20200409064416.83340-9-sboyd@kernel.org
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
[sboyd@kernel.org: Squash in a clock.h include removal found by Stephen
Rothwell <sfr@canb.auug.org.au>]
2020-05-12 20:28:03 -07:00
Tero Kristo
852049594b clk: ti: clkctrl: convert subclocks to use proper names also
Addition of the new internal API to get the clkctrl names missed adding
the same conversion in place for the subclocks. This leads into missed
parent/child relationships (i.e. orphaned clocks) with mixed node name
handling, for example with omap4/omap5 where the l4_per clocks are using
new naming, but rest are using old. Fix by converting the subclock
registration to pick correct names for the clocks also.

Fixes: 6c30905205 ("clk: ti: clkctrl: Fix hidden dependency to node name")
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Link: https://lkml.kernel.org/r/20200430083451.8562-1-t-kristo@ti.com
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2020-05-12 20:18:19 -07:00