Commit Graph

9835 Commits

Author SHA1 Message Date
Mark Zhang
7c97f3aded RDMA/counter: Add PID category support in auto mode
With the "PID" category QPs have same PID will be bound to same counter;
If this category is not set then QPs have different PIDs will be bound
to same counter.

This is implemented for 2 reasons:
1. The counter is a limited resource, while there may be dozens of
   applications, each of which creates several types of QPs, which means
   it may doesn't have enough counter.
2. The system administrator needs all QPs created by all applications
   with same type bound to one counter.

The counter name and PID is only make sense when "PID" category are
configured.

This category can also be used in combine with others, e.g. QP type.

Link: https://lore.kernel.org/r/20200702082933.424537-2-leon@kernel.org
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-10 16:50:53 -03:00
Linus Torvalds
a581387e41 Merge tag 'io_uring-5.8-2020-07-10' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:

 - Fix memleak for error path in registered files (Yang)

 - Export CQ overflow state in flags, necessary to fix a case where
   liburing doesn't know if it needs to enter the kernel (Xiaoguang)

 - Fix for a regression in when user memory is accounted freed, causing
   issues with back-to-back ring exit + init if the ulimit -l setting is
   very tight.

* tag 'io_uring-5.8-2020-07-10' of git://git.kernel.dk/linux-block:
  io_uring: account user memory freed when exit has been queued
  io_uring: fix memleak in io_sqe_files_register()
  io_uring: fix memleak in __io_sqe_files_update()
  io_uring: export cq overflow status to userspace
2020-07-10 09:57:57 -07:00
Masahiro Yamada
04aaca197f char: raw: do not leak CONFIG_MAX_RAW_DEVS to userspace
include/uapi/linux/raw.h leaks CONFIG_MAX_RAW_DEVS to userspace.

Userspace programs cannot use MAX_RAW_MINORS since CONFIG_MAX_RAW_DEVS
is not available anyway.

Remove the MAX_RAW_MINORS definition from the exported header, and use
CONFIG_MAX_RAW_DEVS in drivers/char/raw.c

While I was here, I converted printk(KERN_WARNING ...) to pr_warn(...)
and stretched the warning message.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20200617083313.183184-1-masahiroy@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-10 14:50:51 +02:00
Hans de Goede
316b003540 virt: vbox: Add a few new vmmdev request types to the userspace whitelist
Upstream VirtualBox has defined and is using a few new request types for
vmmdev requests passed through /dev/vboxguest to the hypervisor.

Add the defines for these to vbox_vmmdev_types.h and add add them to the
whitelists of vmmdev requests which userspace is allowed to make.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1789545
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200709120858.63928-7-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-10 13:45:32 +02:00
Hans de Goede
631beddc54 virt: vbox: Add support for the new VBG_IOCTL_ACQUIRE_GUEST_CAPABILITIES ioctl
Add support for the new VBG_IOCTL_ACQUIRE_GUEST_CAPABILITIES ioctl, this
is necessary for automatic resizing of the guest resolution to match the
VM-window size to work with the new VMSVGA virtual GPU which is now the
new default in VirtualBox.

BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1789545
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200709120858.63928-6-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-10 13:45:32 +02:00
Hans de Goede
f794db6841 virt: vbox: Fix VBGL_IOCTL_VMMDEV_REQUEST_BIG and _LOG req numbers to match upstream
Until this commit the mainline kernel version (this version) of the
vboxguest module contained a bug where it defined
VBGL_IOCTL_VMMDEV_REQUEST_BIG and VBGL_IOCTL_LOG using
_IOC(_IOC_READ | _IOC_WRITE, 'V', ...) instead of
_IO(V, ...) as the out of tree VirtualBox upstream version does.

Since the VirtualBox userspace bits are always built against VirtualBox
upstream's headers, this means that so far the mainline kernel version
of the vboxguest module has been failing these 2 ioctls with -ENOTTY.
I guess that VBGL_IOCTL_VMMDEV_REQUEST_BIG is never used causing us to
not hit that one and sofar the vboxguest driver has failed to actually
log any log messages passed it through VBGL_IOCTL_LOG.

This commit changes the VBGL_IOCTL_VMMDEV_REQUEST_BIG and VBGL_IOCTL_LOG
defines to match the out of tree VirtualBox upstream vboxguest version,
while keeping compatibility with the old wrong request defines so as
to not break the kernel ABI in case someone has been using the old
request defines.

Fixes: f6ddd094f5 ("virt: Add vboxguest driver for Virtual Box Guest integration UAPI")
Cc: stable@vger.kernel.org
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20200709120858.63928-2-hdegoede@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-10 13:40:19 +02:00
Danielle Ratson
a0f49b5486 devlink: Add a new devlink port split ability attribute and pass to netlink
Add a new attribute that indicates the split ability of devlink port.

Drivers are expected to set it via devlink_port_attrs_set(), before
registering the port.

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 13:15:30 -07:00
Danielle Ratson
a21cf0a833 devlink: Add a new devlink port lanes attribute and pass to netlink
Add a new devlink port attribute that indicates the port's number of lanes.

Drivers are expected to set it via devlink_port_attrs_set(), before
registering the port.

The attribute is not passed to user space in case the number of lanes is
invalid (0).

Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 13:15:29 -07:00
Paolo Abeni
ac3b45f609 mptcp: add MPTCP socket diag interface
exposes basic inet socket attribute, plus some MPTCP socket
fields comprising PM status and MPTCP-level sequence numbers.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 12:38:41 -07:00
Paolo Abeni
3f935c75eb inet_diag: support for wider protocol numbers
After commit bf9765145b ("sock: Make sk_protocol a 16-bit value")
the current size of 'sdiag_protocol' is not sufficient to represent
the possible protocol values.

This change introduces a new inet diag request attribute to let
user space specify the relevant protocol number using u32 values.

The attribute is parsed by inet diag core on get/dump command
and the extended protocol value, if available, is preferred to
'sdiag_protocol' to lookup the diag handler.

The parse attributed are exposed to all the diag handlers via
the cb->data.

Note that inet_diag_dump_one_icsk() is left unmodified, as it
will not be used by protocol using the extended attribute.

Suggested-by: David S. Miller <davem@davemloft.net>
Co-developed-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-09 12:38:41 -07:00
Xiaoguang Wang
6d5f904904 io_uring: export cq overflow status to userspace
For those applications which are not willing to use io_uring_enter()
to reap and handle cqes, they may completely rely on liburing's
io_uring_peek_cqe(), but if cq ring has overflowed, currently because
io_uring_peek_cqe() is not aware of this overflow, it won't enter
kernel to flush cqes, below test program can reveal this bug:

static void test_cq_overflow(struct io_uring *ring)
{
        struct io_uring_cqe *cqe;
        struct io_uring_sqe *sqe;
        int issued = 0;
        int ret = 0;

        do {
                sqe = io_uring_get_sqe(ring);
                if (!sqe) {
                        fprintf(stderr, "get sqe failed\n");
                        break;;
                }
                ret = io_uring_submit(ring);
                if (ret <= 0) {
                        if (ret != -EBUSY)
                                fprintf(stderr, "sqe submit failed: %d\n", ret);
                        break;
                }
                issued++;
        } while (ret > 0);
        assert(ret == -EBUSY);

        printf("issued requests: %d\n", issued);

        while (issued) {
                ret = io_uring_peek_cqe(ring, &cqe);
                if (ret) {
                        if (ret != -EAGAIN) {
                                fprintf(stderr, "peek completion failed: %s\n",
                                        strerror(ret));
                                break;
                        }
                        printf("left requets: %d\n", issued);
                        continue;
                }
                io_uring_cqe_seen(ring, cqe);
                issued--;
                printf("left requets: %d\n", issued);
        }
}

int main(int argc, char *argv[])
{
        int ret;
        struct io_uring ring;

        ret = io_uring_queue_init(16, &ring, 0);
        if (ret) {
                fprintf(stderr, "ring setup failed: %d\n", ret);
                return 1;
        }

        test_cq_overflow(&ring);
        return 0;
}

To fix this issue, export cq overflow status to userspace by adding new
IORING_SQ_CQ_OVERFLOW flag, then helper functions() in liburing, such as
io_uring_peek_cqe, can be aware of this cq overflow and do flush accordingly.

Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-08 19:17:06 -06:00
Meir Lichtinger
065e0d42a0 ethtool: Add support for 100Gbps per lane link modes
Define 100G, 200G and 400G link modes using 100Gbps per lane

LR, ER and FR are defined as a single link mode because they are
using same technology and by design are fully interoperable.
EEPROM content indicates if the module is LR, ER, or FR, and the
user space ethtool decoder is planned to support decoding these
modes in the EEPROM.

Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
CC: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 15:30:42 -07:00
Jim Mattson
1aa561b1a4 kvm: x86: Add "last CPU" to some KVM_EXIT information
More often than not, a failed VM-entry in an x86 production
environment is induced by a defective CPU. To help identify the bad
hardware, include the id of the last logical CPU to run a vCPU in the
information provided to userspace on a KVM exit for failed VM-entry or
for KVM internal errors not associated with emulation. The presence of
this additional information is indicated by a new capability,
KVM_CAP_LAST_CPU.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Message-Id: <20200603235623.245638-5-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:21:45 -04:00
David S. Miller
e80a07b244 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Support for rejecting packets from the prerouting chain, from
   Laura Garcia Liebana.

2) Remove useless assignment in pipapo, from Stefano Brivio.

3) On demand hook registration in IPVS, from Julian Anastasov.

4) Expire IPVS connection from process context to not overload
   timers, also from Julian.

5) Fallback to conntrack TCP tracker to handle connection reuse
   in IPVS, from Julian Anastasov.

6) Several patches to support for chain bindings.

7) Expose enum nft_chain_flags through UAPI.

8) Reject unsupported chain flags from the netlink control plane.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-08 12:42:40 -07:00
Matias Bjørling
82394db738 block: add capacity field to zone descriptors
In the zoned storage model, the sectors within a zone are typically all
writeable. With the introduction of the Zoned Namespace (ZNS) Command
Set in the NVM Express organization, the model was extended to have a
specific writeable capacity.

Extend the zone descriptor data structure with a zone capacity field to
indicate to the user how many sectors in a zone are writeable.

Introduce backward compatibility in the zone report ioctl by extending
the zone report header data structure with a flags field to indicate if
the capacity field is available.

Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Javier González <javier.gonz@samsung.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Matias Bjørling <matias.bjorling@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-07-08 16:16:19 +02:00
Stanislav Fomichev
f5836749c9 bpf: Add BPF_CGROUP_INET_SOCK_RELEASE hook
Sometimes it's handy to know when the socket gets freed. In
particular, we'd like to try to use a smarter allocation of
ports for bpf_bind and explore the possibility of limiting
the number of SOCK_DGRAM sockets the process can have.

Implement BPF_CGROUP_INET_SOCK_RELEASE hook that triggers on
inet socket release. It triggers only for userspace sockets
(not in-kernel ones) and therefore has the same semantics as
the existing BPF_CGROUP_INET_SOCK_CREATE.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200706230128.4073544-2-sdf@google.com
2020-07-08 01:03:31 +02:00
Daniel Lezcano
1ce50e7d40 thermal: core: genetlink support for events/cmd/sampling
Initially the thermal framework had a very simple notification
mechanism to send generic netlink messages to the userspace.

The notification function was never called from anywhere and the
corresponding dead code was removed. It was probably a first attempt
to introduce the netlink notification.

At LPC2018, the presentation "Linux thermal: User kernel interface",
proposed to create the notifications to the userspace via a kfifo.

The advantage of the kfifo is the performance. It is usually used from
a 1:1 communication channel where a driver captures data and sends it
as fast as possible to a userspace process.

The drawback is that only one process uses the notification channel
exclusively, thus no other process is allowed to use the channel to
get temperature or notifications.

This patch defines a generic netlink API to discover the current
thermal setup and adds event notifications as well as temperature
sampling. As any genetlink protocol, it can evolve and the versioning
allows to keep the backward compatibility.

In order to prevent the user from getting flooded with data on a
single channel, there are two multicast channels, one for the
temperature sampling when the thermal zone is updated and another one
for the events, so the user can get the events only without the
thermal zone temperature sampling.

Also, a list of commands to discover the thermal setup is added and
can be extended when needed.

Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20200706105538.2159-3-daniel.lezcano@linaro.org
2020-07-07 15:55:21 +02:00
Xu Yilun
09d8615014 fpga: dfl: afu: add AFU interrupt support
AFU (Accelerated Function Unit) is dynamic region of the DFL based FPGA,
and always defined by users. Some DFL based FPGA cards allow users to
implement their own interrupts in AFU. In order to support this,
hardware implements a new UINT (AFU Interrupt) private feature with
related capability register which describes the number of supported
AFU interrupts as well as the local index of the interrupts for
software enumeration, and from software side, driver follows the common
DFL interrupt notification and handling mechanism, and it implements
two ioctls below for user to query number of irqs supported and set/unset
interrupt triggers.

 Ioctls:
 * DFL_FPGA_PORT_UINT_GET_IRQ_NUM
   get the number of irqs, which is used to determine how many interrupts
   UINT feature supports.

 * DFL_FPGA_PORT_UINT_SET_IRQ
   set/unset eventfds as AFU interrupt triggers.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
2020-07-06 21:37:08 -07:00
Xu Yilun
d43f20bae5 fpga: dfl: fme: add interrupt support for global error reporting
Error reporting interrupt is very useful to notify users that some
errors are detected by the hardware. Once users are notified, they
could query hardware logged error states, no need to continuously
poll on these states.

This patch adds interrupt support for fme global error reporting sub
feature. It follows the common DFL interrupt notification and handling
mechanism. And it implements two ioctls below for user to query
number of irqs supported, and set/unset interrupt triggers.

 Ioctls:
 * DFL_FPGA_FME_ERR_GET_IRQ_NUM
   get the number of irqs, which is used to determine whether/how many
   interrupts fme error reporting feature supports.

 * DFL_FPGA_FME_ERR_SET_IRQ
   set/unset given eventfds as fme error reporting interrupt triggers.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
2020-07-06 21:35:42 -07:00
Xu Yilun
fe6a3d6521 fpga: dfl: afu: add interrupt support for port error reporting
Error reporting interrupt is very useful to notify users that some
errors are detected by the hardware. Once users are notified, they
could query hardware logged error states, no need to continuously
poll on these states.

This patch adds interrupt support for port error reporting sub feature.
It follows the common DFL interrupt notification and handling mechanism,
implements two ioctl commands below for user to query number of irqs
supported, and set/unset interrupt triggers.

 Ioctls:
 * DFL_FPGA_PORT_ERR_GET_IRQ_NUM
   get the number of irqs, which is used to determine whether/how many
   interrupts error reporting feature supports.

 * DFL_FPGA_PORT_ERR_SET_IRQ
   set/unset given eventfds as error interrupt triggers.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Wu Hao <hao.wu@intel.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
2020-07-06 21:34:46 -07:00
Yishai Hadas
6c01e6b218 IB/uverbs: Expose UAPI to query MR
Expose UAPI to query MR, this will let user space application that
didn't allocate the MR but has access to by owning the matching command
FD to retrieve its information.

Link: https://lore.kernel.org/r/20200630093916.332097-8-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06 19:50:34 -03:00
Yishai Hadas
05f71ef979 RDMA/mlx5: Introduce UAPI to query PD attributes
Introduce UAPI to query PD attributes, this can be used to retrieve PD
attributes by having the PD handle of the created one and owning the
command FD for the ucontxet.

Link: https://lore.kernel.org/r/20200630093916.332097-7-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06 19:50:34 -03:00
Yishai Hadas
0fb556b2b5 RDMA/mlx5: Implement the query ucontext functionality
Implement the query ucontext functionality by returning the original
ucontext data as part of an extra mlx5 attribute that holds the driver
UAPI response.

Link: https://lore.kernel.org/r/20200630093916.332097-6-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06 19:50:33 -03:00
Yishai Hadas
1c8fb1ea5a IB/uverbs: Expose UAPI to query ucontext
Expose UAPI to query ucontext, this will let user space application that
didn't allocate the ucontext but has access to by owning the matching
command FD to retrieve the ucontext information.

Link: https://lore.kernel.org/r/20200630093916.332097-4-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06 19:50:33 -03:00
Brian Starkey
9ac2b63791 drm: drm_fourcc: Add generic alias for 16_16_TILE modifier
In cases such as DRM_FORMAT_MOD_SAMSUNG_16_16_TILE, the modifier
describes a generic pixel re-ordering which can be applicable to
multiple vendors.

Define an alias: DRM_FORMAT_MOD_GENERIC_16_16_TILE, which can be
used to describe this layout in a vendor-neutral way, and add a
comment about the expected usage of such "generic" modifiers.

Changes in v2:
 - Move note about future cases to comment (Daniel)

Signed-off-by: Brian Starkey <brian.starkey@arm.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200626164800.11595-1-brian.starkey@arm.com
2020-07-06 14:58:26 +01:00
Tiezhu Yang
ea1be1e59b serial: Remove duplicated macro definition of port type
There exists the same macro definition of port type from 0 to 13
in include/uapi/linux/serial.h, remove these duplicated code in
include/uapi/linux/serial_core.h which includes the former header.

Acked-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/1588853015-28392-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-06 14:06:08 +02:00
David S. Miller
f91c031e65 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2020-07-04

The following pull-request contains BPF updates for your *net-next* tree.

We've added 73 non-merge commits during the last 17 day(s) which contain
a total of 106 files changed, 5233 insertions(+), 1283 deletions(-).

The main changes are:

1) bpftool ability to show PIDs of processes having open file descriptors
   for BPF map/program/link/BTF objects, relying on BPF iterator progs
   to extract this info efficiently, from Andrii Nakryiko.

2) Addition of BPF iterator progs for dumping TCP and UDP sockets to
   seq_files, from Yonghong Song.

3) Support access to BPF map fields in struct bpf_map from programs
   through BTF struct access, from Andrey Ignatov.

4) Add a bpf_get_task_stack() helper to be able to dump /proc/*/stack
   via seq_file from BPF iterator progs, from Song Liu.

5) Make SO_KEEPALIVE and related options available to bpf_setsockopt()
   helper, from Dmitry Yakunin.

6) Optimize BPF sk_storage selection of its caching index, from Martin
   KaFai Lau.

7) Removal of redundant synchronize_rcu()s from BPF map destruction which
   has been a historic leftover, from Alexei Starovoitov.

8) Several improvements to test_progs to make it easier to create a shell
   loop that invokes each test individually which is useful for some CIs,
   from Jesper Dangaard Brouer.

9) Fix bpftool prog dump segfault when compiled without skeleton code on
   older clang versions, from John Fastabend.

10) Bunch of cleanups and minor improvements, from various others.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-04 17:48:34 -07:00
Pablo Neira Ayuso
c1f79a2eef netfilter: nf_tables: reject unsupported chain flags
Bail out if userspace sends unsupported chain flags.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-07-04 02:51:28 +02:00
Pablo Neira Ayuso
d0e2c7de92 netfilter: nf_tables: add NFT_CHAIN_BINDING
This new chain flag specifies that:

* the kernel dynamically allocates the chain name, if no chain name
  is specified.

* If the immediate expression that refers to this chain is removed,
  then this bound chain (and its content) is destroyed.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-07-04 01:22:14 +02:00
Pablo Neira Ayuso
67c49de4ad netfilter: nf_tables: expose enum nft_chain_flags through UAPI
This enum definition was never exposed through UAPI. Rename
NFT_BASE_CHAIN to NFT_CHAIN_BASE for consistency.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-07-04 01:18:41 +02:00
Pablo Neira Ayuso
51d70f181f netfilter: nf_tables: add NFTA_VERDICT_CHAIN_ID attribute
This netlink attribute allows you to identify the chain to jump/goto by
means of the chain ID.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-07-04 01:18:41 +02:00
Pablo Neira Ayuso
837830a4b4 netfilter: nf_tables: add NFTA_RULE_CHAIN_ID attribute
This new netlink attribute allows you to add rules to chains by the
chain ID.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-07-04 01:18:41 +02:00
Pablo Neira Ayuso
74cccc3d38 netfilter: nf_tables: add NFTA_CHAIN_ID attribute
This netlink attribute allows you to refer to chains inside a
transaction as an alternative to the name and the handle. The chain
binding support requires this new chain ID approach.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-07-04 01:18:41 +02:00
Neil Armstrong
d6528ec883 drm/fourcc: Add modifier definitions for describing Amlogic Video Framebuffer Compression
Amlogic uses a proprietary lossless image compression protocol and format
for their hardware video codec accelerators, either video decoders or
video input encoders.

It considerably reduces memory bandwidth while writing and reading
frames in memory.

The underlying storage is considered to be 3 components, 8bit or 10-bit
per component, YCbCr 420, single plane :
- DRM_FORMAT_YUV420_8BIT
- DRM_FORMAT_YUV420_10BIT

This modifier will be notably added to DMA-BUF frames imported from the V4L2
Amlogic VDEC decoder.

This introduces the basic layout composed of:
- a body content organized in 64x32 superblocks with 4096 bytes per
  superblock in default mode.
- a 32 bytes per 128x64 header block

This layout is tranferrable between Amlogic SoCs supporting this modifier.

The Memory Saving option exist changing the layout superblock size to save memory when
using 8bit components pixels size.

Finally is also adds the Scatter Memory layout, meaning the header contains IOMMU
references to the compressed frames content to optimize memory access
and layout.

In this mode, only the header memory address is needed, thus the content
memory organization is tied to the current producer execution and cannot
be saved/dumped neither transferrable between Amlogic SoCs supporting this
modifier.

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200703080728.25207-2-narmstrong@baylibre.com
2020-07-03 10:12:27 +02:00
Horatiu Vultur
36a8e8e265 bridge: Extend br_fill_ifinfo to return MPR status
This patch extends the function br_fill_ifinfo to return also the MRP
status for each instance on a bridge. It also adds a new filter
RTEXT_FILTER_MRP to return the MRP status only when this is set, not to
interfer with the vlans. The MRP status is return only on the bridge
interfaces.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:19:15 -07:00
Horatiu Vultur
e4266b991f bridge: uapi: mrp: Extend MRP attributes to get the status
Add MRP attribute IFLA_BRIDGE_MRP_INFO to allow the userspace to get the
current state of the MRP instances. This is a nested attribute that
contains other attributes like, ring id, index of primary and secondary
port, priority, ring state, ring role.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-02 14:19:15 -07:00
Peter Zijlstra
78c2141b65 Merge branch 'perf/vlbr' 2020-07-02 15:51:48 +02:00
Dave Airlie
9555152beb Merge tag 'amd-drm-next-5.9-2020-07-01' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.9-2020-07-01:

amdgpu:
- DC DMUB updates
- HDCP fixes
- Thermal interrupt fixes
- Add initial support for Sienna Cichlid GPU
- Add support for unique id on Arcturus
- Major swSMU code cleanup
- Skip BAR resizing if the bios already did id
- Fixes for DCN bandwidth calculations
- Runtime PM reference count fixes
- Add initial UVD support for SI
- Add support for ASSR on eDP links
- Lots of misc fixes and cleanups
- Enable runtime PM on vega10 boards that support BACO
- RAS fixes
- SR-IOV fixes
- Use IP discovery table on renoir
- DC stream synchronization fixes

amdkfd:
- Track SDMA usage per process
- Fix GCC10 compiler warnings
- Locking fix

radeon:
- Default to on chip GART for AGP boards on all arches
- Runtime PM reference count fixes

UAPI:
- Update comments to clarify MTYPE

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200701155041.1102829-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-07-02 15:17:31 +10:00
Shyam Sundar
62e9dd1777 scsi: qla2xxx: Change in PUREX to handle FPIN ELS requests
SAN Congestion Management generates ELS pkts whose size can vary and be >
64 bytes. Change the PUREX handling code to support non-standard ELS pkt
size.

Link: https://lore.kernel.org/r/20200630102229.29660-2-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Shyam Sundar <ssundar@marvell.com>
Signed-off-by: Arun Easi <aeasi@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-01 22:24:16 -04:00
Song Liu
fa28dcb82a bpf: Introduce helper bpf_get_task_stack()
Introduce helper bpf_get_task_stack(), which dumps stack trace of given
task. This is different to bpf_get_stack(), which gets stack track of
current task. One potential use case of bpf_get_task_stack() is to call
it from bpf_iter__task and dump all /proc/<pid>/stack to a seq_file.

bpf_get_task_stack() uses stack_trace_save_tsk() instead of
get_perf_callchain() for kernel stack. The benefit of this choice is that
stack_trace_save_tsk() doesn't require changes in arch/. The downside of
using stack_trace_save_tsk() is that stack_trace_save_tsk() dumps the
stack trace to unsigned long array. For 32-bit systems, we need to
translate it to u64 array.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200630062846.664389-3-songliubraving@fb.com
2020-07-01 08:23:19 -07:00
David S. Miller
e708e2bd55 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2020-06-30

The following pull-request contains BPF updates for your *net* tree.

We've added 28 non-merge commits during the last 9 day(s) which contain
a total of 35 files changed, 486 insertions(+), 232 deletions(-).

The main changes are:

1) Fix an incorrect verifier branch elimination for PTR_TO_BTF_ID pointer
   types, from Yonghong Song.

2) Fix UAPI for sockmap and flow_dissector progs that were ignoring various
   arguments passed to BPF_PROG_{ATTACH,DETACH}, from Lorenz Bauer & Jakub Sitnicki.

3) Fix broken AF_XDP DMA hacks that are poking into dma-direct and swiotlb
   internals and integrate it properly into DMA core, from Christoph Hellwig.

4) Fix RCU splat from recent changes to avoid skipping ingress policy when
   kTLS is enabled, from John Fastabend.

5) Fix BPF ringbuf map to enforce size to be the power of 2 in order for its
   position masking to work, from Andrii Nakryiko.

6) Fix regression from CAP_BPF work to re-allow CAP_SYS_ADMIN for loading
   of network programs, from Maciej Żenczykowski.

7) Fix libbpf section name prefix for devmap progs, from Jesper Dangaard Brouer.

8) Fix formatting in UAPI documentation for BPF helpers, from Quentin Monnet.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 14:20:45 -07:00
David S. Miller
d9b8b9845f Merge tag 'batadv-next-for-davem-20200630' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:

====================
This feature/cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - update mailing list URL, by Sven Eckelmann

 - fix typos and grammar in documentation, by Sven Eckelmann

 - introduce a configurable per interface hop penalty,
   by Linus Luessing
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-30 12:59:15 -07:00
Merlijn Wajer
c463bb2a8f Input: add SW_MACHINE_COVER
This event code represents the state of a removable cover of a device.
Value 0 means that the cover is open or removed, value 1 means that the
cover is closed.

Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Merlijn Wajer <merlijn@wizzup.org>
Link: https://lore.kernel.org/r/20200612125402.18393-2-merlijn@wizzup.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2020-06-30 12:06:49 -07:00
Jean-Philippe Brucker
970471914c iommu: Allow page responses without PASID
Some PCIe devices do not expect a PASID value in PRI Page Responses.
If the "PRG Response PASID Required" bit in the PRI capability is zero,
then the OS should not set the PASID field. Similarly on Arm SMMU,
responses to stall events do not have a PASID.

Currently iommu_page_response() systematically checks that the PASID in
the page response corresponds to the one in the page request. This can't
work with virtualization because a page response coming from a guest OS
won't have a PASID if the passed-through device does not require one.

Add a flag to page requests that declares whether the corresponding
response needs to have a PASID. When this flag isn't set, allow page
responses without PASID.

Reported-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/r/20200616144712.748818-1-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
2020-06-30 11:49:21 +02:00
Dave Airlie
f75020fcb9 Merge tag 'drm-misc-next-2020-06-26' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.9:

Cross-subsystem Changes:
- Improve dma-buf docs.

Core Changes:
- Add NV15, Q410, Q401 yuv formats.
- Add uncompressed AFBC modifier.
- Add DP helepr for reading Ignore MSA from DPCD.
- Add missing panel type for some panels
- Optimize drm/mm hole handling.
- Constify connector to infoframe functions.
- Add debugfs for VRR monitor range.

Driver Changes:
- Assorted small bugfixes in panfrost, malidp, panel/otm8009a.
- Convert tfp410 dt bindings to yaml, and rework time calculations.
- Add support for a few more simple panels.
- Cleanups and optimizations for ast.
- Allow adv7511 and simple-bridge to be used without connector creation.
- Cleanups to dw-hdmi function prototypes.
- Remove enabled bool from tiny/repaper and mipi-dbi, atomic handles it.
- Remove unused header file from dw-mipi-dsi
- Begin removing ttm_bo->offset.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b1e53620-7937-895c-bfcf-ed208be59c7c@linux.intel.com
2020-06-30 14:04:00 +10:00
Amit Cohen
ecc31c6024 ethtool: Add link extended state
Currently, drivers can only tell whether the link is up/down using
LINKSTATE_GET, but no additional information is given.

Add attributes to LINKSTATE_GET command in order to allow drivers
to expose the user more information in addition to link state to ease
the debug process, for example, reason for link down state.

Extended state consists of two attributes - link_ext_state and
link_ext_substate. The idea is to avoid 'vendor specific' states in order
to prevent drivers to use specific link_ext_state that can be in the future
common link_ext_state.

The substates allows drivers to add more information to the common
link_ext_state. For example, vendor can expose 'Autoneg' as link_ext_state
and add 'No partner detected during force mode' as link_ext_substate.

If a driver cannot pinpoint the extended state with the substate
accuracy, it is free to expose only the extended state and omit the
substate attribute.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:45:02 -07:00
Petr Machata
aee9caa03f net: sched: sch_red: Add qevents "early_drop" and "mark"
In order to allow acting on dropped and/or ECN-marked packets, add two new
qevents to the RED qdisc: "early_drop" and "mark". Filters attached at
"early_drop" block are executed as packets are early-dropped, those
attached at the "mark" block are executed as packets are ECN-marked.

Two new attributes are introduced: TCA_RED_EARLY_DROP_BLOCK with the block
index for the "early_drop" qevent, and TCA_RED_MARK_BLOCK for the "mark"
qevent. Absence of these attributes signifies "don't care": no block is
allocated in that case, or the existing blocks are left intact in case of
the change callback.

For purposes of offloading, blocks attached to these qevents appear with
newly-introduced binder types, FLOW_BLOCK_BINDER_TYPE_RED_EARLY_DROP and
FLOW_BLOCK_BINDER_TYPE_RED_MARK.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-29 17:08:28 -07:00
Maarten Lankhorst
60e9eabf41 Backmerge remote-tracking branch 'drm/drm-next' into drm-misc-next
Some conflicts with ttm_bo->offset removal, but drm-misc-next needs updating to v5.8.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2020-06-29 12:16:26 +02:00
Greg Kroah-Hartman
9cf6ffae38 Merge 5.8-rc3 into usb-next
We want the USB fixes in here, and this resolves a merge issue found in
linux-next.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-29 08:22:27 +02:00
Martin
fe80536acf bareudp: Added attribute to enable & disable rx metadata collection
Metadata need not be collected in receive if the packet from bareudp
device is not targeted to openvswitch.

Signed-off-by: Martin <martin.varghese@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-06-28 20:48:20 -07:00