Commit Graph

888716 Commits

Author SHA1 Message Date
Pavel Begunkov
2b85edfc0c io_uring: batch getting pcpu references
percpu_ref_tryget() has its own overhead. Instead getting a reference
for each request, grab a bunch once per io_submit_sqes().

~5% throughput boost for a "submit and wait 128 nops" benchmark.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>

__io_req_free_empty() -> __io_req_do_free()

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:04:02 -07:00
Pavel Begunkov
4e5ef02317 pcpu_ref: add percpu_ref_tryget_many()
Add percpu_ref_tryget_many(), which works the same way as
percpu_ref_tryget(), but grabs specified number of refs.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Dennis Zhou <dennis@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:04:02 -07:00
Jens Axboe
c1ca757bd6 io_uring: add IORING_OP_MADVISE
This adds support for doing madvise(2) through io_uring. We assume that
any operation can block, and hence punt everything async. This could be
improved, but hard to make bullet proof. The async punt ensures it's
safe.

Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:04:02 -07:00
Jens Axboe
db08ca2525 mm: make do_madvise() available internally
This is in preparation for enabling this functionality through io_uring.
Add a helper that is just exporting what sys_madvise() does, and have the
system call use it.

No functional changes in this patch.

Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:04:02 -07:00
Jens Axboe
4840e418c2 io_uring: add IORING_OP_FADVISE
This adds support for doing fadvise through io_uring. We assume that
WILLNEED doesn't block, but that DONTNEED may block.

Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:04:01 -07:00
Jens Axboe
ba04291eb6 io_uring: allow use of offset == -1 to mean file position
This behaves like preadv2/pwritev2 with offset == -1, it'll use (and
update) the current file position. This obviously comes with the caveat
that if the application has multiple read/writes in flight, then the
end result will not be as expected. This is similar to threads sharing
a file descriptor and doing IO using the current file position.

Since this feature isn't easily detectable by doing a read or write,
add a feature flags, IORING_FEAT_RW_CUR_POS, to allow applications to
detect presence of this feature.

Reported-by: 李通洲 <carter.li@eoitek.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:03:59 -07:00
Jens Axboe
3a6820f2bb io_uring: add non-vectored read/write commands
For uses cases that don't already naturally have an iovec, it's easier
(or more convenient) to just use a buffer address + length. This is
particular true if the use case is from languages that want to create
a memory safe abstraction on top of io_uring, and where introducing
the need for the iovec may impose an ownership issue. For those cases,
they currently need an indirection buffer, which means allocating data
just for this purpose.

Add basic read/write that don't require the iovec.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:03:59 -07:00
Jens Axboe
e94f141bd2 io_uring: improve poll completion performance
For busy IORING_OP_POLL_ADD workloads, we can have enough contention
on the completion lock that we fail the inline completion path quite
often as we fail the trylock on that lock. Add a list for deferred
completions that we can use in that case. This helps reduce the number
of async offloads we have to do, as if we get multiple completions in
a row, we'll piggy back on to the poll_llist instead of having to queue
our own offload.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:03:59 -07:00
Jens Axboe
ad3eb2c89f io_uring: split overflow state into SQ and CQ side
We currently check ->cq_overflow_list from both SQ and CQ context, which
causes some bouncing of that cache line. Add separate bits of state for
this instead, so that the SQ side can check using its own state, and
likewise for the CQ side.

This adds ->sq_check_overflow with the SQ state, and ->cq_check_overflow
with the CQ state. If we hit an overflow condition, both of these bits
are set. Likewise for overflow flush clear, we clear both bits. For the
fast path of just checking if there's an overflow condition on either
the SQ or CQ side, we can use our own private bit for this.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:03:59 -07:00
Jens Axboe
d3656344fe io_uring: add lookup table for various opcode needs
We currently have various switch statements that check if an opcode needs
a file, mm, etc. These are hard to keep in sync as opcodes are added. Add
a struct io_op_def that holds all of this information, so we have just
one spot to update when opcodes are added.

This also enables us to NOT allocate req->io if a deferred command
doesn't need it, and corrects some mistakes we had in terms of what
commands need mm context.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:03:59 -07:00
Jens Axboe
add7b6b85a io_uring: remove two unnecessary function declarations
__io_free_req() and io_double_put_req() aren't used before they are
defined, so we can kill these two forwards.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:03:59 -07:00
Pavel Begunkov
32fe525b6d io_uring: move *queue_link_head() from common path
Move io_queue_link_head() to links handling code in io_submit_sqe(),
so it wouldn't need extra checks and would have better data locality.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:03:59 -07:00
Pavel Begunkov
9d76377f7e io_uring: rename prev to head
Calling "prev" a head of a link is a bit misleading. Rename it

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:03:59 -07:00
Jens Axboe
ce35a47a3a io_uring: add IOSQE_ASYNC
io_uring defaults to always doing inline submissions, if at all
possible. But for larger copies, even if the data is fully cached, that
can take a long time. Add an IOSQE_ASYNC flag that the application can
set on the SQE - if set, it'll ensure that we always go async for those
kinds of requests. Use the io-wq IO_WQ_WORK_CONCURRENT flag to ensure we
get the concurrency we desire for this case.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:03:59 -07:00
Jens Axboe
895e2ca0f6 io-wq: support concurrent non-blocking work
io-wq assumes that work will complete fast (and not block), so it
doesn't create a new worker when work is enqueued, if we already have
at least one worker running. This is done on the assumption that if work
is running, then it will complete fast.

Add an option to force io-wq to fork a new worker for work queued. This
is signaled by setting IO_WQ_WORK_CONCURRENT on the work item. For that
case, io-wq will create a new worker, even though workers are already
running.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:03:59 -07:00
Jens Axboe
eddc7ef52a io_uring: add support for IORING_OP_STATX
This provides support for async statx(2) through io_uring.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:03:54 -07:00
Jens Axboe
3934e36f60 fs: make two stat prep helpers available
To implement an async stat, we need to provide the flags mapping and
the statx user copy. Make them available internally, through
fs/internal.h.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:03:54 -07:00
Jens Axboe
05f3fb3c53 io_uring: avoid ring quiesce for fixed file set unregister and update
We currently fully quiesce the ring before an unregister or update of
the fixed fileset. This is very expensive, and we can be a bit smarter
about this.

Add a percpu refcount for the file tables as a whole. Grab a percpu ref
when we use a registered file, and put it on completion. This is cheap
to do. Upon removal of a file from a set, switch the ref count to atomic
mode. When we hit zero ref on the completion side, then we know we can
drop the previously registered files. When the old files have been
dropped, switch the ref back to percpu mode for normal operation.

Since there's a period between doing the update and the kernel being
done with it, add a IORING_OP_FILES_UPDATE opcode that can perform the
same action. The application knows the update has completed when it gets
the CQE for it. Between doing the update and receiving this completion,
the application must continue to use the unregistered fd if submitting
IO on this particular file.

This takes the runtime of test/file-register from liburing from 14s to
about 0.7s.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:03:50 -07:00
Jens Axboe
b5dba59e0c io_uring: add support for IORING_OP_CLOSE
This works just like close(2), unsurprisingly. We remove the file
descriptor and post the completion inline, then offload the actual
(potential) last file put to async context.

Mark the async part of this work as uncancellable, as we really must
guarantee that the latter part of the close is run.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:01:53 -07:00
Jens Axboe
0c9d5ccd26 io-wq: add support for uncancellable work
Not all work can be cancelled, some of it we may need to guarantee
that it runs to completion. Allow the caller to set IO_WQ_WORK_NO_CANCEL
on work that must not be cancelled. Note that the caller work function
must also check for IO_WQ_WORK_NO_CANCEL on work that is marked
IO_WQ_WORK_CANCEL.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:01:53 -07:00
Jens Axboe
6e802a4ba0 fs: move filp_close() outside of __close_fd_get_file()
Just one caller of this, and just use filp_close() there manually.
This is important to allow async close/removal of the fd.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:01:53 -07:00
Jens Axboe
15b71abe7b io_uring: add support for IORING_OP_OPENAT
This works just like openat(2), except it can be performed async. For
the normal case of a non-blocking path lookup this will complete
inline. If we have to do IO to perform the open, it'll be done from
async context.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:01:53 -07:00
Jens Axboe
35cb6d54c1 fs: make build_open_flags() available internally
This is a prep patch for supporting non-blocking open from io_uring.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:01:53 -07:00
Jens Axboe
d63d1b5edb io_uring: add support for fallocate()
This exposes fallocate(2) through io_uring.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:01:53 -07:00
Jens Axboe
4d92748373 Merge branch 'io_uring-5.5' into for-5.6/io_uring-vfs
Pull in compatability fix for the files_update command.

* io_uring-5.5:
  io_uring: fix compat for IORING_REGISTER_FILES_UPDATE
2020-01-20 17:01:17 -07:00
Eugene Syromiatnikov
1292e972ff io_uring: fix compat for IORING_REGISTER_FILES_UPDATE
fds field of struct io_uring_files_update is problematic with regards
to compat user space, as pointer size is different in 32-bit, 32-on-64-bit,
and 64-bit user space.  In order to avoid custom handling of compat in
the syscall implementation, make fds __u64 and use u64_to_user_ptr in
order to retrieve it.  Also, align the field naturally and check that
no garbage is passed there.

Fixes: c3a31e6056 ("io_uring: add support for IORING_REGISTER_FILES_UPDATE")
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-01-20 17:00:44 -07:00
Jens Axboe
fa7773deb3 Merge branch 'work.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into for-5.6/io_uring-vfs
Pull in Al's openat2 branch, since we'll need that for the openat2
support.

* 'work.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  Documentation: path-lookup: include new LOOKUP flags
  selftests: add openat2(2) selftests
  open: introduce openat2(2) syscall
  namei: LOOKUP_{IN_ROOT,BENEATH}: permit limited ".." resolution
  namei: LOOKUP_IN_ROOT: chroot-like scoped resolution
  namei: LOOKUP_BENEATH: O_BENEATH-like scoped resolution
  namei: LOOKUP_NO_XDEV: block mountpoint crossing
  namei: LOOKUP_NO_MAGICLINKS: block magic-link resolution
  namei: LOOKUP_NO_SYMLINKS: block symlink resolution
  namei: allow set_root() to produce errors
  namei: allow nd_jump_link() to produce errors
  nsfs: clean-up ns_get_path() signature to return int
  namei: only return -ECHILD from follow_dotdot_rcu()
2020-01-19 19:47:04 -07:00
Linus Torvalds
def9d27807 Linux 5.5-rc7 2020-01-19 16:02:49 -08:00
Linus Torvalds
7008ee1210 RISC-V updates for v5.5-rc7
Three fixes for RISC-V:
 
 - Don't free and reuse memory containing the code that CPUs parked at
   boot reside in.
 
 - Fix rv64 build problems for ubsan and some modules by adding logical
   and arithmetic shift helpers for 128-bit values.  These are from
   libgcc and are similar to what's present for ARM64.
 
 - Fix vDSO builds to clean up their own temporary files.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAl4klXMACgkQx4+xDQu9
 KkshPhAAiExCAl2JpZxoeuyQmjS7X68au5CWWJa+uB5osAgxJSPk4XJt9QeagOFw
 wpKmefQDwPKXQuoD0VmNSdJMioBvgzFLqftoc2D9GAv3A0MB4miSsZkUheVbUifN
 jtc3tc3jOCiVNOb0lbX+B6NL+qentvV6CTmujrf79wDBZpGPzKSM2S2OZMvFijVY
 B7ijF1bqXBZg6weE8xdefJhBfDmd7vDBKmMJtv1RUbiOgoBGqjM8QtaXVrUz4Px0
 NLlTleZL4grZVCepJ4psambm5gcZ8UilAe/ywhhrSFOSNCTYKB3ST+ci9VC4t5Vx
 TiR6MnDV1qAl2Uh6aoOhZECpggge9zQOyER7QGbleNplxavhi6jsRgeV9hbqeyBZ
 FCanzqO33irRwrtl7lNsPiUv3XWyyGH5yQLxA9wPq/W9dJkO6Z8pl5Fq0kI/oJNj
 WtlVIp2EnkXJUmXiMBTHerLoBJAVnu+S2HRIeRqOEUSKT4bFP28kq3T6ctP4QuoT
 3F2k9A3DOIZPY8dhXIdFSdcM0IbfRliFy/9hHLt6JdSa35poxGUBpOafDq87n9/3
 UtbkLbxJd56EWOSx5l2bLJ/b4FjJX6powfV9cwjrHjrzoiAqc8z0lfQRGlbGjeYb
 3XTTXhLIspGMK8acm8/p2I6uwDVGFMveEdsLJ8UWAiYjJkLSFEk=
 =SfqL
 -----END PGP SIGNATURE-----

Merge tag 'riscv/for-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Paul Walmsley:
 "Three fixes for RISC-V:

   - Don't free and reuse memory containing the code that CPUs parked at
     boot reside in.

   - Fix rv64 build problems for ubsan and some modules by adding
     logical and arithmetic shift helpers for 128-bit values. These are
     from libgcc and are similar to what's present for ARM64.

   - Fix vDSO builds to clean up their own temporary files"

* tag 'riscv/for-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Less inefficient gcc tishift helpers (and export their symbols)
  riscv: delete temporary files
  riscv: make sure the cores stay looping in .Lsecondary_park
2020-01-19 12:10:28 -08:00
Linus Torvalds
11a8272947 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:

 1) Fix non-blocking connect() in x25, from Martin Schiller.

 2) Fix spurious decryption errors in kTLS, from Jakub Kicinski.

 3) Netfilter use-after-free in mtype_destroy(), from Cong Wang.

 4) Limit size of TSO packets properly in lan78xx driver, from Eric
    Dumazet.

 5) r8152 probe needs an endpoint sanity check, from Johan Hovold.

 6) Prevent looping in tcp_bpf_unhash() during sockmap/tls free, from
    John Fastabend.

 7) hns3 needs short frames padded on transmit, from Yunsheng Lin.

 8) Fix netfilter ICMP header corruption, from Eyal Birger.

 9) Fix soft lockup when low on memory in hns3, from Yonglong Liu.

10) Fix NTUPLE firmware command failures in bnxt_en, from Michael Chan.

11) Fix memory leak in act_ctinfo, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits)
  cxgb4: reject overlapped queues in TC-MQPRIO offload
  cxgb4: fix Tx multi channel port rate limit
  net: sched: act_ctinfo: fix memory leak
  bnxt_en: Do not treat DSN (Digital Serial Number) read failure as fatal.
  bnxt_en: Fix ipv6 RFS filter matching logic.
  bnxt_en: Fix NTUPLE firmware command failures.
  net: systemport: Fixed queue mapping in internal ring map
  net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec
  net: dsa: sja1105: Don't error out on disabled ports with no phy-mode
  net: phy: dp83867: Set FORCE_LINK_GOOD to default after reset
  net: hns: fix soft lockup when there is not enough memory
  net: avoid updating qdisc_xmit_lock_key in netdev_update_lockdep_key()
  net/sched: act_ife: initalize ife->metalist earlier
  netfilter: nat: fix ICMP header corruption on ICMP errors
  net: wan: lapbether.c: Use built-in RCU list checking
  netfilter: nf_tables: fix flowtable list del corruption
  netfilter: nf_tables: fix memory leak in nf_tables_parse_netdev_hooks()
  netfilter: nf_tables: remove WARN and add NLA_STRING upper limits
  netfilter: nft_tunnel: ERSPAN_VERSION must not be null
  netfilter: nft_tunnel: fix null-attribute check
  ...
2020-01-19 12:03:53 -08:00
Linus Torvalds
5f43644394 Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "Two runtime PM fixes and one leak fix"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: iop3xx: Fix memory leak in probe error path
  i2c: tegra: Properly disable runtime PM on driver's probe error
  i2c: tegra: Fix suspending in active runtime PM state
2020-01-19 12:02:06 -08:00
Rahul Lakkireddy
b2383ad987 cxgb4: reject overlapped queues in TC-MQPRIO offload
A queue can't belong to multiple traffic classes. So, reject
any such configuration that results in overlapped queues for a
traffic class.

Fixes: b1396c2bd6 ("cxgb4: parse and configure TC-MQPRIO offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:12:53 +01:00
Rahul Lakkireddy
c856e2b6fc cxgb4: fix Tx multi channel port rate limit
T6 can support 2 egress traffic management channels per port to
double the total number of traffic classes that can be configured.
In this configuration, if the class belongs to the other channel,
then all the queues must be bound again explicitly to the new class,
for the rate limit parameters on the other channel to take effect.

So, always explicitly bind all queues to the port rate limit traffic
class, regardless of the traffic management channel that it belongs
to. Also, only bind queues to port rate limit traffic class, if all
the queues don't already belong to an existing different traffic
class.

Fixes: 4ec4762d8e ("cxgb4: add TC-MATCHALL classifier egress offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:12:02 +01:00
Eric Dumazet
09d4f10a5e net: sched: act_ctinfo: fix memory leak
Implement a cleanup method to properly free ci->params

BUG: memory leak
unreferenced object 0xffff88811746e2c0 (size 64):
  comm "syz-executor617", pid 7106, jiffies 4294943055 (age 14.250s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    c0 34 60 84 ff ff ff ff 00 00 00 00 00 00 00 00  .4`.............
  backtrace:
    [<0000000015aa236f>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
    [<0000000015aa236f>] slab_post_alloc_hook mm/slab.h:586 [inline]
    [<0000000015aa236f>] slab_alloc mm/slab.c:3320 [inline]
    [<0000000015aa236f>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3549
    [<000000002c946bd1>] kmalloc include/linux/slab.h:556 [inline]
    [<000000002c946bd1>] kzalloc include/linux/slab.h:670 [inline]
    [<000000002c946bd1>] tcf_ctinfo_init+0x21a/0x530 net/sched/act_ctinfo.c:236
    [<0000000086952cca>] tcf_action_init_1+0x400/0x5b0 net/sched/act_api.c:944
    [<000000005ab29bf8>] tcf_action_init+0x135/0x1c0 net/sched/act_api.c:1000
    [<00000000392f56f9>] tcf_action_add+0x9a/0x200 net/sched/act_api.c:1410
    [<0000000088f3c5dd>] tc_ctl_action+0x14d/0x1bb net/sched/act_api.c:1465
    [<000000006b39d986>] rtnetlink_rcv_msg+0x178/0x4b0 net/core/rtnetlink.c:5424
    [<00000000fd6ecace>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477
    [<0000000047493d02>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5442
    [<00000000bdcf8286>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
    [<00000000bdcf8286>] netlink_unicast+0x223/0x310 net/netlink/af_netlink.c:1328
    [<00000000fc5b92d9>] netlink_sendmsg+0x2c0/0x570 net/netlink/af_netlink.c:1917
    [<00000000da84d076>] sock_sendmsg_nosec net/socket.c:639 [inline]
    [<00000000da84d076>] sock_sendmsg+0x54/0x70 net/socket.c:659
    [<0000000042fb2eee>] ____sys_sendmsg+0x2d0/0x300 net/socket.c:2330
    [<000000008f23f67e>] ___sys_sendmsg+0x8a/0xd0 net/socket.c:2384
    [<00000000d838e4f6>] __sys_sendmsg+0x80/0xf0 net/socket.c:2417
    [<00000000289a9cb1>] __do_sys_sendmsg net/socket.c:2426 [inline]
    [<00000000289a9cb1>] __se_sys_sendmsg net/socket.c:2424 [inline]
    [<00000000289a9cb1>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2424

Fixes: 24ec483cec ("net: sched: Introduce act_ctinfo action")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Kevin 'ldir' Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Kevin 'ldir' Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-19 16:02:15 +01:00
Olof Johansson
fc585d4a5c riscv: Less inefficient gcc tishift helpers (and export their symbols)
The existing __lshrti3 was really inefficient, and the other two helpers
are also needed to compile some modules.

Add the missing versions, and export all of the symbols like arm64
already does.

This code is based on the assembly generated by libgcc builds.

This fixes a build break triggered by ubsan:

riscv64-unknown-linux-gnu-ld: lib/ubsan.o: in function `.L2':
ubsan.c:(.text.unlikely+0x38): undefined reference to `__ashlti3'
riscv64-unknown-linux-gnu-ld: ubsan.c:(.text.unlikely+0x42): undefined reference to `__ashrti3'

Signed-off-by: Olof Johansson <olof@lixom.net>
[paul.walmsley@sifive.com: use SYM_FUNC_{START,END} instead of
 ENTRY/ENDPROC; note libgcc origin]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2020-01-18 19:13:41 -08:00
Linus Torvalds
8f8972a312 Raw NAND:
* GPMI: Fix the suspend/resume
 
 SPI-NOR:
 * Fix quad enable on Spansion like flashes
 * Fix selection of 4-byte addressing opcodes on Spansion
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEE9HuaYnbmDhq/XIDIJWrqGEe9VoQFAl4jghcACgkQJWrqGEe9
 VoT7JAf9HTmCV03HW7/T46N+HxgRDwpMyHqxir6gEAHRlg7YnrsY6wLsChwtr1HQ
 CqXtNT7FWm5KxSSTNy+m2XbMOIu9CRsymuAWQRNFlSor90o/UjJGyGSqeXZebvlL
 nITGe089SfprmaPI9MAYeL5QdhfqX1Z6BBIauycu+9zeSEDlEMOGfPZ9GEGqxJD4
 KDzdTTHIznIV/esKuc38w+mv/pVRoYdc/DVrE6MwGTtC8MhPgIYymENk7Gna3EAR
 HxGhk+blCVCkgYKhMOTthpR/QMrCKkQrRlaaVO2EJALUr/b6fi1D2QpfhdFJILXA
 0Ku7JlBZ7lMnaKa1ty+wf9tyOYNHLA==
 =2wqK
 -----END PGP SIGNATURE-----

Merge tag 'mtd/fixes-for-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux

Pull MTD fixes from Miquel Raynal:
 "Raw NAND:
   - GPMI: Fix the suspend/resume

  SPI-NOR:
   - Fix quad enable on Spansion like flashes
   - Fix selection of 4-byte addressing opcodes on Spansion"

* tag 'mtd/fixes-for-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: gpmi: Restore nfc timing setup after suspend/resume
  mtd: rawnand: gpmi: Fix suspend/resume problem
  mtd: spi-nor: Fix quad enable for Spansion like flashes
  mtd: spi-nor: Fix selection of 4-byte addressing opcodes on Spansion
2020-01-18 16:34:17 -08:00
Linus Torvalds
244dc26890 drm fixes for 5.5-rc7
core mst:
 - serialize down messages and clear timeslots are on unplug
 
 amdgpu:
 - Update golden settings for renoir
 - eDP fix
 
 i915:
 - uAPI fix: Remove dash and colon from PMU names to comply with tools/perf
 - Fix for include file that was indirectly included
 - Two fixes to make sure VMA are marked active for error capture
 
 virtio:
 - maintain obj reservation lock when submitting cmds
 
 rockchip:
 - increase link rate var size to accommodate rates
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJeI3ZMAAoJEAx081l5xIa+iuUQAIIMTL86+QPr3CSu8fU7albJ
 MQWOzycWn0xP0KNYIvKz2X6dePWAEDi2kMM+6M/htzi32XmyG0oiAat2wAGkK3ir
 AtfOnbKdD6T0i5wwkAn8SvNYHOTHdg7HMKTzPw2rxlTlZ8IEdVxo1CZqHZMFg0PV
 UiK2VWVBPmBwdRgLu5K8XJy5LKbpx9BMWW5KdS96L2lKNVV3qeH3Q7NvAh1PtJp4
 W+83OsYxYNwS0//4v0i0MjCGVXq5qVw7tzplAPW2OpUCMZ4TtVUnW2/RcrqvU9/b
 v18M7thf2uccMXs7E1gyHgvrZp2lw6KsyJEpPnIp4k+2OXM0BVe9l4Pl0uUQhrDH
 EUvCkpmP1/Zu44CJhtmVAZlVyy3ShNY5NeF6OsCZDSL7z60YMZHFVKF1IgrBkOJF
 mXXq4eb0rMh3At6ABNliRo/bzEYfGgVy1xxNzOoTKCp6pNK/OGy4xQ/DEAkZbBBx
 ul6E+NjaHhul1daWqpuv37oxsjE0RI7asFBomC5HTNWVZHY8MN8+9p/TTZQJ8n0l
 yxMlPy2OOWVfwclwRSLant3Ny1hXNysMpWXMtrEtcB3Q5GI7H4ZCNh9b+VUPLjXT
 JPZU2ZSIjrYNBeKYjVGV75HKjyMWJv1aB3kiXq7lX1yFq7wQLd5nE5mjTSD9wr1I
 ZnhFSbHLcR7VMWUldPPi
 =Vwa3
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2020-01-19' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Back from LCA2020, fixes wasn't too busy last week, seems to have
  quieten down appropriately, some amdgpu, i915, then a core mst fix and
  one fix for virtio-gpu and one for rockchip:

  core mst:
   - serialize down messages and clear timeslots are on unplug

  amdgpu:
   - Update golden settings for renoir
   - eDP fix

  i915:
   - uAPI fix: Remove dash and colon from PMU names to comply with
     tools/perf
   - Fix for include file that was indirectly included
   - Two fixes to make sure VMA are marked active for error capture

  virtio:
   - maintain obj reservation lock when submitting cmds

  rockchip:
   - increase link rate var size to accommodate rates"

* tag 'drm-fixes-2020-01-19' of git://anongit.freedesktop.org/drm/drm:
  drm/amd/display: Reorder detect_edp_sink_caps before link settings read.
  drm/amdgpu: update goldensetting for renoir
  drm/dp_mst: Have DP_Tx send one msg at a time
  drm/dp_mst: clear time slots for ports invalid
  drm/i915/pmu: Do not use colons or dashes in PMU names
  drm/rockchip: fix integer type used for storing dp data rate
  drm/i915/gt: Mark ring->vma as active while pinned
  drm/i915/gt: Mark context->state vma as active while pinned
  drm/i915/gt: Skip trying to unbind in restore_ggtt_mappings
  drm/i915: Add missing include file <linux/math64.h>
  drm/virtio: add missing virtio_gpu_array_lock_resv call
2020-01-18 13:57:31 -08:00
Ilie Halip
95f4d9cced riscv: delete temporary files
Temporary files used in the VDSO build process linger on even after make
mrproper: vdso-dummy.o.tmp, vdso.so.dbg.tmp.

Delete them once they're no longer needed.

Signed-off-by: Ilie Halip <ilie.halip@gmail.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2020-01-18 13:22:13 -08:00
Linus Torvalds
0cc2682d8b Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes:

   - a resctrl fix for uninitialized objects found by debugobjects

   - a resctrl memory leak fix

   - fix the unintended re-enabling of the of SME and SEV CPU flags if
     memory encryption was disabled at bootup via the MSR space"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/CPU/AMD: Ensure clearing of SME/SEV features is maintained
  x86/resctrl: Fix potential memory leak
  x86/resctrl: Fix an imbalance in domain_remove_cpu()
2020-01-18 13:02:12 -08:00
Linus Torvalds
7ff15cd045 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
 "Three fixes: fix link failure on Alpha, fix a Sparse warning and
  annotate/robustify a lockless access in the NOHZ code"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick/sched: Annotate lockless access to last_jiffies_update
  lib/vdso: Make __cvdso_clock_getres() static
  time/posix-stubs: Provide compat itimer supoprt for alpha
2020-01-18 13:00:59 -08:00
Linus Torvalds
9e79c52332 Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull cpu/SMT fix from Ingo Molnar:
 "Fix a build bug on CONFIG_HOTPLUG_SMT=y && !CONFIG_SYSFS kernels"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/SMT: Fix x86 link error without CONFIG_SYSFS
2020-01-18 12:57:41 -08:00
Linus Torvalds
a186c112c7 Merge branch 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 RAS fix from Ingo Molnar:
 "Fix a thermal throttling race that can result in easy to trigger boot
  crashes on certain Ice Lake platforms"

* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce/therm_throt: Do not access uninitialized therm_work
2020-01-18 12:56:36 -08:00
Linus Torvalds
b07b9e8d63 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Tooling fixes, three Intel uncore driver fixes, plus an AUX events fix
  uncovered by the perf fuzzer"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/uncore: Remove PCIe3 unit for SNR
  perf/x86/intel/uncore: Fix missing marker for snr_uncore_imc_freerunning_events
  perf/x86/intel/uncore: Add PCI ID of IMC for Xeon E3 V5 Family
  perf: Correctly handle failed perf_get_aux_event()
  perf hists: Fix variable name's inconsistency in hists__for_each() macro
  perf map: Set kmap->kmaps backpointer for main kernel map chunks
  perf report: Fix incorrectly added dimensions as switch perf data file
  tools lib traceevent: Fix memory leakage in filter_event
2020-01-18 12:55:19 -08:00
Linus Torvalds
124b5547ec Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
 "Three fixes:

    - Fix an rwsem spin-on-owner crash, introduced in v5.4

    - Fix a lockdep bug when running out of stack_trace entries,
      introduced in v5.4

    - Docbook fix"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rwsem: Fix kernel crash when spinning on RWSEM_OWNER_UNKNOWN
  futex: Fix kernel-doc notation warning
  locking/lockdep: Fix buffer overrun problem in stack_trace[]
2020-01-18 12:53:28 -08:00
Linus Torvalds
a1c6f87efc Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
 "Fix a recent regression in the Ingenic SoCs irqchip driver that floods
  the syslog"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/ingenic: Get rid of the legacy IRQ domain
2020-01-18 12:52:18 -08:00
Linus Torvalds
e2f73d1e52 Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
 "Three EFI fixes:

   - Fix a slow-boot-scrolling regression but making sure we use WC for
     EFI earlycon framebuffer mappings on x86

   - Fix a mixed EFI mode boot crash

   - Disable paging explicitly before entering startup_32() in mixed
     mode bootup"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efistub: Disable paging at mixed mode entry
  efi/libstub/random: Initialize pointer variables to zero for mixed mode
  efi/earlycon: Fix write-combine mapping on x86
2020-01-18 12:50:14 -08:00
Linus Torvalds
ba0f472203 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull rseq fixes from Ingo Molnar:
 "Two rseq bugfixes:

   - CLONE_VM !CLONE_THREAD didn't work properly, the kernel would end
     up corrupting the TLS of the parent. Technically a change in the
     ABI but the previous behavior couldn't resonably have been relied
     on by applications so this looks like a valid exception to the ABI
     rule.

   - Make the RSEQ_FLAG_UNREGISTER ABI behavior consistent with the
     handling of other flags. This is not thought to impact any
     applications either"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rseq: Unregister rseq for clone CLONE_VM
  rseq: Reject unknown flags on rseq unregister
2020-01-18 12:29:13 -08:00
Linus Torvalds
8cac89909a for-linus-2020-01-18
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXiL/qwAKCRCRxhvAZXjc
 oln5AP9ITypHs2iNWl1Cbte++y2iflWevDyPUrmagegqpKwbJAD9EypY0RVDor8T
 LXWK4WaNgB0K0MK/gSPRAlgx9ejNwA4=
 =6xXo
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-2020-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull thread fixes from Christian Brauner:
 "Here is an urgent fix for ptrace_may_access() permission checking.

  Commit 69f594a389 ("ptrace: do not audit capability check when
  outputing /proc/pid/stat") introduced the ability to opt out of audit
  messages for accesses to various proc files since they are not
  violations of policy.

  While doing so it switched the check from ns_capable() to
  has_ns_capability{_noaudit}(). That means it switched from checking
  the subjective credentials (ktask->cred) of the task to using the
  objective credentials (ktask->real_cred). This is appears to be wrong.
  ptrace_has_cap() is currently only used in ptrace_may_access() And is
  used to check whether the calling task (subject) has the
  CAP_SYS_PTRACE capability in the provided user namespace to operate on
  the target task (object). According to the cred.h comments this means
  the subjective credentials of the calling task need to be used.

  With this fix we switch ptrace_has_cap() to use security_capable() and
  thus back to using the subjective credentials.

  As one example where this might be particularly problematic, Jann
  pointed out that in combination with the upcoming IORING_OP_OPENAT{2}
  feature, this bug might allow unprivileged users to bypass the
  capability checks while asynchronously opening files like /proc/*/mem,
  because the capability checks for this would be performed against
  kernel credentials.

  To illustrate on the former point about this being exploitable: When
  io_uring creates a new context it records the subjective credentials
  of the caller. Later on, when it starts to do work it creates a kernel
  thread and registers a callback. The callback runs with kernel creds
  for ktask->real_cred and ktask->cred.

  To prevent this from becoming a full-blown 0-day io_uring will call
  override_cred() and override ktask->cred with the subjective
  credentials of the creator of the io_uring instance. With
  ptrace_has_cap() currently looking at ktask->real_cred this override
  will be ineffective and the caller will be able to open arbitray proc
  files as mentioned above.

  Luckily, this is currently not exploitable but would be so once
  IORING_OP_OPENAT{2} land in v5.6. Let's fix it now.

  To minimize potential regressions I successfully ran the criu
  testsuite. criu makes heavy use of ptrace() and extensively hits
  ptrace_may_access() codepaths and has a good change of detecting any
  regressions.

  Additionally, I succesfully ran the ptrace and seccomp kernel tests"

* tag 'for-linus-2020-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  ptrace: reintroduce usage of subjective credentials in ptrace_has_cap()
2020-01-18 12:23:31 -08:00
Linus Torvalds
2324de6fab s390 updates for 5.5-rc7
- Fix printing misleading Secure-IPL enabled message when it is not.
 
 - Fix a race condition between host ap bus and guest ap bus doing
   device reset in crypto code.
 
 - Fix sanity check in CCA cipher key function (CCA AES cipher key
   support), which fails otherwise.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl4i+UgACgkQjYWKoQLX
 FBg0Pgf/VYdOoZCo868B6H7jR3xbPmaMz5wA57QfbPhmUkdF7ELBSDj7j+dvfmEy
 kb9Tkn2cZOj40H0jwKWawnz5HA+iKtkXBcB8THB55ThlD6hgpZo8nzul1EomcxDx
 NY7vxlaIpgWZgpT91Z1JxKOLpxIo0mIUH8+XITgZBU4qV9yrycpePzV7kz5vEieG
 9BFObo0VtNYBnNwm68YuXsg1FKkezWktFAmdLHvxQz6wXUL5kBb/1dZkKGkg/0Qc
 jsJSglY2I2uCqDesIRHxl6waWgj79nrb3T5WWYh9IL6btXfAFDTNfx7WaCWU55kd
 pCfViqDJYL3idW8AfPkgXNaSclWtaQ==
 =q9ev
 -----END PGP SIGNATURE-----

Merge tag 's390-5.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Vasily Gorbik:

 - Fix printing misleading Secure-IPL enabled message when it is not.

 - Fix a race condition between host ap bus and guest ap bus doing
   device reset in crypto code.

 - Fix sanity check in CCA cipher key function (CCA AES cipher key
   support), which fails otherwise.

* tag 's390-5.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/setup: Fix secure ipl message
  s390/zcrypt: move ap device reset from bus to driver code
  s390/zcrypt: Fix CCA cipher key gen with clear key value function
2020-01-18 12:18:55 -08:00
Linus Torvalds
8965de70cb SCSI fixes on 20200118
Three fixes in drivers with no impact to core code. The mptfusion fix
 is enormous because the driver API had to be rethreaded to pass down
 the necessary iocp pointer, but once that's done a significant chunk
 of code is deleted.  The other two patches are small.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXiNL7iYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishd6OAP0dPFlB
 Nd8g0PR7OV4L6DpTcR2v6gaYZRoLeq0+p3b7yQD9EL5LiESIciLV2nk8pMoQVEqg
 i7Pz5L2RJnLDRoUgIgU=
 =YvR2
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Three fixes in drivers with no impact to core code.

  The mptfusion fix is enormous because the driver API had to be
  rethreaded to pass down the necessary iocp pointer, but once that's
  done a significant chunk of code is deleted.

  The other two patches are small"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: mptfusion: Fix double fetch bug in ioctl
  scsi: storvsc: Correctly set number of hardware queues for IDE disk
  scsi: fnic: fix invalid stack access
2020-01-18 12:12:36 -08:00