Commit Graph

1042671 Commits

Author SHA1 Message Date
Christoph Hellwig
f93a181af4 bvec: add memcpy_{from,to}_bvec and memzero_bvec helper
Add helpers to perform common memory operation on a bvec.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20210727055646.118787-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-02 13:37:27 -06:00
Christoph Hellwig
e6e7471706 bvec: add a bvec_kmap_local helper
Add a helper to call kmap_local_page on a bvec.  There is no need for
an unmap helper given that kunmap_local accept any address in the mapped
page.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20210727055646.118787-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-02 13:37:27 -06:00
Christoph Hellwig
e45cef51db bvec: fix the include guards for bvec.h
Fix the include guards to match the file naming.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20210727055646.118787-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-02 13:37:27 -06:00
Christoph Hellwig
4c7251e1b5 MIPS: don't include <linux/genhd.h> in <asm/mach-rc32434/rb.h>
There is no need to include genhd.h from a random arch header, and not
doing so prevents the possibility for nasty include loops.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20210727055646.118787-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-02 13:37:27 -06:00
Oliver Hartkopp
06447ae5e3 ioprio: move user space relevant ioprio bits to UAPI includes
systemd added a modified copy of include/linux/ioprio.h into its
code to get the relevant content definitions for the exposed
ioprio_[get|set] system calls.

Move the user space relevant ioprio bits to the UAPI includes to be
able to use the ioprio_[get|set] syscalls as intended.

Cc: Kay Sievers <kay@vrfy.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/r/20210714195655.181943-1-socketcan@hartkopp.net
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-02 13:37:27 -06:00
Zack Rusin
e89afb51f9 drm/vmwgfx: Fix a 64bit regression on svga3
Register accesses are always 4bytes, accidently this was changed to
a void pointer whwqich badly breaks 64bit archs when running on top
of svga3.

Fixes: 2cd80dbd35 ("drm/vmwgfx: Add basic support for SVGA3")
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Martin Krastev <krastevm@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210615182336.995192-3-zackr@vmware.com
(cherry picked from commit 87360168759879d68550b0c052bbcc2a0339ff74)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2021-08-02 21:00:37 +02:00
Laibin Qiu
dc675a9712 f2fs: fix min_seq_blocks can not make sense in some scenes.
F2FS have dirty page count control for batched sequential
write in writepages, and get the value of min_seq_blocks by
blocks_per_seg * segs_per_sec(segs_per_sec defaults to 1).
But in some scenes we set a lager section size, Min_seq_blocks
will become too large to achieve the expected effect(eg. 4thread
sequential write, the number of merge requests will be reduced).

Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-02 11:24:26 -07:00
Chao Yu
2787991516 f2fs: fix to force keeping write barrier for strict fsync mode
[1] https://www.mail-archive.com/linux-f2fs-devel@lists.sourceforge.net/msg15126.html

As [1] reported, if lower device doesn't support write barrier, in below
case:

- write page #0; persist
- overwrite page #0
- fsync
 - write data page #0 OPU into device's cache
 - write inode page into device's cache
 - issue flush

If SPO is triggered during flush command, inode page can be persisted
before data page #0, so that after recovery, inode page can be recovered
with new physical block address of data page #0, however there may
contains dummy data in new physical block address.

Then what user will see is: after overwrite & fsync + SPO, old data in
file was corrupted, if any user do care about such case, we can suggest
user to use STRICT fsync mode, in this mode, we will force to use atomic
write sematics to keep write order in between data/node and last node,
so that it avoids potential data corruption during fsync().

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-02 11:24:26 -07:00
Chao Yu
277afbde6c f2fs: fix wrong checkpoint_changed value in f2fs_remount()
In f2fs_remount(), return value of test_opt() is an unsigned int type
variable, however when we compare it to a bool type variable, it cause
wrong result, fix it.

Fixes: 4354994f09 ("f2fs: checkpoint disabling")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-02 11:24:26 -07:00
Jaegeuk Kim
2e650912c0 f2fs: show sbi status in debugfs/f2fs/status
We need to get sbi->s_flag to understand the current f2fs status as well.
One example is SBI_NEED_FSCK.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-02 11:24:26 -07:00
Daeho Jeong
4931e0c93e f2fs: turn back remapped address in compressed page endio
Turned back the remmaped sector address to the address in the partition,
when ending io, for compress cache to work properly.

Fixes: 6ce19aff0b ("f2fs: compress: add compress_inode to cache
compressed blocks")
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Youngjin Gil <youngjin.gil@samsung.com>
Signed-off-by: Hyeong Jun Kim <hj514.kim@samsung.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-02 11:24:25 -07:00
Daeho Jeong
093f0bac32 f2fs: change fiemap way in printing compression chunk
When we print out a discontinuous compression chunk, it shows like a
continuous chunk now. To show it more correctly, I've changed the way of
printing fiemap info like below. Plus, eliminated NEW_ADDR(-1) in fiemap
info, since it is not in fiemap user api manual.

Let's assume 16KB compression cluster.

<before>
   Logical          Physical         Length           Flags
0:  0000000000000000 00000002c091f000 0000000000004000 1008
1:  0000000000004000 00000002c0920000 0000000000004000 1008
  ...
9:  0000000000034000 0000000f8c623000 0000000000004000 1008
10: 0000000000038000 000000101a6eb000 0000000000004000 1008

<after>
0:  0000000000000000 00000002c091f000 0000000000004000 1008
1:  0000000000004000 00000002c0920000 0000000000004000 1008
  ...
9:  0000000000034000 0000000f8c623000 0000000000001000 1008
10: 0000000000035000 000000101a6ea000 0000000000003000 1008
11: 0000000000038000 000000101a6eb000 0000000000002000 1008
12: 000000000003a000 00000002c3544000 0000000000002000 1008

Flags
0x1000 => FIEMAP_EXTENT_MERGED
0x0008 => FIEMAP_EXTENT_ENCODED

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Tested-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-02 11:24:25 -07:00
Jaegeuk Kim
b7ec206173 f2fs: do not submit NEW_ADDR to read node block
After the below patch, give cp is errored, we drop dirty node pages. This
can give NEW_ADDR to read node pages. Don't do WARN_ON() which gives
generic/475 failure.

Fixes: 28607bf3aa ("f2fs: drop dirty node pages when cp is in error status")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-02 11:24:25 -07:00
Fengnan Chang
7eab7a6968 f2fs: compress: remove unneeded read when rewrite whole cluster
when we overwrite the whole page in cluster, we don't need read original
data before write, because after write_end(), writepages() can help to
load left data in that cluster.

Signed-off-by: Fengnan Chang <changfengnan@vivo.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Acked-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2021-08-02 11:24:25 -07:00
Mark Rutland
7672150301 arm64: kasan: mte: remove redundant mte_report_once logic
We have special logic to suppress MTE tag check fault reporting, based
on a global `mte_report_once` and `reported` variables. These can be
used to suppress calling kasan_report() when taking a tag check fault,
but do not prevent taking the fault in the first place, nor does they
affect the way we disable tag checks upon taking a fault.

The core KASAN code already defaults to reporting a single fault, and
has a `multi_shot` control to permit reporting multiple faults. The only
place we transiently alter `mte_report_once` is in lib/test_kasan.c,
where we also the `multi_shot` state as the same time. Thus
`mte_report_once` and `reported` are redundant, and can be removed.

When a tag check fault is taken, tag checking will be disabled by
`do_tag_recovery` and must be explicitly re-enabled if desired. The test
code does this by calling kasan_enable_tagging_sync().

This patch removes the redundant mte_report_once() logic and associated
variables.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20210714143843.56537-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-08-02 18:15:28 +01:00
Mark Rutland
8286824789 arm64: kasan: mte: use a constant kernel GCR_EL1 value
When KASAN_HW_TAGS is selected, KASAN is enabled at boot time, and the
hardware supports MTE, we'll initialize `kernel_gcr_excl` with a value
dependent on KASAN_TAG_MAX. While the resulting value is a constant
which depends on KASAN_TAG_MAX, we have to perform some runtime work to
generate the value, and have to read the value from memory during the
exception entry path. It would be better if we could generate this as a
constant at compile-time, and use it as such directly.

Early in boot within __cpu_setup(), we initialize GCR_EL1 to a safe
value, and later override this with the value required by KASAN. If
CONFIG_KASAN_HW_TAGS is not selected, or if KASAN is disabeld at boot
time, the kernel will not use IRG instructions, and so the initial value
of GCR_EL1 is does not matter to the kernel. Thus, we can instead have
__cpu_setup() initialize GCR_EL1 to a value consistent with
KASAN_TAG_MAX, and avoid the need to re-initialize it during hotplug and
resume form suspend.

This patch makes arem64 use a compile-time constant KERNEL_GCR_EL1
value, which is compatible with KASAN_HW_TAGS when this is selected.
This removes the need to re-initialize GCR_EL1 dynamically, and acts as
an optimization to the entry assembly, which no longer needs to load
this value from memory. The redundant initialization hooks are removed.

In order to do this, KASAN_TAG_MAX needs to be visible outside of the
core KASAN code. To do this, I've moved the KASAN_TAG_* values into
<linux/kasan-tags.h>.

There should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@gmail.com>
Link: https://lore.kernel.org/r/20210714143843.56537-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-08-02 18:14:21 +01:00
Marc Zyngier
d21faba116 PCI: Bulk conversion to generic_handle_domain_irq()
Wherever possible, replace constructs that match either
generic_handle_irq(irq_find_mapping()) or
generic_handle_irq(irq_linear_revmap()) to a single call to
generic_handle_domain_irq().

Link: https://lore.kernel.org/r/20210802162630.2219813-4-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
2021-08-02 11:53:05 -05:00
Mark Brown
b24b520509 arm64/sve: Make fpsimd_bind_task_to_cpu() static
This function is not referenced outside fpsimd.c so can be static, making
it that little bit easier to follow what is called from where.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210730165846.18558-1-broonie@kernel.org
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-08-02 17:48:16 +01:00
Colin Ian King
1187c8c464 net: phy: mscc: make some arrays static const, makes object smaller
Don't populate arrays on the stack but instead them static const.
Makes the object code smaller by 280 bytes.

Before:
   text    data     bss     dec     hex filename
  24142    4368     192   28702    701e ./drivers/net/phy/mscc/mscc_ptp.o

After:
   text    data     bss     dec     hex filename
  23830    4400     192   28422    6f06 ./drivers/net/phy/mscc/mscc_ptp.o

(gcc version 10.2.0)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210801070155.139057-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-02 09:15:07 -07:00
Colin Ian King
f36c82ac1b netdevsim: make array res_ids static const, makes object smaller
Don't populate the array res_ids on the stack but instead it
static const. Makes the object code smaller by 14 bytes.

Before:
   text    data     bss     dec     hex filename
  50833    8314     256   59403    e80b ./drivers/net/netdevsim/fib.o

After:
   text    data     bss     dec     hex filename
  50755    8378     256   59389    e7fd ./drivers/net/netdevsim/fib.o

(gcc version 10.2.0)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210801065328.138906-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-02 09:12:24 -07:00
Greg Kroah-Hartman
232eee380e Merge tag 'fpga-fixes-for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga into char-misc-linus
Moritz writes:

FPGA Manager fix for 5.14

Kajol's fix adds a missing pmu_migrate_context() call which presents a
problem if the CPU collecting FME PMU data is taken offline.

All patches have been reviewed on the mailing list, and have been in the
last few linux-next releases (as part of my fixes branch) without issues.

Signed-off-by: Moritz Fischer <mdf@kernel.org>

* tag 'fpga-fixes-for-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/mdf/linux-fpga:
  fpga: dfl: fme: Fix cpu hotplug issue in performance reporting
2021-08-02 18:07:23 +02:00
Bob Pearson
ef4b96a577 RDMA/rxe: Restore setting tot_len in the IPv4 header
An earlier patch removed setting of tot_len in IPv4 headers because it was
also set in ip_local_out. However, this change resulted in an incorrect
ICRC being computed because the tot_len field is not masked out. This
patch restores that line. This fixes the bug reported by Zhu Yanjun.  This
bug affects anyone using rxe which is currently broken.

Fixes: 230bb836ee ("RDMA/rxe: Fix redundant call to ip_send_check")
Link: https://lore.kernel.org/r/20210729220039.18549-3-rpearsonhpe@gmail.com
Reported-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Reviewed-and-tested-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-02 12:45:22 -03:00
Bob Pearson
e2a05339fa RDMA/rxe: Use the correct size of wqe when processing SRQ
The memcpy() that copies a WQE from a SRQ the QP uses an incorrect size.
The size should have been the size of the rxe_send_wqe struct not the size
of a pointer to it. The result is that IO operations using a SRQ on the
responder side will fail.

Fixes: ec0fa2445c ("RDMA/rxe: Fix over copying in get_srq_wqe")
Link: https://lore.kernel.org/r/20210729220039.18549-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-02 12:45:22 -03:00
Mike Marciniszyn
db4657afd1 RDMA/cma: Revert INIT-INIT patch
The net/sunrpc/xprtrdma module creates its QP using rdma_create_qp() and
immediately post receives, implicitly assuming the QP is in the INIT state
and thus valid for ib_post_recv().

The patch noted in Fixes: removed the RESET->INIT modifiy from
rdma_create_qp(), breaking NFS rdma for verbs providers that fail the
ib_post_recv() for a bad state.

This situation was proven using kprobes in rvt_post_recv() and
rvt_modify_qp(). The traces showed that the rvt_post_recv() failed before
ANY modify QP and that the current state was RESET.

Fix by reverting the patch below.

Fixes: dc70f7c3ed ("RDMA/cma: Remove unnecessary INIT->INIT transition")
Link: https://lore.kernel.org/r/1627583182-81330-1-git-send-email-mike.marciniszyn@cornelisnetworks.com
Cc: Haakon Bugge <haakon.bugge@oracle.com>
Cc: Chuck Lever III <chuck.lever@oracle.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-02 12:45:22 -03:00
Aharon Landau
d6793ca97b RDMA/mlx5: Delay emptying a cache entry when a new MR is added to it recently
Fixing a typo that causes a cache entry to shrink immediately after adding
to it new MRs if the entry size exceeds the high limit.  In doing so, the
cache misses its purpose to prevent the creation of new mkeys on the
runtime by using the cached ones.

Fixes: b9358bdbc7 ("RDMA/mlx5: Fix locking in MR cache work queue")
Link: https://lore.kernel.org/r/fcb546986be346684a016f5ca23a0567399145fa.1627370131.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-02 12:45:22 -03:00
Matt Roper
82929a2140 drm/i915: Correct SFC_DONE register offset
The register offset for SFC_DONE was missing a '0' at the end, causing
us to read from a non-existent register address.  We only use this
register in error state dumps so the mistake hasn't caused any real
problems, but fixing it will hopefully make the error state dumps a bit
more useful for debugging.

Fixes: e50dbdbfd9 ("drm/i915/tgl: Add SFC instdone to error state")
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210728233411.2365788-1-matthew.d.roper@intel.com
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2021-08-02 08:29:11 -07:00
Matthias Schiffer
9b87f43537 gpio: tqmx86: really make IRQ optional
The tqmx86 MFD driver was passing IRQ 0 for "no IRQ" in the past. This
causes warnings with newer kernels.

Prepare the gpio-tqmx86 driver for the fixed MFD driver by handling a
missing IRQ properly.

Fixes: b868db94a6 ("gpio: tqmx86: Add GPIO from for this IO controller")
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-08-02 17:17:27 +02:00
Joerg Roedel
47a70bea54 iommu/amd: Remove stale amd_iommu_unmap_flush usage
Remove the new use of the variable introduced in the AMD driver branch.
The variable was removed already in the iommu core branch, causing build
errors when the brances are merged.

Cc: Nadav Amit <namit@vmware.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Link: https://lore.kernel.org/r/20210802150643.3634-1-joro@8bytes.org
2021-08-02 17:07:39 +02:00
Rajat Jain
1651d9e781 thunderbolt: Add authorized value to the KOBJ_CHANGE uevent
For security reasons, we would like to monitor and track when the
Thunderbolt devices are authorized and deauthorized (i.e. when the
Thunderbolt sysfs "authorized" attribute changes). Currently the
userspace gets a udev change notification when there is a change, but
the state may have changed (again) by the time we look at the authorized
attribute in sysfs. So an authorization event may go unnoticed. Thus
make it easier by informing the actual change (new value of authorized
attribute) in the udev change notification.

The change is included as a key value "authorized=<val>" where <val>
is the new value of sysfs attribute "authorized", and is described at
Documentation/ABI/testing/sysfs-bus-thunderbolt under
/sys/bus/thunderbolt/devices/.../authorized.

Signed-off-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2021-08-02 18:03:36 +03:00
mark-yw.chen
654e6f7700 Bluetooth: btusb: Enable MSFT extension for Mediatek Chip (MT7921)
The Mdiatek MT7921(7961) support MSFT HCI extensions, we are using
0xFD30 for VsMsftOpCode.

Signed-off-by: mark-yw.chen <mark-yw.chen@mediatek.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-08-02 17:03:14 +02:00
Paolo Bonzini
db105fab8d KVM: nSVM: remove useless kvm_clear_*_queue
For an event to be in injected state when nested_svm_vmrun executes,
it must have come from exitintinfo when svm_complete_interrupts ran:

  vcpu_enter_guest
   static_call(kvm_x86_run) -> svm_vcpu_run
    svm_complete_interrupts
     // now the event went from "exitintinfo" to "injected"
   static_call(kvm_x86_handle_exit) -> handle_exit
    svm_invoke_exit_handler
      vmrun_interception
       nested_svm_vmrun

However, no event could have been in exitintinfo before a VMRUN
vmexit.  The code in svm.c is a bit more permissive than the one
in vmx.c:

        if (is_external_interrupt(svm->vmcb->control.exit_int_info) &&
            exit_code != SVM_EXIT_EXCP_BASE + PF_VECTOR &&
            exit_code != SVM_EXIT_NPF && exit_code != SVM_EXIT_TASK_SWITCH &&
            exit_code != SVM_EXIT_INTR && exit_code != SVM_EXIT_NMI)

but in any case, a VMRUN instruction would not even start to execute
during an attempted event delivery.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:02:00 -04:00
Sean Christopherson
4c72ab5aa6 KVM: x86: Preserve guest's CR0.CD/NW on INIT
Preserve CR0.CD and CR0.NW on INIT instead of forcing them to '1', as
defined by both Intel's SDM and AMD's APM.

Note, current versions of Intel's SDM are very poorly written with
respect to INIT behavior.  Table 9-1. "IA-32 and Intel 64 Processor
States Following Power-up, Reset, or INIT" quite clearly lists power-up,
RESET, _and_ INIT as setting CR0=60000010H, i.e. CD/NW=1.  But the SDM
then attempts to qualify CD/NW behavior in a footnote:

  2. The CD and NW flags are unchanged, bit 4 is set to 1, all other bits
     are cleared.

Presumably that footnote is only meant for INIT, as the RESET case and
especially the power-up case are rather non-sensical.  Another footnote
all but confirms that:

  6. Internal caches are invalid after power-up and RESET, but left
     unchanged with an INIT.

Bare metal testing shows that CD/NW are indeed preserved on INIT (someone
else can hack their BIOS to check RESET and power-up :-D).

Reported-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-47-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:02:00 -04:00
Sean Christopherson
46f4898b20 KVM: SVM: Drop redundant clearing of vcpu->arch.hflags at INIT/RESET
Drop redundant clears of vcpu->arch.hflags in init_vmcb() since
kvm_vcpu_reset() always clears hflags, and it is also always
zero at vCPU creation time.  And of course, the second clearing
in init_vmcb() was always redundant.

Suggested-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-46-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:59 -04:00
Sean Christopherson
265e43530c KVM: SVM: Emulate #INIT in response to triple fault shutdown
Emulate a full #INIT instead of simply initializing the VMCB if the
guest hits a shutdown.  Initializing the VMCB but not other vCPU state,
much of which is mirrored by the VMCB, results in incoherent and broken
vCPU state.

Ideally, KVM would not automatically init anything on shutdown, and
instead put the vCPU into e.g. KVM_MP_STATE_UNINITIALIZED and force
userspace to explicitly INIT or RESET the vCPU.  Even better would be to
add KVM_MP_STATE_SHUTDOWN, since technically NMI can break shutdown
(and SMI on Intel CPUs).

But, that ship has sailed, and emulating #INIT is the next best thing as
that has at least some connection with reality since there exist bare
metal platforms that automatically INIT the CPU if it hits shutdown.

Fixes: 46fe4ddd9d ("[PATCH] KVM: SVM: Propagate cpu shutdown events to userspace")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-45-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:59 -04:00
Sean Christopherson
e54949408a KVM: VMX: Move RESET-only VMWRITE sequences to init_vmcs()
Move VMWRITE sequences in vmx_vcpu_reset() guarded by !init_event into
init_vmcs() to make it more obvious that they're, uh, initializing the
VMCS.

No meaningful functional change intended (though the order of VMWRITEs
and whatnot is different).

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-44-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:59 -04:00
Sean Christopherson
7aa13fc3d8 KVM: VMX: Remove redundant write to set vCPU as active at RESET/INIT
Drop a call to vmx_clear_hlt() during vCPU INIT, the guest's activity
state is unconditionally set to "active" a few lines earlier in
vmx_vcpu_reset().

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-43-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:59 -04:00
Sean Christopherson
84ec8d2d53 KVM: VMX: Smush x2APIC MSR bitmap adjustments into single function
Consolidate all of the dynamic MSR bitmap adjustments into
vmx_update_msr_bitmap_x2apic(), and rename the mode tracker to reflect
that it is x2APIC specific.  If KVM gains more cases of dynamic MSR
pass-through, odds are very good that those new cases will be better off
with their own logic, e.g. see Intel PT MSRs and MSR_IA32_SPEC_CTRL.

Attempting to handle all updates in a common helper did more harm than
good, as KVM ended up collecting a large number of useless "updates".

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-42-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:58 -04:00
Sean Christopherson
e7c701dd7a KVM: VMX: Remove unnecessary initialization of msr_bitmap_mode
Don't bother initializing msr_bitmap_mode to 0, all of struct vcpu_vmx is
zero initialized.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-41-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:58 -04:00
Sean Christopherson
002f87a41e KVM: VMX: Don't redo x2APIC MSR bitmaps when userspace filter is changed
Drop an explicit call to update the x2APIC MSRs when the userspace MSR
filter is modified.  The x2APIC MSRs are deliberately exempt from
userspace filtering.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-40-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:58 -04:00
Sean Christopherson
284036c644 KVM: nVMX: Remove obsolete MSR bitmap refresh at nested transitions
Drop unnecessary MSR bitmap updates during nested transitions, as L1's
APIC_BASE MSR is not modified by the standard VM-Enter/VM-Exit flows,
and L2's MSR bitmap is managed separately.  In the unlikely event that L1
is pathological and loads APIC_BASE via the VM-Exit load list, KVM will
handle updating the bitmap in its normal WRMSR flows.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-39-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:58 -04:00
Sean Christopherson
9e4784e19d KVM: VMX: Remove obsolete MSR bitmap refresh at vCPU RESET/INIT
Remove an unnecessary MSR bitmap refresh during vCPU RESET/INIT.  In both
cases, the MSR bitmap already has the desired values and state.

At RESET, the vCPU is guaranteed to be running with x2APIC disabled, the
x2APIC MSRs are guaranteed to be intercepted due to the MSR bitmap being
initialized to all ones by alloc_loaded_vmcs(), and vmx->msr_bitmap_mode
is guaranteed to be zero, i.e. reflecting x2APIC disabled.

At INIT, the APIC_BASE MSR is not modified, thus there can't be any
change in x2APIC state.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-38-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:57 -04:00
Sean Christopherson
f39e805ee1 KVM: x86: Move setting of sregs during vCPU RESET/INIT to common x86
Move the setting of CR0, CR4, EFER, RFLAGS, and RIP from vendor code to
common x86.  VMX and SVM now have near-identical sequences, the only
difference being that VMX updates the exception bitmap.  Updating the
bitmap on SVM is unnecessary, but benign.  Unfortunately it can't be left
behind in VMX due to the need to update exception intercepts after the
control registers are set.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-37-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:57 -04:00
Sean Christopherson
c5c9f920f7 KVM: VMX: Don't _explicitly_ reconfigure user return MSRs on vCPU INIT
When emulating vCPU INIT, do not unconditionally refresh the list of user
return MSRs that need to be loaded into hardware when running the guest.
Unconditionally refreshing the list is confusing, as the vast majority of
MSRs are not modified on INIT.  The real motivation is to handle the case
where an INIT during long mode obviates the need to load the SYSCALL MSRs,
and that is handled as needed by vmx_set_efer().

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-36-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:57 -04:00
Sean Christopherson
432979b503 KVM: VMX: Refresh list of user return MSRs after setting guest CPUID
After a CPUID update, refresh the list of user return MSRs that are
loaded into hardware when running the vCPU.  This is necessary to handle
the oddball case where userspace exposes X86_FEATURE_RDTSCP to the guest
after the vCPU is running.

Fixes: 0023ef39dc ("kvm: vmx: Set IA32_TSC_AUX for legacy mode guests")
Fixes: 4e47c7a6d7 ("KVM: VMX: Add instruction rdtscp support for guest")
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-35-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:56 -04:00
Sean Christopherson
400dd54b37 KVM: VMX: Skip pointless MSR bitmap update when setting EFER
Split setup_msrs() into vmx_setup_uret_msrs() and an open coded refresh
of the MSR bitmap, and skip the latter when refreshing the user return
MSRs during an EFER load.  Only the x2APIC MSRs are dynamically exposed
and hidden, and those are not affected by a change in EFER.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-34-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:56 -04:00
Sean Christopherson
d0f9f826d8 KVM: SVM: Stuff save->dr6 at during VMSA sync, not at RESET/INIT
Move code to stuff vmcb->save.dr6 to its architectural init value from
svm_vcpu_reset() into sev_es_sync_vmsa().  Except for protected guests,
a.k.a. SEV-ES guests, vmcb->save.dr6 is set during VM-Enter, i.e. the
extra write is unnecessary.  For SEV-ES, stuffing save->dr6 handles a
theoretical case where the VMSA could be encrypted before the first
KVM_RUN.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-33-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:56 -04:00
Sean Christopherson
6cfe7b83ac KVM: SVM: Drop redundant writes to vmcb->save.cr4 at RESET/INIT
Drop direct writes to vmcb->save.cr4 during vCPU RESET/INIT, as the
values being written are fully redundant with respect to
svm_set_cr4(vcpu, 0) a few lines earlier.  Note, svm_set_cr4() also
correctly forces X86_CR4_PAE when NPT is disabled.

No functional change intended.

Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-32-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:56 -04:00
Sean Christopherson
ef8a0fa59b KVM: SVM: Tweak order of cr0/cr4/efer writes at RESET/INIT
Hoist svm_set_cr0() up in the sequence of register initialization during
vCPU RESET/INIT, purely to match VMX so that a future patch can move the
sequences to common x86.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-31-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:55 -04:00
Sean Christopherson
816be9e9be KVM: nVMX: Don't evaluate "emulation required" on nested VM-Exit
Use the "internal" variants of setting segment registers when stuffing
state on nested VM-Exit in order to skip the "emulation required"
updates.  VM-Exit must always go to protected mode, and all segments are
mostly hardcoded (to valid values) on VM-Exit.  The bits of the segments
that aren't hardcoded are explicitly checked during VM-Enter, e.g. the
selector RPLs must all be zero.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-30-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:55 -04:00
Sean Christopherson
1dd7a4f18f KVM: VMX: Skip emulation required checks during pmode/rmode transitions
Don't refresh "emulation required" when stuffing segments during
transitions to/from real mode when running without unrestricted guest.
The checks are unnecessary as vmx_set_cr0() unconditionally rechecks
"emulation required".  They also happen to be broken, as enter_pmode()
and enter_rmode() run with a stale vcpu->arch.cr0.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210713163324.627647-29-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-02 11:01:55 -04:00