Do mass update of debugfs.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Do mass update of cq.c to reuse newly introduced
mlx5_cmd_exec_in*() interfaces.
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
The KASLR code path in the arm64 version of the EFI stub incorporates
some overly complicated logic to randomly allocate a region of the right
alignment: there is no need to randomize the placement of the kernel
modulo 2 MiB separately from the placement of the 2 MiB aligned allocation
itself - we can simply follow the same logic used by the non-randomized
placement, which is to allocate at the correct alignment, and only take
TEXT_OFFSET into account if it is not a round multiple of the alignment.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
The notion of a 'preferred' load offset for the kernel dates back to the
times when the kernel's primary mapping overlapped with the linear region,
and memory below it could not be used at all.
Today, the arm64 kernel does not really care where it is loaded in physical
memory, as long as the alignment requirements are met, and so there is no
point in unconditionally moving the kernel to a new location in memory at
boot. Instead, we can
- check for a KASLR seed, and randomly reallocate the kernel if one is
provided
- otherwise, check whether the alignment requirements are met for the
current placement of the kernel, and just run it in place if they are
- finally, do an ordinary page allocation and reallocate the kernel to a
suitably aligned buffer anywhere in memory.
By the same reasoning, there is no need to take TEXT_OFFSET into account
if it is a round multiple of the minimum alignment, which is the usual
case for relocatable kernels with TEXT_OFFSET randomization disabled.
Otherwise, it suffices to use the relative misaligment of TEXT_OFFSET
when reallocating the kernel.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
The implementation of efi_random_alloc() arbitrarily truncates the
provided random seed to 16 bits, which limits the granularity of the
randomly chosen allocation offset in memory. This is currently only
an issue if the size of physical memory exceeds 128 GB, but going
forward, we will reduce the allocation alignment to 64 KB, and this
means we need to increase the granularity to ensure that the random
memory allocations are distributed evenly.
We will need to switch to 64-bit arithmetic for the multiplication,
but this does not result in 64-bit integer intrinsic calls on ARM or
on i386.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
The EFI stub uses a per-architecture #define for the minimum base
and size alignment of page allocations, which is set to 4 KB for
all architecures except arm64, which uses 64 KB, to ensure that
allocations can always be (un)mapped efficiently, regardless of
the page size used by the kernel proper, which could be a kexec'ee
The API wrappers around page based allocations assume that this
alignment is always taken into account, and so efi_free() will
also round up its size argument to EFI_ALLOC_ALIGN.
Currently, efi_random_alloc() does not honour this alignment for
the allocated size, and so freeing such an allocation may result
in unrelated memory to be freed, potentially leading to issues
after boot. So let's round up size in efi_random_alloc() as well.
Fixes: 2ddbfc81ea ("efi: stub: add implementation of efi_random_alloc()")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Add the ability to automatically pick the highest resolution video mode
(defined as the product of vertical and horizontal resolution) by using
a command-line argument of the form
video=efifb:auto
If there are multiple modes with the highest resolution, pick one with
the highest color depth.
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Link: https://lore.kernel.org/r/20200328160601.378299-2-nivedita@alum.mit.edu
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
pixel_format must be one of
PIXEL_RGB_RESERVED_8BIT_PER_COLOR
PIXEL_BGR_RESERVED_8BIT_PER_COLOR
PIXEL_BIT_MASK
since we skip PIXEL_BLT_ONLY when finding a gop.
Remove the redundant code and add another check in find_gop to skip any
pixel formats that we don't know about, in case a later version of the
UEFI spec adds one.
Reformat the code a little.
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Link: https://lore.kernel.org/r/20200320020028.1936003-10-nivedita@alum.mit.edu
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
We have wrappers around EFI calls so that x86 can define special
versions for mixed mode, while all other architectures can use the
same simple definition that just issues the call directly.
In preparation for the arrival of yet another architecture that doesn't
need anything special here (RISC-V), let's move the default definition
into a shared header.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Loading an initrd passed via the kernel command line is deprecated: it
is limited to files that reside in the same volume as the one the kernel
itself was loaded from, and we have more flexible ways to achieve the
same. So make it configurable so new architectures can decide not to
enable it.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Use follow_pfn() to get the PFN of a PFNMAP VMA instead of assuming that
vma->vm_pgoff holds the base PFN of the VMA. This fixes a bug where
attempting to do VFIO_IOMMU_MAP_DMA on an arbitrary PFNMAP'd region of
memory calculates garbage for the PFN.
Hilariously, this only got detected because the first "PFN" calculated
by vaddr_get_pfn() is PFN 0 (vma->vm_pgoff==0), and iommu_iova_to_phys()
uses PA==0 as an error, which triggers a WARN in vfio_unmap_unpin()
because the translation "failed". PFN 0 is now unconditionally reserved
on x86 in order to mitigate L1TF, which causes is_invalid_reserved_pfn()
to return true and in turns results in vaddr_get_pfn() returning success
for PFN 0. Eventually the bogus calculation runs into PFNs that aren't
reserved and leads to failure in vfio_pin_map_dma(). The subsequent
call to vfio_remove_dma() attempts to unmap PFN 0 and WARNs.
WARNING: CPU: 8 PID: 5130 at drivers/vfio/vfio_iommu_type1.c:750 vfio_unmap_unpin+0x2e1/0x310 [vfio_iommu_type1]
Modules linked in: vfio_pci vfio_virqfd vfio_iommu_type1 vfio ...
CPU: 8 PID: 5130 Comm: sgx Tainted: G W 5.6.0-rc5-705d787c7fee-vfio+ #3
Hardware name: Intel Corporation Mehlow UP Server Platform/Moss Beach Server, BIOS CNLSE2R1.D00.X119.B49.1803010910 03/01/2018
RIP: 0010:vfio_unmap_unpin+0x2e1/0x310 [vfio_iommu_type1]
Code: <0f> 0b 49 81 c5 00 10 00 00 e9 c5 fe ff ff bb 00 10 00 00 e9 3d fe
RSP: 0018:ffffbeb5039ebda8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff9a55cbf8d480 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff9a52b771c200
RBP: 0000000000000000 R08: 0000000000000040 R09: 00000000fffffff2
R10: 0000000000000001 R11: ffff9a51fa896000 R12: 0000000184010000
R13: 0000000184000000 R14: 0000000000010000 R15: ffff9a55cb66ea08
FS: 00007f15d3830b40(0000) GS:ffff9a55d5600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000561cf39429e0 CR3: 000000084f75f005 CR4: 00000000003626e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
vfio_remove_dma+0x17/0x70 [vfio_iommu_type1]
vfio_iommu_type1_ioctl+0x9e3/0xa7b [vfio_iommu_type1]
ksys_ioctl+0x92/0xb0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x4c/0x180
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f15d04c75d7
Code: <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
Fixes: 73fa0d10d0 ("vfio: Type1 IOMMU implementation")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Pull PCI fixes from Bjorn Helgaas:
- Workaround Apex TPU class code issue that prevents resource
assignment (Bjorn Helgaas)
- Update MAINTAINERS to add Rob Herring for native PCI controller
drivers (Lorenzo Pieralisi)
* tag 'pci-v5.7-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
MAINTAINERS: Add Rob Herring and remove Andy Murray as PCI reviewers
PCI: Move Apex Edge TPU class quirk to fix BAR assignment
sure, we'd done copy_from_user() on the same range, so we can
skip access_ok()... and it's not worth bothering. Just use
copy_to_user().
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
'count' is how much you want written, not the final position.
Moreover, it can legitimately be less than the current position...
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This reverts commit 6bb0942e8f.
Unfortunately it would appear that the rumors we've heard of sideband
message interleaving not being very well supported are true. On the
Lenovo ThinkPad Thunderbolt 3 dock that I have, interleaved messages
appear to just get dropped:
[drm:drm_dp_mst_wait_tx_reply [drm_kms_helper]] timedout msg send
00000000571ddfd0 2 1
[dp_mst] txmsg cur_offset=2 cur_len=2 seqno=1 state=SENT path_msg=1 dst=00
[dp_mst] type=ENUM_PATH_RESOURCES contents:
[dp_mst] port=2
DP descriptor for this hub:
OUI 90-cc-24 dev-ID SYNA3 HW-rev 1.0 SW-rev 3.12 quirks 0x0008
It would seem like as well that this is a somewhat well known issue in
the field. From section 5.4.2 of the DisplayPort 2.0 specification:
There are MST Sink/Branch devices in the field that do not handle
interleaved message transactions.
To facilitate message transaction handling by downstream devices, an
MST Source device shall generate message transactions in an atomic
manner (i.e., the MST Source device shall not concurrently interleave
multiple message transactions). Therefore, an MST Source device shall
clear the Message_Sequence_No value in the Sideband_MSG_Header to 0.
MST Source devices that support field policy updates by way of
software should update the policy to forego the generation of
interleaved message transactions.
This is a bit disappointing, as features like HDCP require that we send
a sideband request every ~2 seconds for each active stream. However,
there isn't really anything in the specification that allows us to
accurately probe for interleaved messages.
If it ends up being that we -really- need this in the future, we might
be able to whitelist hubs where interleaving is known to work-or maybe
try some sort of heuristics. But for now, let's just play it safe and
not use it.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 6bb0942e8f ("drm/dp_mst: Remove single tx msg restriction.")
Cc: Wayne Lin <Wayne.Lin@amd.com>
Cc: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200423164225.680178-1-lyude@redhat.com
Reviewed-by: Sean Paul <sean@poorly.run>
Pull ARM SoC fixes from Arnd Bergmann:
"A few smaller fixes for v5.7-rc3: The majority are fixes for bugs I
found after restarting my randconfig build testing that had been
dormant for a while.
On the Nokia N950/N9 phone, a DT fix is required to address a boot
regression.
For the bcm283x (Raspberry Pi), two DT fixes address minor issues"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
soc: imx8: select SOC_BUS
soc: tegra: fix tegra_pmc_get_suspend_mode definition
soc: fsl: dpio: avoid stack usage warning
soc: fsl: dpio: fix incorrect pointer conversions
ARM: imx: provide v7_cpu_resume() only on ARM_CPU_SUSPEND=y
ARM: dts: bcm283x: Disable dsi0 node
firmware: xilinx: make firmware_debugfs_root static
drivers: soc: xilinx: fix firmware driver Kconfig dependency
ARM: dts: bcm283x: Add cells encoding format to firmware bus
ARM: dts: OMAP3: disable RNG on N950/N9
Pull nfsd fixes from Chuck Lever:
"The first set of 5.7-rc fixes for NFS server issues.
These were all unresolved at the time the 5.7 window opened, and
needed some additional time to ensure they were correctly addressed.
They are ready now.
At the moment I know of one more urgent issue regarding the NFS
server. A fix has been tested and is under review. I expect to send
one more pull request, containing this fix (which now consists of 3
patches).
Fixes:
- Address several use-after-free and memory leak bugs
- Prevent a backchannel livelock"
* tag 'nfsd-5.7-rc-1' of git://git.linux-nfs.org/projects/cel/cel-2.6:
svcrdma: Fix leak of svc_rdma_recv_ctxt objects
svcrdma: Fix trace point use-after-free race
SUNRPC: Fix backchannel RPC soft lockups
SUNRPC/cache: Fix unsafe traverse caused double-free in cache_purge
nfsd: memory corruption in nfsd4_lock()
Pull exfat fixes from Namjae Jeon:
- several bug fixes(broken mount discard option, remount failure,
memory leak)
- add missing MODULE_ALIAS_FS for automatically loading exfat module.
- set s_time_gran and truncate atime with exfat timestamp granularity.
* tag 'for-5.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: truncate atimes to 2s granularity
exfat: properly set s_time_gran
exfat: remove 'bps' mount-option
exfat: Unify access to the boot sector
exfat: add missing MODULE_ALIAS_FS()
exfat: Fix discard support
Pull remoteproc fixes from Bjorn Andersson:
"This fixes a regression in the probe error path of the Qualcomm modem
remoteproc driver and a mix up of phy_addr_t and dma_addr_t in the
Mediatek SCP control driver"
* tag 'rproc-v5.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/andersson/remoteproc:
remoteproc: mtk_scp: use dma_addr_t for DMA API
remoteproc: qcom_q6v5_mss: fix q6v5_probe() error paths
remoteproc: qcom_q6v5_mss: fix a bug in q6v5_probe()
Pull audit fix from Paul Moore:
"One small audit patch fix, fixing a missing length check on input from
userspace, nothing crazy"
* tag 'audit-pr-20200422' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: check the length of userspace generated audit records
We must not call pinctrl_gpio_can_use_line() with the gpio_lock taken
as it takes a mutex internally. Let's move the call before taking the
spinlock and store the return value.
This isn't perfect - there's a moment between calling
pinctrl_gpio_can_use_line() and taking the spinlock where the situation
can change but it isn't a regression either: previously this part wasn't
protected at all and it only affects the information user-space is
seeing.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: d2ac257982 ("gpiolib: provide a dedicated function for setting lineinfo")
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
This makes the new ioctl() a bit more robust - we now check if a line
is already being watched and return -EBUSY if the user-space tries to
start watching it again. Same for unwatch - return -EBUSY if user-space
tries to unwatch a line that's not being watched.
Fixes: 51c1064e82 ("gpiolib: add new ioctl() for monitoring changes in line info")
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
pca953x_gpio_set_config is setup to support pull-up/down
bias. Currently the driver uses a variable called 'config' to
determine which options to use. Unfortunately, this is incorrect.
This patch uses function pinconf_to_config_param(config), which
converts this 'config' parameter back to pinconfig to determine
which option to use.
Fixes: 15add06841 ("gpio: pca953x: add ->set_config implementation")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Move all zoned mode related code from null_blk_main.c to
null_blk_zoned.c, avoiding an ugly #ifdef in the process.
Rename null_zone_init() into null_init_zoned_dev(), null_zone_exit()
into null_free_zoned_dev() and add the new function
null_register_zoned_dev() to finalize the zoned dev setup before
add_disk().
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
For write operations issued to a null_blk device with zoned mode
enabled, the state and write pointer position of the zone targeted by
the command should be checked before badblocks and memory backing
are handled as the write may be first failed due to, for instance, a
sector position not aligned with the zone write pointer. This order of
checking for errors reflects more accuratly the behavior of physical
zoned devices.
Furthermore, the write pointer position of the target zone should be
incremented only and only if no errors are reported by badblocks and
memory backing handling.
To fix this, introduce the small helper function null_process_cmd()
which execute null_handle_badblocks() and null_handle_memory_backed()
and use this function in null_zone_write() to correctly handle write
requests to zoned null devices depending on the type and state of the
write target zone. Also call this function in null_handle_zoned() to
process read requests to zoned null devices.
null_process_cmd() is called directly from null_handle_cmd() for
regular null devices, resulting in no functional change for these type
of devices. To have symmetric names, the function null_handle_zoned()
is renamed to null_process_zoned_cmd().
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
It's likely that the vcpu fails to handle all virtual interrupts if
userspace decides to destroy it, leaving the pending ones stay in the
ap_list. If the un-handled one is a LPI, its vgic_irq structure will
be eventually leaked because of an extra refcount increment in
vgic_queue_irq_unlock().
This was detected by kmemleak on almost every guest destroy, the
backtrace is as follows:
unreferenced object 0xffff80725aed5500 (size 128):
comm "CPU 5/KVM", pid 40711, jiffies 4298024754 (age 166366.512s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 08 01 a9 73 6d 80 ff ff ...........sm...
c8 61 ee a9 00 20 ff ff 28 1e 55 81 6c 80 ff ff .a... ..(.U.l...
backtrace:
[<000000004bcaa122>] kmem_cache_alloc_trace+0x2dc/0x418
[<0000000069c7dabb>] vgic_add_lpi+0x88/0x418
[<00000000bfefd5c5>] vgic_its_cmd_handle_mapi+0x4dc/0x588
[<00000000cf993975>] vgic_its_process_commands.part.5+0x484/0x1198
[<000000004bd3f8e3>] vgic_its_process_commands+0x50/0x80
[<00000000b9a65b2b>] vgic_mmio_write_its_cwriter+0xac/0x108
[<0000000009641ebb>] dispatch_mmio_write+0xd0/0x188
[<000000008f79d288>] __kvm_io_bus_write+0x134/0x240
[<00000000882f39ac>] kvm_io_bus_write+0xe0/0x150
[<0000000078197602>] io_mem_abort+0x484/0x7b8
[<0000000060954e3c>] kvm_handle_guest_abort+0x4cc/0xa58
[<00000000e0d0cd65>] handle_exit+0x24c/0x770
[<00000000b44a7fad>] kvm_arch_vcpu_ioctl_run+0x460/0x1988
[<0000000025fb897c>] kvm_vcpu_ioctl+0x4f8/0xee0
[<000000003271e317>] do_vfs_ioctl+0x160/0xcd8
[<00000000e7f39607>] ksys_ioctl+0x98/0xd8
Fix it by retiring all pending LPIs in the ap_list on the destroy path.
p.s. I can also reproduce it on a normal guest shutdown. It is because
userspace still send LPIs to vcpu (through KVM_SIGNAL_MSI ioctl) while
the guest is being shutdown and unable to handle it. A little strange
though and haven't dig further...
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
[maz: moved the distributor deallocation down to avoid an UAF splat]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200414030349.625-2-yuzenghui@huawei.com