The Lenovo ThinkPad W530 uses a nvidia k1000m GPU. When this gets used
together with one of the older nvidia binary driver series (the latest
series does not support it), then backlight control does not work.
This is caused by commit 3dbc80a3e4 ("ACPI: video: Make backlight
class device registration a separate step (v2)") combined with
commit 5aa9d943e9 ("ACPI: video: Don't enable fallback path for
creating ACPI backlight by default").
After these changes the acpi_video# backlight device is only registered
when requested by a GPU driver calling acpi_video_register_backlight()
which the nvidia binary driver does not do.
I realize that using the nvidia binary driver is not a supported use-case
and users can workaround this by adding acpi_backlight=video on the kernel
commandline, but the ThinkPad W530 is a popular model under Linux users,
so it seems worthwhile to add a quirk for this.
I will also email Nvidia asking them to make the driver call
acpi_video_register_backlight() when an internal LCD panel is detected.
So maybe the next maintenance release of the drivers will fix this...
Fixes: 5aa9d943e9 ("ACPI: video: Don't enable fallback path for creating ACPI backlight by default")
Cc: All applicable <stable@vger.kernel.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
On the Apple iMac14,1 and iMac14,2 all-in-ones (monitors with builtin "PC")
the connection between the GPU and the panel is seen by the GPU driver as
regular DP instead of eDP, causing the GPU driver to never call
acpi_video_register_backlight().
(GPU drivers only call acpi_video_register_backlight() when an internal
panel is detected, to avoid non working acpi_video# devices getting
registered on desktops which unfortunately is a real issue.)
Fix the missing acpi_video# backlight device on these all-in-ones by
adding a acpi_backlight=video DMI quirk, so that video.ko will
immediately register the backlight device instead of waiting for
an acpi_video_register_backlight() call.
Fixes: 5aa9d943e9 ("ACPI: video: Don't enable fallback path for creating ACPI backlight by default")
Cc: All applicable <stable@vger.kernel.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 3dbc80a3e4 ("ACPI: video: Make backlight class device
registration a separate step (v2)") combined with
commit 5aa9d943e9 ("ACPI: video: Don't enable fallback path for
creating ACPI backlight by default")
Means that the video.ko code now fully depends on the GPU driver calling
acpi_video_register_backlight() for the acpi_video# backlight class
devices to get registered.
This means that if the GPU driver does not do this, acpi_backlight=video
on the cmdline, or DMI quirks for selecting acpi_video# will not work.
This is a problem on for example Apple iMac14,1 all-in-ones where
the monitor's LCD panel shows up as a regular DP connection instead of
eDP so the GPU driver will not call acpi_video_register_backlight() [1].
Fix this by making video.ko directly register the acpi_video# devices
when these have been explicitly requested either on the cmdline or
through DMI quirks (rather then auto-detection being used).
[1] GPU drivers only call acpi_video_register_backlight() when an internal
panel is detected, to avoid non working acpi_video# devices getting
registered on desktops which unfortunately is a real issue.
Fixes: 5aa9d943e9 ("ACPI: video: Don't enable fallback path for creating ACPI backlight by default")
Cc: All applicable <stable@vger.kernel.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Allow callers of __acpi_video_get_backlight_type() to pass a pointer
to a bool which will get set to false if the backlight-type comes from
the cmdline or a DMI quirk and set to true if auto-detection was used.
And make __acpi_video_get_backlight_type() non static so that it can
be called directly outside of video_detect.c .
While at it turn the acpi_video_get_backlight_type() and
acpi_video_backlight_use_native() wrappers into static inline functions
in include/acpi/video.h, so that we need to export one less symbol.
Fixes: 5aa9d943e9 ("ACPI: video: Don't enable fallback path for creating ACPI backlight by default")
Cc: All applicable <stable@vger.kernel.org>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
After a server reboot, clients are failing to move files with ENOENT.
This is caused by DFS referrals containing multiple separators, which
the server move call doesn't recognize.
v1: Initial patch.
v2: Move prototype to header.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2182472
Fixes: a31080899d ("cifs: sanitize multiple delimiters in prepath")
Actually-Fixes: 24e0a1eff9 ("cifs: switch to new mount api")
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Thiago Rafael Becker <tbecker@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
iov_iter for ep_read_iter can be ITER_UBUF with io_uring.
In that case dup_iter() does not have to allocate iov and it can
return NULL. Fix the assumption by checking for iter_is_ubuf()
other wise ep_read_iter can treat this as failure and return -ENOMEM.
Fixes: 1e23db450c ("io_uring: use iter_ubuf for single range imports")
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20230401060509.3608259-3-dhavale@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iov_iter for ffs_epfile_read_iter can be ITER_UBUF with io_uring.
In that case dup_iter() does not have to allocate anything and it
can return NULL. ffs_epfile_read_iter treats this as a failure and
returns -ENOMEM. Fix it by checking if iter_is_ubuf().
Fixes: 1e23db450c ("io_uring: use iter_ubuf for single range imports")
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20230401060509.3608259-2-dhavale@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While determining the initial pin assignment to be sent in the configure
message, using the DP_PIN_ASSIGN_DP_ONLY_MASK mask causes the DFP_U to
send both Pin Assignment C and E when both are supported by the DFP_U and
UFP_U. The spec (Table 5-7 DFP_U Pin Assignment Selection Mandates,
VESA DisplayPort Alt Mode Standard v2.0) indicates that the DFP_U never
selects Pin Assignment E when Pin Assignment C is offered.
Update the DP_PIN_ASSIGN_DP_ONLY_MASK conditional to intially select only
Pin Assignment C if it is available.
Fixes: 0e3bb7d689 ("usb: typec: Add driver for DisplayPort alternate mode")
Cc: stable@vger.kernel.org
Signed-off-by: RD Babiera <rdbabiera@google.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20230329215159.2046932-1-rdbabiera@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan reports that smatch complains about a potential uninitialized
variable being used in the compat alignment fixup code.
The logic is not wrong per se, but we do end up using an uninitialized
variable if reading the instruction that triggered the alignment fault
from user space faults, even if the fault ensures that the uninitialized
value doesn't propagate any further.
Given that we just give up and return 1 if any fault occurs when reading
the instruction, let's get rid of the 'success handling' pattern that
captures the fault in a variable and aborts later, and instead, just
return 1 immediately if any of the get_user() calls result in an
exception.
Fixes: 3fc24ef32d ("arm64: compat: Implement misalignment fixups for multiword loads")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://lore.kernel.org/r/202304021214.gekJ8yRc-lkp@intel.com/
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230404103625.2386382-1-ardb@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
- Fix timerlat notification, as it was not triggering the
notify to users when a new max latency was hit.
- Do not trigger max latency if the tracing is off.
When tracing is off, the ring buffer is not updated, it
does not make sense to notify when there's a new max latency
detected by the tracer, as why that latency happened is not available.
The tracing logic still runs when the ring buffer is disabled, but
it should not be triggering notifications.
- Fix race on freeing the synthetic event "last_cmd" variable by
adding a mutex around it.
- Fix race between reader and writer of the ring buffer by adding
memory barriers. When the writer is still on the reader page
it must have its content visible on the buffer before it moves
the commit index that the reader uses to know how much content is
on the page.
- Make get_lock_parent_ip() always inlined, as it uses _THIS_IP_
and _RET_IP_, which gets broken if it is not inlined.
- Make __field(int, arr[5]) in a TRACE_EVENT() macro fail to build.
The field formats of trace events are calculated by using sizeof(type)
and other means by what is passed into the structure macros like
__field(). The __field() macro is only meant for atom types like
int, long, short, pointer, etc. It is not meant for arrays. But
the code will currently compile with arrays, but then the format
produced will be inaccurate, and user space parsing tools will break.
Two bugs have already been fixed, now add code that will make the
kernel fail to build if another trace event includes this buggy
field format.
- Fix boot up snapshot code:
Boot snapshots were triggering when not even asked for on the
kernel command line. This was caused by two bugs.
1) It would trigger a snapshot on any instance if one was created
from the kernel command line.
2) The error handling would only affect the top level instance.
So the fact that a snapshot was done on a instance that didn't
allocate a buffer triggered a warning written into the top level
buffer, and worse yet, disabled the top level buffer.
- Fix memory leak that was caused when an error was logged in a
trace buffer instance, and then the buffer instance was removed.
The allocated error log messages still need to be freed.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZC2GEBQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qsXuAP4xHoYnHMfixLZz9Z4KSz/2LWTjBce8
GItuEmfWXANQ7wD7BS0li8/ISCDg0P/epuWFOHyxP3jpbfPZtpYfrj5Cog4=
=71kh
-----END PGP SIGNATURE-----
Merge tag 'trace-v6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix timerlat notification, as it was not triggering the notify to
users when a new max latency was hit.
- Do not trigger max latency if the tracing is off.
When tracing is off, the ring buffer is not updated, it does not make
sense to notify when there's a new max latency detected by the
tracer, as why that latency happened is not available. The tracing
logic still runs when the ring buffer is disabled, but it should not
be triggering notifications.
- Fix race on freeing the synthetic event "last_cmd" variable by adding
a mutex around it.
- Fix race between reader and writer of the ring buffer by adding
memory barriers. When the writer is still on the reader page it must
have its content visible on the buffer before it moves the commit
index that the reader uses to know how much content is on the page.
- Make get_lock_parent_ip() always inlined, as it uses _THIS_IP_ and
_RET_IP_, which gets broken if it is not inlined.
- Make __field(int, arr[5]) in a TRACE_EVENT() macro fail to build.
The field formats of trace events are calculated by using
sizeof(type) and other means by what is passed into the structure
macros like __field(). The __field() macro is only meant for atom
types like int, long, short, pointer, etc. It is not meant for
arrays.
The code will currently compile with arrays, but then the format
produced will be inaccurate, and user space parsing tools will break.
Two bugs have already been fixed, now add code that will make the
kernel fail to build if another trace event includes this buggy field
format.
- Fix boot up snapshot code:
Boot snapshots were triggering when not even asked for on the kernel
command line. This was caused by two bugs:
1) It would trigger a snapshot on any instance if one was created
from the kernel command line.
2) The error handling would only affect the top level instance.
So the fact that a snapshot was done on a instance that didn't
allocate a buffer triggered a warning written into the top level
buffer, and worse yet, disabled the top level buffer.
- Fix memory leak that was caused when an error was logged in a trace
buffer instance, and then the buffer instance was removed.
The allocated error log messages still needed to be freed.
* tag 'trace-v6.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Free error logs of tracing instances
tracing: Fix ftrace_boot_snapshot command line logic
tracing: Have tracing_snapshot_instance_cond() write errors to the appropriate instance
tracing: Error if a trace event has an array for a __field()
tracing/osnoise: Fix notify new tracing_max_latency
tracing/timerlat: Notify new max thread latency
ftrace: Mark get_lock_parent_ip() __always_inline
ring-buffer: Fix race while reader and writer are on the same page
tracing/synthetic: Fix races on freeing last_cmd
The device can report discard support without setting the ONCS DSM bit.
When not set, the driver clears max_discard_size expecting it to be set
later. We don't know the size until we have the namespace format,
though, so setting it is deferred until configuring one, but the driver
was abandoning the discard settings due to that initial clearing.
Move the max_discard_size calculation above the check for a '0' discard
size.
Fixes: 1a86924e4f ("nvme: fix interpretation of DMRSL")
Reported-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
drm/i915 fixes for v6.3-rc6:
- Fix DP MST DSC M/N calculation to use compressed bpp
- Fix racy use-after-free in perf ioctl
- Fix context runtime accounting
- Fix handling of GT reset during HuC loading
- Fix use of unsigned vm_fault_t for error values
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87zg7mzomz.fsf@intel.com
In the j1939_tp_tx_dat_new() function, an out-of-bounds memory access
could occur during the memcpy() operation if the size of skb->cb is
larger than the size of struct j1939_sk_buff_cb. This is because the
memcpy() operation uses the size of skb->cb, leading to a read beyond
the struct j1939_sk_buff_cb.
Updated the memcpy() operation to use the size of struct
j1939_sk_buff_cb instead of the size of skb->cb. This ensures that the
memcpy() operation only reads the memory within the bounds of struct
j1939_sk_buff_cb, preventing out-of-bounds memory access.
Additionally, add a BUILD_BUG_ON() to check that the size of skb->cb
is greater than or equal to the size of struct j1939_sk_buff_cb. This
ensures that the skb->cb buffer is large enough to hold the
j1939_sk_buff_cb structure.
Fixes: 9d71dd0c70 ("can: add support of SAE J1939 protocol")
Reported-by: Shuangpeng Bai <sjb7183@psu.edu>
Tested-by: Shuangpeng Bai <sjb7183@psu.edu>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://groups.google.com/g/syzkaller/c/G_LL-C3plRs/m/-8xCi6dCAgAJ
Link: https://lore.kernel.org/all/20230404073128.3173900-1-o.rempel@pengutronix.de
Cc: stable@vger.kernel.org
[mkl: rephrase commit message]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The same task check in perf_event_set_output has some potential issues
for some usages.
For the current perf code, there is a problem if using of
perf_event_open() to have multiple samples getting into the same mmap’d
memory when they are both attached to the same process.
https://lore.kernel.org/all/92645262-D319-4068-9C44-2409EF44888E@gmail.com/
Because the event->ctx is not ready when the perf_event_set_output() is
invoked in the perf_event_open().
Besides the above issue, before the commit bd27568117 ("perf: Rewrite
core context handling"), perf record can errors out when sampling with
a hardware event and a software event as below.
$ perf record -e cycles,dummy --per-thread ls
failed to mmap with 22 (Invalid argument)
That's because that prior to the commit a hardware event and a software
event are from different task context.
The problem should be a long time issue since commit c3f00c7027
("perk: Separate find_get_context() from event initialization").
The task struct is stored in the event->hw.target for each per-thread
event. It is a more reliable way to determine whether two events are
attached to the same task.
The event->hw.target was also introduced several years ago by the
commit 50f16a8bf9 ("perf: Remove type specific target pointers"). It
can not only be used to fix the issue with the current code, but also
back port to fix the issues with an older kernel.
Note: The event->hw.target was introduced later than commit
c3f00c7027. The patch may cannot be applied between the commit
c3f00c7027 and commit 50f16a8bf9. Anybody that wants to back-port
this at that period may have to find other solutions.
Fixes: c3f00c7027 ("perf: Separate find_get_context() from event initialization")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lkml.kernel.org/r/20230322202449.512091-1-kan.liang@linux.intel.com
Thomas reported that offlining CPUs spends a lot of time in
synchronize_rcu() as called from perf_pmu_migrate_context() even though
he's not actually using uncore events.
Turns out, the thing is unconditionally waiting for RCU, even if there's
no actual events to migrate.
Fixes: 0cda4c0231 ("perf: Introduce perf_pmu_migrate_context()")
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lkml.kernel.org/r/20230403090858.GT4253@hirez.programming.kicks-ass.net
Wait for VPU to be idle in ivpu_pm_suspend_cb() before powering off
the device, so jobs are not lost and TDRs are not triggered after
resume.
Fixes: 852be13f3b ("accel/ivpu: Add PM support")
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230331113603.2802515-3-stanislaw.gruszka@linux.intel.com
Currently job->done_fence is added to every BO handle within a job. If job
handle (command buffer) is shared between multiple submits, KMD will add
the fence in each of them. Then bo_wait_ioctl() executed on command buffer
will exit only when all jobs containing that handle are done.
This creates deadlock scenario for user mode driver in case when job handle
is added as dependency of another job, because bo_wait_ioctl() of first job
will wait until second job finishes, and second job can not finish before
first one.
Having fences added only to job buffer handle allows user space to execute
bo_wait_ioctl() on the job even if it's handle is submitted with other job.
Fixes: cd7272215c ("accel/ivpu: Add command buffer submission logic")
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230331113603.2802515-2-stanislaw.gruszka@linux.intel.com
The kernel command line ftrace_boot_snapshot by itself is supposed to
trigger a snapshot at the end of boot up of the main top level trace
buffer. A ftrace_boot_snapshot=foo will do the same for an instance called
foo that was created by trace_instance=foo,...
The logic was broken where if ftrace_boot_snapshot was by itself, it would
trigger a snapshot for all instances that had tracing enabled, regardless
if it asked for a snapshot or not.
When a snapshot is requested for a buffer, the buffer's
tr->allocated_snapshot is set to true. Use that to know if a trace buffer
wants a snapshot at boot up or not.
Since the top level buffer is part of the ftrace_trace_arrays list,
there's no reason to treat it differently than the other buffers. Just
iterate the list if ftrace_boot_snapshot was specified.
Link: https://lkml.kernel.org/r/20230405022341.895334039@goodmis.org
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <zwisler@google.com>
Fixes: 9c1c251d67 ("tracing: Allow boot instances to have snapshot buffers")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
If a trace instance has a failure with its snapshot code, the error
message is to be written to that instance's buffer. But currently, the
message is written to the top level buffer. Worse yet, it may also disable
the top level buffer and not the instance that had the issue.
Link: https://lkml.kernel.org/r/20230405022341.688730321@goodmis.org
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <zwisler@google.com>
Fixes: 2824f50332 ("tracing: Make the snapshot trigger work with instances")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Non-GSO TCP packets whose SKBs' linear portion did not include the
entire TCP header were not populating the first Tx descriptor with
as many bytes as the vNIC expected. This change ensures that all
TCP packets populate the first descriptor with the correct number of
bytes.
Fixes: 893ce44df5 ("gve: Add basic driver framework for Compute Engine Virtual NIC")
Signed-off-by: Shailend Chand <shailend@google.com>
Link: https://lore.kernel.org/r/20230403172809.2939306-1-shailend@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If the number of lanes was forced and then subsequently the user
omits this parameter, the ksettings->lanes is reset. The driver
should then reset the number of lanes to the device's default
for the specified speed.
However, although the ksettings->lanes is set to 0, the mod variable
is not set to true to indicate the driver and userspace should be
notified of the changes.
The consequence is that the same ethtool operation will produce
different results based on the initial state.
If the initial state is:
$ ethtool swp1 | grep -A 3 'Speed: '
Speed: 500000Mb/s
Lanes: 2
Duplex: Full
Auto-negotiation: on
then executing 'ethtool -s swp1 speed 50000 autoneg off' will yield:
$ ethtool swp1 | grep -A 3 'Speed: '
Speed: 500000Mb/s
Lanes: 2
Duplex: Full
Auto-negotiation: off
While if the initial state is:
$ ethtool swp1 | grep -A 3 'Speed: '
Speed: 500000Mb/s
Lanes: 1
Duplex: Full
Auto-negotiation: off
executing the same 'ethtool -s swp1 speed 50000 autoneg off' results in:
$ ethtool swp1 | grep -A 3 'Speed: '
Speed: 500000Mb/s
Lanes: 1
Duplex: Full
Auto-negotiation: off
This patch fixes this behavior. Omitting lanes will always results in
the driver choosing the default lane width for the chosen speed. In this
scenario, regardless of the initial state, the end state will be, e.g.,
$ ethtool swp1 | grep -A 3 'Speed: '
Speed: 500000Mb/s
Lanes: 2
Duplex: Full
Auto-negotiation: off
Fixes: 012ce4dd31 ("ethtool: Extend link modes settings uAPI with lanes")
Signed-off-by: Andy Roulin <aroulin@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/ac238d6b-8726-8156-3810-6471291dbc7f@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima says:
====================
raw/ping: Fix locking in /proc/net/{raw,icmp}.
The first patch fixes a NULL deref for /proc/net/raw and second one fixes
the same issue for ping sockets.
The first patch also converts hlist_nulls to hlist, but this is because
the current code uses sk_nulls_for_each() for lockless readers, instead
of sk_nulls_for_each_rcu() which adds memory barrier, but raw sockets
does not use the nulls marker nor SLAB_TYPESAFE_BY_RCU in the first place.
OTOH, the ping sockets already uses sk_nulls_for_each_rcu(), and such
conversion can be posted later for net-next.
====================
Link: https://lore.kernel.org/r/20230403194959.48928-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After commit dbca1596bb ("ping: convert to RCU lookups, get rid
of rwlock"), we use RCU for ping sockets, but we should use spinlock
for /proc/net/icmp to avoid a potential NULL deref mentioned in
the previous patch.
Let's go back to using spinlock there.
Note we can convert ping sockets to use hlist instead of hlist_nulls
because we do not use SLAB_TYPESAFE_BY_RCU for ping sockets.
Fixes: dbca1596bb ("ping: convert to RCU lookups, get rid of rwlock")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
One motivation for mapping range registers to decoder objects is
to use those settings for region autodiscovery.
The need to map a region for devices programmed to use range registers
is especially urgent now that the kernel no longer routes "Soft
Reserved" ranges in the memory map to device-dax by default. The CXL
memory range loses all access mechanisms.
Complete the implementation by marking the DPA reservation and setting
the endpoint-decoder state to signal autodiscovery. Note that the
default settings of ways=1 and granularity=4096 set in cxl_decode_init()
do not need to be updated.
Fixes: 09d09e04d2 ("cxl/dax: Create dax devices for CXL RAM regions")
Tested-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Gregory Price <gregory.price@memverge.com>
Link: https://lore.kernel.org/r/168012575521.221280.14177293493678527326.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Recall that range register emulation seeks to treat the 2 potential
range registers as Linux CXL "decoder" objects. The number of range
registers can be 1 or 2, while HDM decoder ranges can include more than
2.
Be careful not to confuse DVSEC range count with HDM capability decoder
count. Commit to range register earlier in devm_cxl_setup_hdm().
Otherwise, a device with more HDM decoders than range registers can set
@cxlhdm->decoder_count to an invalid value.
Avoid introducing a forward declaration by just moving the definition of
should_emulate_decoders() earlier in the file. should_emulate_decoders()
is unchanged.
Tested-by: Dave Jiang <dave.jiang@intel.com>
Fixes: d7a2153762 ("cxl/hdm: Add emulation when HDM decoders are not committed")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168012574932.221280.15944705098679646436.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Each time the contents of a given HPA are potentially changed in a cache
incoherent manner the CXL core sets CXL_REGION_F_INCOHERENT to
invalidate CPU caches before the region is used.
Successful invocation of attach_target() indicates that DPA has been
newly assigned to a given HPA in the dynamic region creation flow.
However, attach_target() is also reused in the autodiscovery flow where
the region was activated by platform firmware. In that case there is no
need to invalidate caches because that region is already in active use
and nothing about the autodiscovery flow modifies the HPA-to-DPA
relationship.
In the autodiscovery case cxl_region_attach() exits early after
determining the endpoint decoder is already correctly attached to the
region.
Fixes: a32320b71f ("cxl/region: Add region autodiscovery")
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168002858817.50647.1217607907088920888.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
RCDs (CXL memory devices that link train without VH capability and show
up as root complex integrated endpoints), hide the presence of the link
between the endpoint and the host-bridge. The CXL region setup/teardown
paths assume that a link hop is present and go looking for at least one
'struct cxl_port' instance between the CXL root port-object and an
endpoint port-object leading to crashes of the form:
BUG: kernel NULL pointer dereference, address: 0000000000000008
[..]
RIP: 0010:cxl_region_setup_targets+0x3e9/0xae0 [cxl_core]
[..]
Call Trace:
<TASK>
cxl_region_attach+0x46c/0x7a0 [cxl_core]
cxl_create_region+0x20b/0x270 [cxl_core]
cxl_mock_mem_probe+0x641/0x800 [cxl_mock_mem]
platform_probe+0x5b/0xb0
Detect RCDs explicitly and skip walking the non-existent port hierarchy
between root and endpoint in that case.
While this has been a problem since:
commit 0a19bfc8de ("cxl/port: Add RCD endpoint port enumeration")
...it becomes a more reliable crash scenario with the new autodiscovery
implementation.
Fixes: a32320b71f ("cxl/region: Add region autodiscovery")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168002858268.50647.728091521032131326.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The find_cxl_root() helper is used to lookup root decoders and other CXL
platform topology information for a given endpoint. It turns out that
for RCDs it has never worked. The result of find_cxl_root(&cxlmd->dev)
is always NULL for the RCH topology case because it expects to find a
cxl_port at the host-bridge. RCH topologies only have the root cxl_port
object with the host-bridge as a dport. While there are no reports of
this being a problem to date, by inspection region enumeration should
crash as a result of this problem, and it does in a local unit test for
this scenario.
However, an observation that ever since:
commit f17b558d66 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue")
...all callers of find_cxl_root() occur after the memdev connection to
the port topology has been established. That means that find_cxl_root()
can be simplified to a walk of the endpoint port topology to the root.
Switch to that arrangement which also fixes the RCD bug.
Fixes: a32320b71f ("cxl/region: Add region autodiscovery")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168002857715.50647.344876437247313909.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If the driver is allowed to enable memory operation itself then it can
also turn on HDM decoder support at will.
With this the second call to cxl_setup_hdm_decoder_from_dvsec(), when
an HDM decoder is not committed, is not needed.
Fixes: b777e9bec9 ("cxl/hdm: Emulate HDM decoder from DVSEC range registers")
Link: http://lore.kernel.org/r/20230220113657.000042e1@huawei.com
Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167703068474.185722.664126485486344246.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Polling needs a bio with a valid bi_bdev, but neither of those are
guaranteed for polled driver requests. Make request based polling
directly use blk-mq's polling function instead.
When executing a request from a polled hctx, we know the request's
cookie, and that it's from a live blk-mq queue that supports polling, so
we can safely skip everything that bio_poll provides.
Cc: stable@kernel.org
Reported-by: Martin Belanger <Martin.Belanger@dell.com>
Reported-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Tested-by: Daniel Wagner <dwagner@suse.de>
Revieded-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20230331180056.1155862-1-kbusch@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Hide KVM_CAP_IRQFD_RESAMPLE if XIVE is enabled
s390:
* Fix handling of external interrupts in protected guests
x86:
* Resample the pending state of IOAPIC interrupts when unmasking them
* Fix usage of Hyper-V "enlightened TLB" on AMD
* Small fixes to real mode exceptions
* Suppress pending MMIO write exits if emulator detects exception
Documentation:
* Fix rST syntax
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmQsXMQUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroOc6Qf/YMDcYxpPuYFZ69t9k4s0ubeGjGpr
Qdzo6ZgKbfs4+nu89u2F5zvAftRSrEVQRpfLwmTXauM8el3s+EAEPlK/MLH6sSSo
LG3eWoNxs7MUQ6gBWeQypY6opvz3W9hf8KYd+GY6TV7jwV5/YgH6674Yyd7T/bCm
0/tUiil5Q7Da0w4GUY5m9gEqlbA9yjDBFySqqTZemo18NzLaPNpesKnm1hvv0QJG
JRvOMmHwqAZgS0G5cyGEPZKY6Ga0v9QsAXc8mFC/Jtn9GMD2f47PYDRfd/7NSiVz
ulgsk8zZkZIZ6nMbQ1t10txmGIbFmJ4iPPaATrMII/xHq+idgAlHU7zY1A==
=QvFj
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"PPC:
- Hide KVM_CAP_IRQFD_RESAMPLE if XIVE is enabled
s390:
- Fix handling of external interrupts in protected guests
x86:
- Resample the pending state of IOAPIC interrupts when unmasking them
- Fix usage of Hyper-V "enlightened TLB" on AMD
- Small fixes to real mode exceptions
- Suppress pending MMIO write exits if emulator detects exception
Documentation:
- Fix rST syntax"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
docs: kvm: x86: Fix broken field list
KVM: PPC: Make KVM_CAP_IRQFD_RESAMPLE platform dependent
KVM: s390: pv: fix external interruption loop not always detected
KVM: nVMX: Do not report error code when synthesizing VM-Exit from Real Mode
KVM: x86: Clear "has_error_code", not "error_code", for RM exception injection
KVM: x86: Suppress pending MMIO write exits if emulator detects exception
KVM: x86/ioapic: Resample the pending state of an IRQ when unmasking
KVM: irqfd: Make resampler_list an RCU list
KVM: SVM: Flush Hyper-V TLB when required
- Fix a crash and a resource leak in NFSv4 COMPOUND processing
- Fix issues with AUTH_SYS credential handling
- Try again to address an NFS/NFSD/SUNRPC build dependency regression
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmQsLHMACgkQM2qzM29m
f5eNAA/8DokMQLQ+gN8zhQNqmw92sUdW3m41o0/DfETVXyE60sX8uOE7PktSGwfz
fWMMiQpvmnmw/lbO84XQ9i8E0hjh8cT26l1CJum4VgSFiBJJvIqNxb0Yro43R4Jc
1wU2AOpC9qzCokdHhHKszDXuOgsz3v5OMJSQz3mG50dlq8/+6KKrCnK6jakyrvxr
vKcVMsoENhxh2MnJfbsIQ70UM/rF6dmZWzGuBJH51Fkt+0FD9cnxXZKCkv1+D8JN
5Hr+8rv4I/VqBGDzv9QHoEowZr70e8UUi2UME/jwSArhdxwsfPEqV/qWwHq9Q133
RW40Gco7/E3JUjpAVTRXVGSB+LwU1EvhWQ9qSpSx5D2CPAHJ9hsOw4I54+Q0vD+j
2druOpqIITZOvI3K54ZJXa2LK6SpZ8NnncP5YkLWOwR0Wqohy1U8Sm5uOiMs+IJa
neTxL7f+u3MDQgaDCTuBkI4oKzSDF/ZiMTWh52iPyy9x03SRYXbW6UgqDiySIg0P
jvvaDFCvKvvL2qEmksMoQbWxSjVj8PqL+qJIxQNIZwHbows6paL+l0rdSPXc+l2O
97GBlqNPfHt+AjfJvGDscaIcLA+gu+ErzwxG6BLKvB9QcX9/F3A62Nh3txpe5Q1r
M5NyQwK3vVQcTejMqw34sBqp3EeI5iIJ9CjD/2tN+dUpeHyQld8=
=bk6S
-----END PGP SIGNATURE-----
Merge tag 'nfsd-6.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Fix a crash and a resource leak in NFSv4 COMPOUND processing
- Fix issues with AUTH_SYS credential handling
- Try again to address an NFS/NFSD/SUNRPC build dependency regression
* tag 'nfsd-6.3-5' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: callback request does not use correct credential for AUTH_SYS
NFS: Remove "select RPCSEC_GSS_KRB5
sunrpc: only free unix grouplist after RCU settles
nfsd: call op_release, even when op_func returns an error
NFSD: Avoid calling OPDESC() with ops->opnum == OP_ILLEGAL
Add a missing ":" to fix a broken field list.
Signed-off-by: Takahiro Itazuri <itazur@amazon.com>
Fixes: ba7bb663f5 ("KVM: x86: Provide per VM capability for disabling PMU virtualization")
Message-Id: <20230331093116.99820-1-itazur@amazon.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Code that passes a 32-bit constant into cmpxchg() produces a harmless
sparse warning because of the truncation in the branch that is not taken:
fs/erofs/zdata.c: note: in included file (through /home/arnd/arm-soc/arch/arm/include/asm/cmpxchg.h, /home/arnd/arm-soc/arch/arm/include/asm/atomic.h, /home/arnd/arm-soc/include/linux/atomic.h, ...):
include/asm-generic/cmpxchg-local.h:29:33: warning: cast truncates bits from constant value (5f0ecafe becomes fe)
include/asm-generic/cmpxchg-local.h:33:34: warning: cast truncates bits from constant value (5f0ecafe becomes cafe)
include/asm-generic/cmpxchg-local.h:29:33: warning: cast truncates bits from constant value (5f0ecafe becomes fe)
include/asm-generic/cmpxchg-local.h:30:42: warning: cast truncates bits from constant value (5f0edead becomes ad)
include/asm-generic/cmpxchg-local.h:33:34: warning: cast truncates bits from constant value (5f0ecafe becomes cafe)
include/asm-generic/cmpxchg-local.h:34:44: warning: cast truncates bits from constant value (5f0edead becomes dead)
This was reported as a regression to Matt's recent __generic_cmpxchg_local
patch, though this patch only added more warnings on top of the ones
that were already there.
Rewording the truncation to use an explicit bitmask instead of a cast
to a smaller type avoids the warning but otherwise leaves the code
unchanged.
I had another look at why the cast is even needed for atomic_cmpxchg(),
and as Matt describes the problem here is that atomic_t contains a
signed 'int', but cmpxchg() takes an 'unsigned long' argument, and
converting between the two leads to a 64-bit sign-extension of
negative 32-bit atomics.
I checked the other implementations of arch_cmpxchg() and did not find
any others that run into the same problem as __generic_cmpxchg_local(),
but it's easy to be on the safe side here and always convert the
signed int into an unsigned int when calling arch_cmpxchg(), as this
will work even when any of the arch_cmpxchg() implementations run
into the same problem.
Fixes: 6246541522 ("locking/atomic: cmpxchg: Make __generic_cmpxchg_local compare against zero-extended 'old' value")
Reviewed-by: Matt Evans <mev@rivosinc.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Copy the forced type casts from the normal MMIO accessors to suppress
the sparse warnings that point out __raw_readl() returns a native endian
word (just like readl()).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Commit c1d55d5013 ("asm-generic/io.h: Fix sparse warnings on
big-endian architectures") missed fixing the 64-bit accessors.
Arnd explains in the attached link why the casts are necessary, even if
__raw_readq() and __raw_writeq() do not take endian-specific types.
Link: https://lore.kernel.org/lkml/9105d6fc-880b-4734-857d-e3d30b87ccf6@app.fastmail.com/
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reset the FDIR counters when FDIR inits. Without this patch,
when VF initializes or resets, all the FDIR counters are not
cleaned, which may cause unexpected behaviors for future FDIR
rule create (e.g., rule conflict).
Fixes: 1f7ea1cd6a ("ice: Enable FDIR Configure for AVF")
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Lingyu Liu <lingyu.liu@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When adding a FDIR filter, if ice_vc_fdir_set_irq_ctx returns failure,
the inserted fdir entry will not be removed and if ice_vc_fdir_write_fltr
returns failure, the fdir context info for irq handler will not be cleared
which may lead to inconsistent or memory leak issue. This patch refines
failure cases to resolve this issue.
Fixes: 1f7ea1cd6a ("ice: Enable FDIR Configure for AVF")
Signed-off-by: Simei Su <simei.su@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Currently callback request does not use the credential specified in
CREATE_SESSION if the security flavor for the back channel is AUTH_SYS.
Problem was discovered by pynfs 4.1 DELEG5 and DELEG7 test with error:
DELEG5 st_delegation.testCBSecParms : FAILURE
expected callback with uid, gid == 17, 19, got 0, 0
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Fixes: 8276c902bb ("SUNRPC: remove uid and gid from struct auth_cred")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>