CXL memory expanders that support the CXL 2.0 memory device class code
include an "HDM Decoder Capability" mechanism to supplant the "CXL DVSEC
Range" mechanism originally defined in CXL 1.1. Both mechanisms depend
on a "mem_enable" bit being set in configuration space before either
mechanism activates. When the HDM Decoder Capability is enabled the CXL
DVSEC Range settings are ignored.
Previously, the cxl_mem driver was relying on platform-firmware to set
"mem_enable". That is an invalid assumption as there is no requirement
that platform-firmware sets the bit before the driver sees a device,
especially in hot-plug scenarios. Additionally, ACPI-platforms that
support CXL 2.0 devices also support the ACPI CEDT (CXL Early Discovery
Table). That table outlines the platform permissible address ranges for
CXL operation. So, there is a need for the driver to set "mem_enable",
and there is information available to determine the validity of the CXL
DVSEC Ranges.
Arrange for the driver to optionally enable the HDM Decoder Capability
if "mem_enable" was not set by platform firmware, or the CXL DVSEC Range
configuration was invalid. Be careful to only disable memory decode if
the kernel was the one to enable it. In other words, if CXL is backing
all of kernel memory at boot the device needs to maintain "mem_enable"
and "HDM Decoder enable" all the way up to handoff back to platform
firmware (e.g. ACPI S5 state entry may require CXL memory to stay
active).
Fixes: 560f785590 ("cxl/pci: Retrieve CXL DVSEC memory info")
Cc: Dan Carpenter <dan.carpenter@oracle.com>
[dan: fix early terminiation of range-allowed loop]
Cc: Ariel Sibley <ariel.sibley@microchip.com>
[ariel: Memory_size must be non-zero]
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165307136375.2499769.861793697156744166.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The port driver maps component registers for port operations. Reuse that
mapping for HDM Decoder Capability setup / enable. Move
devm_cxl_setup_hdm() before cxl_hdm_decode_init() and plumb @cxlhdm
through the hdm init helpers.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291691712.1426646.14336397551571515480.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The responsibility for establishing HDM Decoder Capability based
operation is more closely tied to port enabling than memdev enabling
which is concerned with port enumeration. This later enables reusing
@cxlhdm for probing / controlling "global enable" for the HDM Decoder
Capability. For now, just do the nominal move.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291691167.1426646.7936109077255288258.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Now that nothing external to cxl_hdm_decode_init() considers
'struct cxl_endpoint_dvec_info' move it internal to
cxl_hdm_decode_init().
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291690612.1426646.7866084245521113414.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In preparation for changing how the driver handles 'mem_enable' in the CXL
DVSEC control register. Merge the contents of cxl_hdm_decode_init() into
cxl_dvsec_ranges() and rename the combined function cxl_hdm_decode_init().
The possible cleanups and fixes that result from this merge are saved for a
follow-on change.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165291690027.1426646.10249756632415633752.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When a device does not have mem_enable set then the current range
settings are moot. Skip the enumeration and cause cxl_hdm_decode_init()
to proceed directly to enable the HDM Decoder Capability.
Fixes: 560f785590 ("cxl/pci: Retrieve CXL DVSEC memory info")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291689442.1426646.18012291761753694336.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In preparation for fixing the setting of the 'mem_enabled' bit in CXL
DVSEC Control register, move all CXL DVSEC range enumeration into the
same source file.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291688886.1426646.15046138604010482084.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Allow cxl_await_media_ready() to be mocked for testing purposes rather
than carrying the maintenance burden of an indirect function call in the
mainline driver.
With the move cxl_await_media_ready() can no longer reuse the mailbox
timeout override, so add a media_ready_timeout module parameter to the
core to backfill.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291688340.1426646.4755627801983775011.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In preparation for validating DVSEC ranges against the platform declared
CXL memory ranges (ACPI CFMWS) move port enumeration before the
endpoint's decoder validation. Ultimately this logic will move to the
port driver, but create a bisect point before that larger move.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291687749.1426646.18091538443879226995.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The addition of cxl_mem_active() broke error exit scenarios for
cxl_mem_probe(). Return early rather than proceed with disabling
suspend, and update the label name since it is no longer a terminal
"out" label that exits the function.
Fixes: 9ea4dcf498 ("PM: CXL: Disable suspend")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291687176.1426646.15449254938752532784.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
A check mem_info_valid already happens in __cxl_dvsec_ranges(). Rely on
that instead of calling wait_for_valid again.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291686632.1426646.7479581732894574486.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Now that wait_for_media() does nothing supplemental to
wait_for_media_ready() just promote wait_for_media_ready() to a common
helper and drop wait_for_media().
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291686046.1426646.4390664747934592185.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Media ready is asserted by the device independent of whether mem_enabled
was ever set. Drop this check to allow for dropping wait_for_media() in
favor of ->wait_media_ready().
Fixes: 8dd2bc0f8e ("cxl/mem: Add the cxl_mem driver")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165291685501.1426646.10372821863672431074.stgit@dwillia2-xfh
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Now that all CXL subsystem locking is validated with custom lock
classes, there is no need for the custom usage of the lockdep_mutex.
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165055520383.3745911.53447786039115271.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The CXL "root" device, ACPI0017, is an attach point for coordinating
platform level CXL resources and is the parent device for a CXL port
topology tree. As such it has distinct locking rules relative to other
CXL subsystem objects, but because it is an ACPI device the lock class
is established well before it is given to the cxl_acpi driver.
However, the lockdep API does support changing the lock class "live" for
situations like this. Add a device_lock_set_class() helper that a driver
can use in ->probe() to set a custom lock class, and
device_lock_reset_class() to return to the default "no validate" class
before the custom lock class key goes out of scope after ->remove().
Note the helpers are all macros to support dead code elimination in the
CONFIG_PROVE_LOCKING=n case, however device_set_lock_class() still needs
#ifdef CONFIG_PROVE_LOCKING since lockdep_match_class() explicitly does
not have a helper in the CONFIG_PROVE_LOCKING=n case (see comment in
lockdep.h). The lockdep API needs 2 small tweaks to prevent "unused"
warnings for the @key argument to lock_set_class(), and a new
lock_set_novalidate_class() is added to supplement
lockdep_set_novalidate_class() in the cases where the lock class is
converted while the lock is held.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165100081305.1528964.11138612430659737238.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In response to an attempt to expand dev->lockdep_mutex for device_lock()
validation [1], Peter points out [2] that the lockdep API already has
the ability to assign a dedicated lock class per subsystem device-type.
Use lockdep_set_class() to override the default device_lock()
'__lockdep_no_validate__' class for each CXL subsystem device-type. This
enables lockdep to detect deadlocks and recursive locking within the
device-driver core and the subsystem. The
lockdep_set_class_and_subclass() API is used for port objects that
recursively lock the 'cxl_port_key' class by hierarchical topology
depth.
Link: https://lore.kernel.org/r/164982968798.684294.15817853329823976469.stgit@dwillia2-desk3.amr.corp.intel.com [1]
Link: https://lore.kernel.org/r/Ylf0dewci8myLvoW@hirez.programming.kicks-ass.net [2]
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165055519317.3745911.7342499516839702840.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This should be bitwise & instead of &&.
Fixes: 6179045ccc ("cxl/mbox: Block immediate mode in SET_PARTITION_INFO command")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/YmpgkbbQ1Yxu36uO@kili
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
vmemdup_user() returns an ERR_PTR() on failure. Use IS_ERR()
to check the return value.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20220407010915.1211258-1-alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Payload sizes for mailbox commands are expected to be positive values
coming from userspace. The documentation correctly describes these as
always unsigned values. The mailbox and send structures that support
the mailbox commands however, use __s32 types for the payloads.
Replace __s32 with __u32 in the mailbox and send command structures
and update usages.
Kernel users of the interface already block all negative values and
there is no known ability for userspace to have grown a dependency on
submitting negative values to the kernel. The known user of the IOCTL,
the CXL command line interface (cxl-cli) already enforces positive
size values.
A Smatch warning of a signedness uncovered this issue.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/20220414051246.1244575-1-alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The CXL specification claims S3 support at a hardware level, but at a
system software level there are some missing pieces. Section 9.4 (CXL
2.0) rightly claims that "CXL mem adapters may need aux power to retain
memory context across S3", but there is no enumeration mechanism for the
OS to determine if a given adapter has that support. Moreover the save
state and resume image for the system may inadvertantly end up in a CXL
device that needs to be restored before the save state is recoverable.
I.e. a circular dependency that is not resolvable without a third party
save-area.
Arrange for the cxl_mem driver to fail S3 attempts. This still nominaly
allows for suspend, but requires unbinding all CXL memory devices before
the suspend to ensure the typical DRAM flow is taken. The cxl_mem unbind
flow is intended to also tear down all CXL memory regions associated
with a given cxl_memdev.
It is reasonable to assume that any device participating in a System RAM
range published in the EFI memory map is covered by aux power and
save-area outside the device itself. So this restriction can be
minimized in the future once pre-existing region enumeration support
arrives, and perhaps a spec update to clarify if the EFI memory map is
sufficent for determining the range of devices managed by
platform-firmware for S3 support.
Per Rafael, if the CXL configuration prevents suspend then it should
fail early before tasks are frozen, and mem_sleep should stop showing
'mem' as an option [1]. Effectively CXL augments the platform suspend
->valid() op since, for example, the ACPI ops are not aware of the CXL /
PCI dependencies. Given the split role of platform firmware vs OS
provisioned CXL memory it is up to the cxl_mem driver to determine if
the CXL configuration has elements that platform firmware may not be
prepared to restore.
Link: https://lore.kernel.org/r/CAJZ5v0hGVN_=3iU8OLpHY3Ak35T5+JcBM-qs8SbojKrpd0VXsA@mail.gmail.com [1]
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Len Brown <len.brown@intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/165066828317.3907920.5690432272182042556.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
cxl_mem_probe() already emits a log message when HDM operation can not
be established. Delete the similar one in cxl_hdm_decode_init().
What is less obvious is why global_ctrl being enabled makes positive
values of info->ranges irrelevant, and the Linux behavior with respect
to the spec recommendation to mirror CXL Range registers with HDM
Decoder Base + Size registers.
Cc: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/164944616743.454665.7055846627973202403.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
cxl_dvsec_decode_init() is tasked with checking whether legacy DVSEC
range based decode is in effect, or whether HDM can be enabled / already
is enabled. As such it either succeeds or fails and that result is the
return value. The @do_hdm_init variable is misleading in the case where
HDM operation is already found to be active, so just call it @retval.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/164730736435.3806189.2537160791687837469.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
cxl_dvsec_ranges(), the helper for enumerating the presence of an active
legacy CXL.mem configuration on a CXL 2.0 Memory Expander, is not fatal
for cxl_pci because there is still value to enable mailbox operations
even if CXL.mem operation is disabled. Recall that the reason cxl_pci
does this initialization and not cxl_mem is to preserve the useful
property (for unit testing) that cxl_mem is cxl_memdev + mmio generic,
and does not require access to a 'struct pci_dev' to issue config
cycles.
Update 'struct cxl_endpoint_dvsec_info' to carry either a positive
number of non-zero size legacy CXL DVSEC ranges, or the negative error
code from __cxl_dvsec_ranges() in its @ranges member.
Reported-by: Krzysztof Zach <krzysztof.zach@intel.com>
Fixes: 560f785590 ("cxl/pci: Retrieve CXL DVSEC memory info")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/164730735869.3806189.4032428192652531946.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In preparation for the cxl_pci driver to continue operation after
cxl_dvsec_range() failure, update cxl_mem to check for negative error
codes in info->ranges. Treat that condition as fatal regardless of the
state of the HDM configuration since cxl_mem needs positive confirmation
that legacy ranges were not established by platform firmware or another
agent.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com.
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/164730735324.3806189.4167509857771192422.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In preparation for not treating DVSEC range initialization failures as
fatal to cxl_pci_probe() add individual dev_dbg() statements for each of
the major failure reasons in cxl_dvsec_ranges().
The rationale for cxl_dvsec_ranges() failure not being fatal is that
there is still value for cxl_pci to enable mailbox operations even if
CXL.mem operation is disabled.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/164730734812.3806189.2726330688692684104.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
When the driver finds legacy DVSEC ranges active on a CXL Memory
Expander it indicates that platform firmware is not aware of, or is
deliberately disabling common CXL 2.0 operation. In this case Linux
generally has no choice, but to leave the device alone.
The driver attempts to validate that the DVSEC range is in the EFI
memory map. Remove that logic since there is no requirement that the
BIOS publish DVSEC ranges in the EFI Memory Map.
In the future the driver will want to permanently reserve this capacity
out of the available CFMWS capacity and hide it from
request_free_mem_region(), but it serves no purpose to warn about the
range not appearing in the EFI Memory Map.
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164730734246.3806189.13995924771963139898.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Use the global cxl_mbox_cmd_rc table to improve debug messaging
in __cxl_pci_mbox_send_cmd() and allow cxl_mbox_send_cmd()
to map to proper kernel style errno codes - this patch
continues to use -ENXIO only so no change in semantics.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/20220404021216.66841-5-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Upon a completed command the caller is still expected to check
the actual return_code register to ensure it succeed. This
adds, per the spec, the potential command return codes. It maps
the hardware return code with the kernel's errno style, and by
default continues to use -ENXIO (Command completed, but device
reported an error).
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/20220404021216.66841-4-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Also mention the need for the caller to check against any
errors from the hardware in return_code.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/20220404021216.66841-3-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
... we have lockdep for this.
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed by: Adam Manzanares <a.manzanares@samsung.com>
Link: https://lore.kernel.org/r/20220404021216.66841-2-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
With SET_PARTITION_INFO on the exclusive_cmds list for the CXL_PMEM
driver, userspace cannot execute a set-partition command without
first unbinding the pmem driver from the device.
When userspace requests a partition change to take effect on the
next reboot this unbind requirement is unnecessarily restrictive.
The driver does not need to enforce an unbind because partitions
will not change until the next reboot. Of course, userspace still
needs to be aware that changing the size of persistent capacity
on the next reboot will result in the loss of data stored. That
can happen regardless of whether it is presently bound at the time
of issuing the set-partition command.
When userspace requests a partition change to take effect immediately,
restrictions are needed. The CXL_MEM driver currently blocks the usage
of immediate mode, making the presence of SET_PARTITION_INFO, in this
exclusive commands list, redundant.
In the future, when the CXL_MEM driver adds support for immediate
changes to device partitions it will ensure that the partition change
will not affect any active decode. That means the work will not fall
right back here, onto the CXL_PMEM driver.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/accc6abc878f0662093b81490a1a052f2ff6f06e.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
User space may send the SET_PARTITION_INFO mailbox command using
the IOCTL interface. Inspect the input payload and fail if the
immediate flag is set.
This is the first instance of the driver inspecting an input payload
from user space. Assume there will be more such cases and implement
with an extensible helper.
In order for the kernel to react to an immediate partition change it
needs to assert that the change will not affect any active decode. At
a minimum this requires validating that the device is using HDM
decoders instead of the CXL DVSEC for decode, and that none of the
active HDM decoders are affected by the partition change. For now,
just fail until that support arrives.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/241821186c363833980adbc389e2c547bc5a6395.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
cxl_validate_command_from_user() is now the single point of validation
for mailbox commands coming from user space. Previously, it returned a
a cxl_mem_command, but that was not sufficient when validation of the
actual mailbox command became a requirement. Now, it returns a fully
validated cxl_mbox_cmd.
Remove the extraneous cxl_mem_command parameter. Define and use a
local version only.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/c11a437896d914daf36f5ac8ec62f999c5ec2da7.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Previously, handle_mailbox_cmd_from_user(), constructed the mailbox
command and dispatched it to the hardware. The construction work
has moved to the validation path.
handle_mailbox_cmd_from_user() now expects a fully validated
mbox param. Make it's caller, cxl_send_cmd(), deliver it. Update
the comments and dereferencing of the new mbox parameter.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/77050ba512d6c30eccf7505467509e460dd325a0.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In preparation for removing access to struct cxl_mem_command,
change this debug message to use cxl_mbox_cmd fields instead.
Retrieve the pretty command name from cxl_mbox_cmd using a new
opcode to command name helper.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/57265751d336a6e95f5ca31a9c77189408b05742.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This is a step in refactoring the handling of user space mailbox
commands. The intent is to have all the validation work originate
in cxl_validate_cmd_from_user().
Move the construction and validation of a mailbox command to the
validation path. Continue to pass both the out_cmd and the mbox_cmd
until handle_mbox_cmd_from_user() learns how to use a mbox_cmd param.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/c9fbdad968a2b619f9108bb6c37cef1a853cdf5a.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
This move serves two purposes: 1) Emit the warning in the raw
command validation path, and 2) Remove the dependency on the
struct cxl_mem_command in handle_mailbox_cmd_from_user() in
preparation for a refactor of that function.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/df5f0e0ec8afa1f75299aa86b4226ab4479ef325.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Sanitizing and constructing a cxl_mem_command from a userspace
command is part of the validation process prior to submitting
the command to a CXL device. Move this work to helper functions:
cxl_to_mem_cmd(), cxl_to_mem_cmd_raw().
This declutters cxl_validate_cmd_from_user() in preparation for
adding new validation steps.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/7d9b826f29262e3a484cb4bb7b63872134d60bd7.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Fix the following coccicheck warning:
./drivers/cxl/core/port.c:913:21-24: ERROR: port is NULL but dereferenced.
The put_device() is only relevent in the is_cxl_root() case.
Fixes: 2703c16c75 ("cxl/core/port: Add switch port enumeration")
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Link: https://lore.kernel.org/r/20220307094158.404882-1-wanjiabing@vivo.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
KASAN + DEBUG_KOBJECT_RELEASE reports a potential use-after-free in
cxl_decoder_release() where it goes to reference its parent, a cxl_port,
to free its id back to port->decoder_ida.
BUG: KASAN: use-after-free in to_cxl_port+0x18/0x90 [cxl_core]
Read of size 8 at addr ffff888119270908 by task kworker/35:2/379
CPU: 35 PID: 379 Comm: kworker/35:2 Tainted: G OE 5.17.0-rc2+ #198
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Workqueue: events kobject_delayed_cleanup
Call Trace:
<TASK>
dump_stack_lvl+0x59/0x73
print_address_description.constprop.0+0x1f/0x150
? to_cxl_port+0x18/0x90 [cxl_core]
kasan_report.cold+0x83/0xdf
? to_cxl_port+0x18/0x90 [cxl_core]
to_cxl_port+0x18/0x90 [cxl_core]
cxl_decoder_release+0x2a/0x60 [cxl_core]
device_release+0x5f/0x100
kobject_cleanup+0x80/0x1c0
The device core only guarantees parent lifetime until all children are
unregistered. If a child needs a parent to complete its ->release()
callback that child needs to hold a reference to extend the lifetime of
the parent.
Fixes: 40ba17afdf ("cxl/acpi: Introduce cxl_decoder objects")
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Tested-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164505751190.4175768.13324905271463416712.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
An endpoint can be unregistered via two paths. Either its parent port is
unregistered, or the memdev that registered the endpoint is removed. The
memdev remove path is responsible for synchronizing against the parent
->remove() event and if the memdev remove path wins, manually trigger
unregister_port() via devm_release_action(). Until that race is resolved
the memdev remove path holds a reference on the endpoint.
If the parent port for the endpoint can not be found that is an
indication that the endpoint has already been registered. Be sure to
drop the reference in all exit paths from delete_endpoint().
Fixes: 8dd2bc0f8e ("cxl/mem: Add the cxl_mem driver")
Link: https://lore.kernel.org/r/164454148209.3429624.12905500880311609053.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
If cxl_device_lock() is used on a non-CXL device the expectation is that
the lock class will fall back to CXL_ANON_LOCK. Instead it crashes when
trying to determine if the device is a 'decoder'. Specifically when the
device has a NULL type pointer. Just check for NULL before
de-referencing ->release.
Fixes: 3c5b903955 ("cxl: Prove CXL locking")
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164439225406.2941117.3927102269866914339.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The device_lock_assert() in unregister_port() fails to pick the right
device leading to splats like the following from:
echo "ACPI0017:00" > /sys/bus/platform/drivers/cxl_acpi/unbind
WARNING: CPU: 32 PID: 1147 at include/linux/device.h:787 unregister_port+0x49/0x50 [cxl_c
[..]
RIP: 0010:unregister_port+0x49/0x50 [cxl_core]
[..]
Call Trace:
<TASK>
release_nodes+0x63/0x80
devres_release_all+0x8b/0xc0
__device_release_driver+0x190/0x240
device_driver_detach+0x3e/0xa0
unbind_store+0x113/0x130
Fix it up to assert on the device_lock() for ACPI0017 for root and 1st
level ports, and parent ports for all the rest.
Fixes: 54cdbf845c ("cxl/port: Add a driver for 'struct cxl_port' objects")
Reported-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164439224893.2941117.18331456248117887720.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In CXL 2.0, 8.2.5.1 CXL Capability Header Register: this register
is given as 32 bits.
8.2.3 which covers the CXL 2.0 Component registers, including the
CXL Capability Header Register states that access restrictions
specified in Section 8.2.2 apply.
8.2.2 includes:
* A 32 bit register shall be accessed as a 4 Byte quantity.
...
If these rules are not followed, the behavior is undefined.
Discovered during review of CXL QEMU emulation. Alex Bennée pointed
out there was a comment saying that 4 byte registers must be read
with a 4 byte read, but 8 byte reads were being emulated.
https://lore.kernel.org/qemu-devel/87bkzyd3c7.fsf@linaro.org/
Fixing that, led to this code failing. Whilst a given hardware
implementation 'might' work with an 8 byte read, it should not be relied
upon. The QEMU emulation v5 will return 0 and log the wrong access width.
The code moved, so one fixes tag for where this will directly apply and
also a reference to the earlier introduction of the code for backports.
Fixes: 0f06157e01 ("cxl/core: Move register mapping infrastructure")
Fixes: 08422378c4 ("cxl/pci: Add HDM decoder capabilities")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/20220201153437.2873-1-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In case init_hdm_decoder() finds invalid settings, skip to the next
valid decoder. Only fail port enumeration if zero valid decoders are
found. This protects the driver init against broken hardware and / or
future interleave capabilities.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/164317464918.3438644.12371149695618136198.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Recall that a CXL Port is any object that publishes a CXL HDM Decoder
Capability structure. That is Host Bridge and Switches that have been
enabled so far. Now, add decoder support to the 'endpoint' CXL Ports
registered by the cxl_mem driver. They mostly share the same enumeration
as Bridges and Switches, but witout a target list. The target of
endpoint decode is device-internal DPA space, not another downstream
port.
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
[djbw: clarify changelog, hookup enumeration in the port driver]
Link: https://lore.kernel.org/r/164386092069.765089.14895687988217608642.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
In preparation for introducing endpoint decoder objects, move the
target_list attribute out of the common set since it has no meaning for
endpoint decoders.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164298430100.3018233.4715072508880290970.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>