Commit Graph

35 Commits

Author SHA1 Message Date
Dave Jiang
3381586a40 cxl: Fix compile warning for cxl_security_ops extern
Jonathan reported he has observed compiler warning when using running with
W=1 C=1 for cxl_security_ops that is declared as an extern in cxl/pmem.c.
Move to cxl.h to make it visible to all cxl sources.

Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/linux-cxl/167771196186.3285982.18283746206612049722.stgit@djiang5-mobl3.local/
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30 10:43:48 -07:00
Dan Williams
59f8d15107 cxl/mbox: Move mailbox related driver state to its own data structure
'struct cxl_dev_state' makes too many assumptions about the capabilities
of a CXL device. In particular it assumes a CXL device has a mailbox and
all of the infrastructure and state that comes along with that.

In preparation for supporting accelerator / Type-2 devices that may not
have a mailbox and in general maintain a minimal core context structure,
make mailbox functionality a super-set of  'struct cxl_dev_state' with
'struct cxl_memdev_state'.

With this reorganization it allows for CXL devices that support HDM
decoder mapping, but not other general-expander / Type-3 capabilities,
to only enable that subset without the rest of the mailbox
infrastructure coming along for the ride.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168679260240.3436160.15520641540463704524.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25 14:31:08 -07:00
Dan Williams
f57aec443c cxl/pmem: Fix nvdimm registration races
A loop of the form:

    while true; do modprobe cxl_pci; modprobe -r cxl_pci; done

...fails with the following crash signature:

    BUG: kernel NULL pointer dereference, address: 0000000000000040
    [..]
    RIP: 0010:cxl_internal_send_cmd+0x5/0xb0 [cxl_core]
    [..]
    Call Trace:
     <TASK>
     cxl_pmem_ctl+0x121/0x240 [cxl_pmem]
     nvdimm_get_config_data+0xd6/0x1a0 [libnvdimm]
     nd_label_data_init+0x135/0x7e0 [libnvdimm]
     nvdimm_probe+0xd6/0x1c0 [libnvdimm]
     nvdimm_bus_probe+0x7a/0x1e0 [libnvdimm]
     really_probe+0xde/0x380
     __driver_probe_device+0x78/0x170
     driver_probe_device+0x1f/0x90
     __device_attach_driver+0x85/0x110
     bus_for_each_drv+0x7d/0xc0
     __device_attach+0xb4/0x1e0
     bus_probe_device+0x9f/0xc0
     device_add+0x445/0x9c0
     nd_async_device_register+0xe/0x40 [libnvdimm]
     async_run_entry_fn+0x30/0x130

...namely that the bottom half of async nvdimm device registration runs
after the CXL has already torn down the context that cxl_pmem_ctl()
needs. Unlike the ACPI NFIT case that benefits from launching multiple
nvdimm device registrations in parallel from those listed in the table,
CXL is already marked PROBE_PREFER_ASYNCHRONOUS. So provide for a
synchronous registration path to preclude this scenario.

Fixes: 21083f5152 ("cxl/pmem: Register 'pmem' / cxl_nvdimm devices")
Cc: <stable@vger.kernel.org>
Reported-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-13 17:01:05 -08:00
Dan Williams
19398821b2 cxl/pmem: Fix nvdimm unregistration when cxl_pmem driver is absent
The cxl_pmem.ko module houses the driver for both cxl_nvdimm_bridge
objects and cxl_nvdimm objects. When the core creates a cxl_nvdimm it
arranges for it to be autoremoved when the bridge goes down. However, if
the bridge never initialized because the cxl_pmem.ko module never
loaded, it sets up a the following crash scenario:

    BUG: kernel NULL pointer dereference, address: 0000000000000478
    [..]
    RIP: 0010:cxl_nvdimm_probe+0x99/0x140 [cxl_pmem]
    [..]
    Call Trace:
     <TASK>
     cxl_bus_probe+0x17/0x50 [cxl_core]
     really_probe+0xde/0x380
     __driver_probe_device+0x78/0x170
     driver_probe_device+0x1f/0x90
     __driver_attach+0xd2/0x1c0
     bus_for_each_dev+0x79/0xc0
     bus_add_driver+0x1b1/0x200
     driver_register+0x89/0xe0
     cxl_pmem_init+0x50/0xff0 [cxl_pmem]

It turns out the recent rework to simplify nvdimm probing obviated the
need to unregister cxl_nvdimm objects at cxl_nvdimm_bridge ->remove()
time. Leave the cxl_nvdimm device registered until the hosting
cxl_memdev departs. The alternative is that the cxl_memdev needs to be
reattached whenever the cxl_nvdimm_bridge attach state cycles, which is
awkward and unnecessary.

The only requirement is to make sure that when the cxl_nvdimm_bridge
goes away any dependent cxl_nvdimm objects are shutdown. Handle that in
unregister_nvdimm_bus().

With these registration entanglements removed there is no longer a need
to pre-load the cxl_pmem module in cxl_acpi.

Fixes: cb9cfff82f ("cxl/acpi: Simplify cxl_nvdimm_bridge probing")
Reported-by: Gregory Price <gregory.price@memverge.com>
Debugged-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167426077263.3955046.9695309346988027311.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-25 15:35:26 -08:00
Dan Williams
5331cdf44d cxl/mbox: Enable cxl_mbox_send_cmd() users to validate output size
Internally cxl_mbox_send_cmd() converts all passed-in parameters to a
'struct cxl_mbox_cmd' instance and sends that to cxlds->mbox_send(). It
then teases the possibilty that the caller can validate the output size.
However, they cannot since the resulting output size is not conveyed to
the called. Fix that by making the caller pass in a constructed 'struct
cxl_mbox_cmd'. This prepares for a future patch to add output size
validation on a per-command basis.

Given the change in signature, also change the name to differentiate it
from the user command submission path that performs more validation
before generating the 'struct cxl_mbox_cmd' instance to execute.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/167030055370.4044561.17788093375112783036.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-06 14:36:02 -08:00
Dan Williams
95dddcb5e8 Merge branch 'for-6.2/cxl-security' into for-6.2/cxl
Pick CXL PMEM security commands for v6.2. Resolve conflicts with the
removal of the cxl_pmem_wq.
2022-12-05 12:30:38 -08:00
Dave Jiang
b5807c80b5 cxl: add dimm_id support for __nvdimm_create()
Set the cxlds->serial as the dimm_id to be fed to __nvdimm_create(). The
security code uses that as the key description for the security key of the
memory device. The nvdimm unlock code cannot find the respective key
without the dimm_id.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166863357043.80269.4337575149671383294.stgit@djiang5-desk3.ch.intel.com
Link: https://lore.kernel.org/r/166983620459.2734609.10175456773200251184.stgit@djiang5-desk3.ch.intel.com
Link: https://lore.kernel.org/r/166993219918.1995348.10786511454826454601.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-02 23:52:32 -08:00
Dan Williams
03ff079aa6 cxl/pmem: Remove the cxl_pmem_wq and related infrastructure
Now that cxl_nvdimm and cxl_pmem_region objects are torn down
sychronously with the removal of either the bridge, or an endpoint, the
cxl_pmem_wq infrastructure can be jettisoned.

Tested-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/166993042335.1882361.17022872468068436287.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-02 23:07:22 -08:00
Dan Williams
f17b558d66 cxl/pmem: Refactor nvdimm device registration, delete the workqueue
The three objects 'struct cxl_nvdimm_bridge', 'struct cxl_nvdimm', and
'struct cxl_pmem_region' manage CXL persistent memory resources. The
bridge represents base platform resources, the nvdimm represents one or
more endpoints, and the region is a collection of nvdimms that
contribute to an assembled address range.

Their relationship is such that a region is torn down if any component
endpoints are removed. All regions and endpoints are torn down if the
foundational bridge device goes down.

A workqueue was deployed to manage these interdependencies, but it is
difficult to reason about, and fragile. A recent attempt to take the CXL
root device lock in the cxl_mem driver was reported by lockdep as
colliding with the flush_work() in the cxl_pmem flows.

Instead of the workqueue, arrange for all pmem/nvdimm devices to be torn
down immediately and hierarchically. A similar change is made to both
the 'cxl_nvdimm' and 'cxl_pmem_region' objects. For bisect-ability both
changes are made in the same patch which unfortunately makes the patch
bigger than desired.

Arrange for cxl_memdev and cxl_region to register a cxl_nvdimm and
cxl_pmem_region as a devres release action of the bridge device.
Additionally, include a devres release action of the cxl_memdev or
cxl_region device that triggers the bridge's release action if an endpoint
exits before the bridge. I.e. this allows either unplugging the bridge,
or unplugging and endpoint to result in the same cleanup actions.

To keep the patch smaller the cleanup of the now defunct workqueue
infrastructure is saved for a follow-on patch.

Tested-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/166993041773.1882361.16444301376147207609.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-02 23:07:22 -08:00
Dan Williams
16d53cb0d6 cxl/region: Drop redundant pmem region release handling
Now that a cxl_nvdimm object can only experience ->remove() via an
unregistration event (because the cxl_nvdimm bind attributes are
suppressed), additional cleanups are possible.

It is already the case that the removal of a cxl_memdev object triggers
->remove() on any associated region. With that mechanism in place there
is no need for the cxl_nvdimm removal to trigger the same. Just rely on
cxl_region_detach() to tear down the whole cxl_pmem_region.

Tested-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/166993041215.1882361.6321535567798911286.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-02 23:06:29 -08:00
Dan Williams
cb9cfff82f cxl/acpi: Simplify cxl_nvdimm_bridge probing
The 'struct cxl_nvdimm_bridge' object advertises platform CXL PMEM
resources. It coordinates with libnvdimm to attach nvdimm devices and
regions for each corresponding CXL object. That coordination is
complicated, i.e. difficult to reason about, and it turns out redundant.
It is already the case that the CXL core knows how to tear down a
cxl_region when a cxl_memdev goes through ->remove(), so that pathway
can be extended to directly cleanup cxl_nvdimm and cxl_pmem_region
objects.

Towards the goal of ripping out the cxl_nvdimm_bridge state machine,
arrange for cxl_acpi to optionally pre-load the cxl_pmem driver so that
the nvdimm bridge is active synchronously with
devm_cxl_add_nvdimm_bridge(), and remove all the bind attributes for the
cxl_nvdimm* objects since the cxl root device and cxl_memdev bind
attributes are sufficient.

Tested-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/166993040668.1882361.7450361097265836752.stgit@dwillia2-xfh.jf.intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-01 15:52:36 -08:00
Dave Jiang
452996fa07 cxl/pmem: add provider name to cxl pmem dimm attribute group
Add provider name in order to associate cxl test dimm from cxl_test to the
cxl pmem device when going through sysfs for security testing.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983618174.2734609.15600031015423828810.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-01 12:42:35 -08:00
Dave Jiang
bd429e5355 cxl/pmem: add id attribute to CXL based nvdimm
Add an id group attribute for CXL based nvdimm object. The addition allows
ndctl to display the "unique id" for the nvdimm. The serial number for the
CXL memory device will be used for this id.

[
  {
      "dev":"nmem10",
      "id":"0x4",
      "security":"disabled"
  },
]

The id attribute is needed by the ndctl security key management to setup a
keyblob with a unique file name tied to the mem device.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983617029.2734609.8251308562882142281.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-12-01 12:42:35 -08:00
Dave Jiang
3282811555 cxl/pmem: Introduce nvdimm_security_ops with ->get_flags() operation
Add nvdimm_security_ops support for CXL memory device with the introduction
of the ->get_flags() callback function. This is part of the "Persistent
Memory Data-at-rest Security" command set for CXL memory device support.
The ->get_flags() function provides the security state of the persistent
memory device defined by the CXL 3.0 spec section 8.2.9.8.6.1.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/166983609611.2734609.13231854299523325319.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-30 16:30:47 -08:00
Dan Williams
4d07ae22e7 cxl/pmem: Fix cxl_pmem_region and cxl_memdev leak
When a cxl_nvdimm object goes through a ->remove() event (device
physically removed, nvdimm-bridge disabled, or nvdimm device disabled),
then any associated regions must also be disabled. As highlighted by the
cxl-create-region.sh test [1], a single device may host multiple
regions, but the driver was only tracking one region at a time. This
leads to a situation where only the last enabled region per nvdimm
device is cleaned up properly. Other regions are leaked, and this also
causes cxl_memdev reference leaks.

Fix the tracking by allowing cxl_nvdimm objects to track multiple region
associations.

Cc: <stable@vger.kernel.org>
Link: https://github.com/pmem/ndctl/blob/main/test/cxl-create-region.sh [1]
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Fixes: 04ad63f086 ("cxl/region: Introduce cxl_pmem_region objects")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/166752183647.947915.2045230911503793901.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04 15:58:35 -07:00
Yu Zhe
4f1aa35f1f cxl/pmem: Use size_add() against integer overflow
"struct_size() + n" may cause a integer overflow,
use size_add() to handle it.

Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
Link: https://lore.kernel.org/r/20220927070247.23148-1-yuzhe@nfschina.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-03 11:20:46 -07:00
Jonathan Cameron
f010c75c05 cxl/pmem: Fix failure to account for 8 byte header for writes to the device LSA.
Writes to the device must include an offset and size as defined in
CXL 2.0 8.2.9.5.2.4 Set LSA (Opcode 4103h)

Fixes tag is non obvious as this code has been through several
reworks and variable names + wasn't in use until the addition
of the region code.

Due to a bug in QEMU CXL emulation this overrun resulted in QEMU
crashing.

Reported-by: Bobo WL <lmw.bobo@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Fixes: 60b8f17215 ("cxl/pmem: Translate NVDIMM label commands to CXL label commands")
Link: https://lore.kernel.org/r/20220815154044.24733-3-Jonathan.Cameron@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-10-20 16:28:53 -07:00
Dan Carpenter
9fd2cf4d6f cxl/region: Fix IS_ERR() vs NULL check
The nvdimm_pmem_region_create() function returns NULL on error.  It does
not return error pointers.

Fixes: 04ad63f086 ("cxl/region: Introduce cxl_pmem_region objects")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/Yuo65lq2WtfdGJ0X@kili
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-08-05 08:41:02 -07:00
Dan Williams
04ad63f086 cxl/region: Introduce cxl_pmem_region objects
The LIBNVDIMM subsystem is a platform agnostic representation of system
NVDIMM / persistent memory resources. To date, the CXL subsystem's
interaction with LIBNVDIMM has been to register an nvdimm-bridge device
and cxl_nvdimm objects to proxy CXL capabilities into existing LIBNVDIMM
subsystem mechanics.

With regions the approach is the same. Create a new cxl_pmem_region
object to proxy CXL region details into a LIBNVDIMM definition. With
this enabling LIBNVDIMM can partition CXL persistent memory regions with
legacy namespace labels. A follow-on patch will add CXL region label and
CXL namespace label support to persist region configurations across
driver reload / system-reset events.

Co-developed-by: Ben Widawsky <bwidawsk@kernel.org>
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784340111.1758207.3036498385188290968.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-26 12:23:01 -07:00
Dan Williams
99183d26ed cxl/pmem: Fix offline_nvdimm_bus() to offline by bridge
Be careful to only disable cxl_pmem objects related to a given
cxl_nvdimm_bridge. Otherwise, offline_nvdimm_bus() reaches across CXL
domains and disables more than is expected.

Fixes: 21083f5152 ("cxl/pmem: Register 'pmem' / cxl_nvdimm devices")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784339569.1758207.1557084545278004577.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-26 12:23:01 -07:00
Alison Schofield
8a66487506 cxl/mbox: Use __le32 in get,set_lsa mailbox structures
CXL specification defines these as little endian.

Fixes: 60b8f17215 ("cxl/pmem: Translate NVDIMM label commands to CXL label commands")
Reported-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/20220225221456.1025635-1-alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-06-21 14:09:00 -07:00
Dan Williams
38a34e1076 cxl: Drop cxl_device_lock()
Now that all CXL subsystem locking is validated with custom lock
classes, there is no need for the custom usage of the lockdep_mutex.

Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/165055520383.3745911.53447786039115271.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-28 14:01:55 -07:00
Alison Schofield
6aa657f416 cxl/pmem: Remove CXL SET_PARTITION_INFO from exclusive_cmds list
With SET_PARTITION_INFO on the exclusive_cmds list for the CXL_PMEM
driver, userspace cannot execute a set-partition command without
first unbinding the pmem driver from the device.

When userspace requests a partition change to take effect on the
next reboot this unbind requirement is unnecessarily restrictive.
The driver does not need to enforce an unbind because partitions
will not change until the next reboot. Of course, userspace still
needs to be aware that changing the size of persistent capacity
on the next reboot will result in the loss of data stored. That
can happen regardless of whether it is presently bound at the time
of issuing the set-partition command.

When userspace requests a partition change to take effect immediately,
restrictions are needed. The CXL_MEM driver currently blocks the usage
of immediate mode, making the presence of SET_PARTITION_INFO, in this
exclusive commands list, redundant.

In the future, when the CXL_MEM driver adds support for immediate
changes to device partitions it will ensure that the partition change
will not affect any active decode. That means the work will not fall
right back here, onto the CXL_PMEM driver.

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/accc6abc878f0662093b81490a1a052f2ff6f06e.1648687552.git.alison.schofield@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 16:07:01 -07:00
Dan Williams
3c5b903955 cxl: Prove CXL locking
When CONFIG_PROVE_LOCKING is enabled the 'struct device' definition gets
an additional mutex that is not clobbered by
lockdep_set_novalidate_class() like the typical device_lock(). This
allows for local annotation of subsystem locks with mutex_lock_nested()
per the subsystem's object/lock hierarchy. For CXL, this primarily needs
the ability to lock ports by depth and child objects of ports by their
parent parent-port lock.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164365853422.99383.1052399160445197427.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:29 -08:00
Dan Williams
53989fad12 cxl/pmem: Fix module reload vs workqueue state
A test of the form:

    while true; do modprobe -r cxl_pmem; modprobe cxl_pmem; done

May lead to a crash signature of the form:

    BUG: unable to handle page fault for address: ffffffffc0660030
    #PF: supervisor instruction fetch in kernel mode
    #PF: error_code(0x0010) - not-present page
    [..]
    Workqueue: cxl_pmem 0xffffffffc0660030
    RIP: 0010:0xffffffffc0660030
    Code: Unable to access opcode bytes at RIP 0xffffffffc0660006.
    [..]
    Call Trace:
     ? process_one_work+0x4ec/0x9c0
     ? pwq_dec_nr_in_flight+0x100/0x100
     ? rwlock_bug.part.0+0x60/0x60
     ? worker_thread+0x2eb/0x700

In that report the 0xffffffffc0660030 address corresponds to the former
function address of cxl_nvb_update_state() from a previous load of the
module, not the current address. Fix that by arranging for ->state_work
in the 'struct cxl_nvdimm_bridge' object to be reinitialized on cxl_pmem
module reload.

Details:

Recall that CXL subsystem wants to link a CXL memory expander device to
an NVDIMM sub-hierarchy when both a persistent memory range has been
registered by the CXL platform driver (cxl_acpi) *and* when that CXL
memory expander has published persistent memory capacity (Get Partition
Info). To this end the cxl_nvdimm_bridge driver arranges to rescan the
CXL bus when either of those conditions change. The helper
bus_rescan_devices() can not be called underneath the device_lock() for
any device on that bus, so the cxl_nvdimm_bridge driver uses a workqueue
for the rescan.

Typically a driver allocates driver data to hold a 'struct work_struct'
for a driven device, but for a workqueue that may run after ->remove()
returns, driver data will have been freed. The 'struct
cxl_nvdimm_bridge' object holds the state and work_struct directly.
Unfortunately it was only arranging for that infrastructure to be
initialized once per device creation rather than the necessary once per
workqueue (cxl_pmem_wq) creation.

Introduce is_cxl_nvdimm_bridge() and cxl_nvdimm_bridge_reset() in
support of invalidating stale references to a recently destroyed
cxl_pmem_wq.

Cc: <stable@vger.kernel.org>
Fixes: 8fdcb1704f ("cxl/pmem: Add initial infrastructure for pmem support")
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/163665474585.3505991.8397182770066720755.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:03:00 -08:00
Ira Weiny
5e2411ae80 cxl/memdev: Change cxl_mem to a more descriptive name
The 'struct cxl_mem' object actually represents the state of a CXL
device within the driver. Comments indicating that 'struct cxl_mem' is a
device itself are incorrect. It is data layered on top of a CXL Memory
Expander class device. Rename it 'struct cxl_dev_state'. The 'struct'
cxl_memdev' structure represents a Linux CXL memory device object, and
it uses services and information provided by 'struct cxl_dev_state'.

Update the structure name, function names, and the kdocs to reflect the
real uses of this structure.

Some helper functions that were previously prefixed "cxl_mem_" are
renamed to just "cxl_".

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20211102202901.3675568-3-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:02:58 -08:00
Dan Williams
08b9e0ab8a cxl/pmem: Fix reference counting for delayed work
There is a potential race between queue_work() returning and the
queued-work running that could result in put_device() running before
get_device(). Introduce the cxl_nvdimm_bridge_state_work() helper that
takes the reference unconditionally, but drops it if no new work was
queued, to keep the references balanced.

Fixes: 8fdcb1704f ("cxl/pmem: Add initial infrastructure for pmem support")
Cc: <stable@vger.kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/163553734757.2509761.3305231863616785470.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-11-15 11:02:58 -08:00
Dan Williams
7d3eb23c4c tools/testing/cxl: Introduce a mock memory device + driver
Introduce an emulated device-set plus driver to register CXL memory
devices, 'struct cxl_memdev' instances, in the mock cxl_test topology.
This enables the development of HDM Decoder (Host-managed Device Memory
Decoder) programming flow (region provisioning) in an environment that
can be updated alongside the kernel as it gains more functionality.

Whereas the cxl_pci module looks for CXL memory expanders on the 'pci'
bus, the cxl_mock_mem module attaches to CXL expanders on the platform
bus emitted by cxl_test.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163116440099.2460985.10692549614409346604.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 14:09:34 -07:00
Dan Williams
49be6dd807 cxl/mbox: Move command definitions to common location
In preparation for cxl_test to mock responses to mailbox command
requests, move some definitions from core/mbox.c to cxlmem.h.

No functional changes intended.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163116439547.2460985.10457111177103589574.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 14:09:34 -07:00
Dan Williams
2e52b6256b cxl/pmem: Add support for multiple nvdimm-bridge objects
In preparation for a mocked unit test environment for CXL objects, allow
for multiple unique nvdimm-bridge objects.

For now, just allow multiple bridges to be registered. Later, when there
are multiple present, further updates are needed to
cxl_find_nvdimm_bridge() to identify which bridge is associated with
which CXL hierarchy for nvdimm registration.

Note that this does change the kernel device-name for the bridge object.
User space should not have any attachment to the device name at this
point as it is still early days in the CXL driver development.

Acked-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163164647007.2831228.2150246954620721526.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:47:10 -07:00
Dan Williams
60b8f17215 cxl/pmem: Translate NVDIMM label commands to CXL label commands
The LIBNVDIMM IOCTL UAPI calls back to the nvdimm-bus-provider to
translate the Linux command payload to the device native command format.
The LIBNVDIMM commands get-config-size, get-config-data, and
set-config-data, map to the CXL memory device commands device-identify,
get-lsa, and set-lsa. Recall that the label-storage-area (LSA) on an
NVDIMM device arranges for the provisioning of namespaces. Additionally
for CXL the LSA is used for provisioning regions as well.

The data from device-identify is already cached in the 'struct cxl_mem'
instance associated with @cxl_nvd, so that payload return is simply
crafted and no CXL command is issued. The conversion for get-lsa is
straightforward, but the conversion for set-lsa requires an allocation
to append the set-lsa header in front of the payload.

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/163122524923.2534512.9431316965424264864.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:47:09 -07:00
Dan Williams
12f3856ad4 cxl/mbox: Add exclusive kernel command support
The CXL_PMEM driver expects exclusive control of the label storage area
space. Similar to the LIBNVDIMM expectation that the label storage area
is only writable from userspace when the corresponding memory device is
not active in any region, the expectation is the native CXL_PCI UAPI
path is disabled while the cxl_nvdimm for a given cxl_memdev device is
active in LIBNVDIMM.

Add the ability to toggle the availability of a given command for the
UAPI path. Use that new capability to shutdown changes to partitions and
the label storage area while the cxl_nvdimm device is actively proxying
commands for LIBNVDIMM.

Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/163164579468.2830966.6980053377428474263.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-09-21 13:44:57 -07:00
Ben Widawsky
5161a55c06 cxl: Move cxl_core to new directory
CXL core is growing, and it's already arguably unmanageable. To support
future growth, move core functionality to a new directory and rename the
file to represent just bus support. Future work will remove non-bus
functionality.

Note that mem.h is renamed to cxlmem.h to avoid a namespace collision
with the global ARCH=um mem.h header.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162792537866.368511.8915631504621088321.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-08-06 08:22:53 -07:00
Dan Williams
21083f5152 cxl/pmem: Register 'pmem' / cxl_nvdimm devices
While a memX device on /sys/bus/cxl represents a CXL memory expander
control interface, a pmemX device represents the persistent memory
sub-functionality. It bridges the CXL subystem to the libnvdimm nmemX
control interface.

With this skeleton ndctl can now see persistent memory devices on a
"CXL" bus. Later patches add support for translating libnvdimm native
commands to CXL commands.

# ndctl list -BDiu -b CXL
{
  "provider":"CXL",
  "dev":"ndbus1",
  "dimms":[
    {
      "dev":"nmem1",
      "state":"disabled"
    },
    {
      "dev":"nmem0",
      "state":"disabled"
    }
  ]
}

Given nvdimm_bus_unregister() removes all devices on an ndbus0 the
cxl_pmem infrastructure needs to arrange ->remove() to be triggered on
cxl_nvdimm devices to keep their enabled state synchronized with the
registration state of their corresponding device on the nvdimm_bus. In
other words, always arrange for cxl_nvdimm_driver.remove() to unregister
nvdimms from an nvdimm_bus ahead of the bus being unregistered.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162380012696.3039556.4293801691038740850.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-15 16:47:34 -07:00
Dan Williams
8fdcb1704f cxl/pmem: Add initial infrastructure for pmem support
Register an 'nvdimm-bridge' device to act as an anchor for a libnvdimm
bus hierarchy. Also, flesh out the cxl_bus definition to allow a
cxl_nvdimm_bridge_driver to attach to the bridge and trigger the
nvdimm-bus registration.

The creation of the bridge is gated on the detection of a PMEM capable
address space registered to the root. The bridge indirection allows the
libnvdimm module to remain unloaded on platforms without PMEM support.

Given that the probing of ACPI0017 is asynchronous to CXL endpoint
devices, and the expectation that CXL endpoint devices register other
PMEM resources on the 'CXL' nvdimm bus, a workqueue is added. The
workqueue is needed to run bus_rescan_devices() outside of the
device_lock() of the nvdimm-bridge device to rendezvous nvdimm resources
as they arrive. For now only the bus is taken online/offline in the
workqueue.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/162379909706.2993820.14051258608641140169.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-06-15 16:47:14 -07:00