linux/drivers/nvdimm
Aneesh Kumar K.V f537669978 libnvdimm/dax: Pick the right alignment default when creating dax devices
Allow the arch to provide the supported alignments, and use hugepage alignment
only if hugepages are supported. Right now we depend on compile-time configs,
whereas this patch switches to runtime discovery.

Architectures like ppc64 can have THP enabled at compile time and still have the
hugepage size disabled by the hypervisor. This patch allows us to create dax
devices with PAGE_SIZE alignment in that case.
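
The runtime selection described above can be sketched as a small user-space
model. This is an illustration only, not the kernel code from this patch: the
struct arch_caps type and both function names are hypothetical, and the
hugepage_usable flag stands in for whatever runtime check the arch provides.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of what the arch reports at runtime (assumption). */
struct arch_caps {
	unsigned long page_size;     /* base page size, e.g. 64K on ppc64 */
	unsigned long hugepage_size; /* huge page size, e.g. 16M on ppc64 */
	bool hugepage_usable;        /* false if the hypervisor disabled it */
};

/*
 * Fill 'out' with the alignments a dax device may use and return the
 * number of entries written. The base page size is always offered; the
 * hugepage size is offered only when it is usable at runtime.
 */
size_t supported_alignments(const struct arch_caps *caps,
			    unsigned long *out, size_t max)
{
	size_t n = 0;

	if (n < max)
		out[n++] = caps->page_size;
	if (caps->hugepage_usable && n < max)
		out[n++] = caps->hugepage_size;
	return n;
}

/* Default alignment: hugepage if usable at runtime, else the base page. */
unsigned long default_alignment(const struct arch_caps *caps)
{
	return caps->hugepage_usable ? caps->hugepage_size : caps->page_size;
}
```

With a 64K base page and the 16M size disabled, this model offers only 65536
and also uses it as the default.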

Existing dax namespaces with an alignment larger than PAGE_SIZE will fail to
initialize in this specific case. fsdax namespace initialization is still
allowed.

As for deciding whether to take hugepage faults on a dax device: if THP is
enabled at compile time, we default to taking a hugepage fault. If the dax
fault handler then finds that the fault size is larger than the device
alignment, we retry with a PAGE_SIZE fault.
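
The retry logic above can be modelled as follows. This is a simplified
user-space sketch, not the kernel's dax fault handler: enum fault_result,
try_fault() and resolve_fault_size() are hypothetical names introduced only
for illustration.

```c
#include <stdbool.h>

/* Hypothetical result codes for one fault attempt (assumption). */
enum fault_result { FAULT_DONE, FAULT_FALLBACK };

/*
 * One fault attempt: a fault larger than the device alignment cannot be
 * satisfied and asks the caller to fall back to a smaller size.
 */
enum fault_result try_fault(unsigned long fault_size, unsigned long dev_align)
{
	return fault_size > dev_align ? FAULT_FALLBACK : FAULT_DONE;
}

/*
 * Start with a hugepage fault when THP was compiled in, then retry with
 * the base page size if the first attempt requests a fallback.
 */
unsigned long resolve_fault_size(unsigned long page_size,
				 unsigned long hugepage_size,
				 bool thp_compiled_in,
				 unsigned long dev_align)
{
	unsigned long size = thp_compiled_in ? hugepage_size : page_size;

	if (try_fault(size, dev_align) == FAULT_FALLBACK)
		size = page_size;
	return size;
}
```

In this model a device with 64K alignment ends up with a 64K fault even when
THP is compiled in, while a device with 16M alignment keeps the 16M fault.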

This also addresses the following failure scenario on ppc64:

ndctl create-namespace --mode=devdax  | grep align
 "align":16777216,
 "align":16777216

cat /sys/devices/ndbus0/region0/dax0.0/supported_alignments
 65536 16777216

daxio.static-debug  -z -o /dev/dax0.0
  Bus error (core dumped)

  $ dmesg | tail
   lpar: Failed hash pte insert with error -4
   hash-mmu: mm: Hashing failure ! EA=0x7fff17000000 access=0x8000000000000006 current=daxio
   hash-mmu:     trap=0x300 vsid=0x22cb7a3 ssize=1 base psize=2 psize 10 pte=0xc000000501002b86
   daxio[3860]: bus error (7) at 7fff17000000 nip 7fff973c007c lr 7fff973bff34 code 2 in libpmem.so.1.0.0[7fff973b0000+20000]
   daxio[3860]: code: 792945e4 7d494b78 e95f0098 7d494b78 f93f00a0 4800012c e93f0088 f93f0120
   daxio[3860]: code: e93f00a0 f93f0128 e93f0120 e95f0128 <f9490000> e93f0088 39290008 f93f0110

The failure was due to the guest kernel using the wrong page size.

Namespaces created with 16M alignment will appear as below on a config with
the 16M page size disabled.

$ ndctl list -Ni
[
  {
    "dev":"namespace0.1",
    "mode":"fsdax",
    "map":"dev",
    "size":5351931904,
    "uuid":"fc6e9667-461a-4718-82b4-69b24570bddb",
    "align":16777216,
    "blockdev":"pmem0.1",
    "supported_alignments":[
      65536
    ]
  },
  {
    "dev":"namespace0.0",
    "mode":"fsdax",    <==== devdax 16M alignment marked disabled.
    "map":"mem",
    "size":5368709120,
    "uuid":"a4bdf81a-f2ee-4bc6-91db-7b87eddd0484",
    "state":"disabled"
  }
]

Cc: linux-mm@kvack.org
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/20190905154603.10349-8-aneesh.kumar@linux.ibm.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-09-24 10:23:41 -07:00
..
badrange.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 2019-06-05 17:36:38 +02:00
blk.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 2019-06-05 17:36:37 +02:00
btt_devs.c driver-core, libnvdimm: Let device subsystems add local lockdep coverage 2019-07-18 16:23:27 -07:00
btt.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 2019-06-05 17:36:37 +02:00
btt.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288 2019-06-05 17:36:37 +02:00
bus.c libnvdimm/pmem: Advance namespace seed for specific probe errors 2019-09-05 16:11:14 -07:00
claim.c libnvdimm: nd_region flush callback support 2019-07-05 15:19:10 -07:00
core.c driver-core, libnvdimm: Let device subsystems add local lockdep coverage 2019-07-18 16:23:27 -07:00
dax_devs.c libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields 2019-07-18 17:08:07 -07:00
dimm_devs.c libnvdimm/security: Consolidate 'security' operations 2019-08-29 13:51:57 -07:00
dimm.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 2019-06-05 17:36:38 +02:00
e820.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
Kconfig docs: nvdimm: add it to the driver-api book 2019-07-15 09:20:27 -03:00
label.c libnvdimm/label: Remove the dpa align check 2019-09-05 16:11:14 -07:00
label.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295 2019-06-05 17:36:38 +02:00
Makefile virtio-pmem: Add virtio pmem driver 2019-07-05 15:19:10 -07:00
namespace_devs.c libnvdimm: Use PAGE_SIZE instead of SZ_4K for align check 2019-09-05 16:11:14 -07:00
nd_virtio.c virtio_pmem: fix sparse warning 2019-07-16 19:44:26 -07:00
nd-core.h libnvdimm/region: Rewrite _probe_success() to _advance_seeds() 2019-09-05 16:11:14 -07:00
nd.h libnvdimm/dax: Pick the right alignment default when creating dax devices 2019-09-24 10:23:41 -07:00
of_pmem.c libnvdimm/of_pmem: Provide a unique name for bus provider 2019-08-13 20:31:57 -07:00
pfn_devs.c libnvdimm/dax: Pick the right alignment default when creating dax devices 2019-09-24 10:23:41 -07:00
pfn.h libnvdimm/pfn_dev: Add page size and struct page size to pfn superblock 2019-09-05 16:11:14 -07:00
pmem.c libnvdimm/pmem: Advance namespace seed for specific probe errors 2019-09-05 16:11:14 -07:00
pmem.h libnvdimm, pmem: Restore page attributes when clearing errors 2018-08-20 09:22:45 -07:00
region_devs.c libnvdimm: Use PAGE_SIZE instead of SZ_4K for align check 2019-09-05 16:11:14 -07:00
region.c driver-core, libnvdimm: Let device subsystems add local lockdep coverage 2019-07-18 16:23:27 -07:00
security.c libnvdimm/security: Consolidate 'security' operations 2019-08-29 13:51:57 -07:00
virtio_pmem.c virtio-pmem: Add virtio pmem driver 2019-07-05 15:19:10 -07:00
virtio_pmem.h virtio-pmem: Add virtio pmem driver 2019-07-05 15:19:10 -07:00