From 5564597d51c8ff5b88d95c76255e18b13b760879 Mon Sep 17 00:00:00 2001 From: Paul Mackerras Date: Tue, 27 Nov 2018 09:01:54 +1100 Subject: [PATCH 01/14] powerpc: Fix COFF zImage booting on old powermacs Commit 6975a783d7b4 ("powerpc/boot: Allow building the zImage wrapper as a relocatable ET_DYN", 2011-04-12) changed the procedure descriptor at the start of crt0.S to have a hard-coded start address of 0x500000 rather than a reference to _zimage_start, presumably because having a reference to a symbol introduced a relocation which is awkward to handle in a position-independent executable. Unfortunately, what is at 0x500000 in the COFF image is not the first instruction, but the procedure descriptor itself, that is, a word containing 0x500000, which is not a valid instruction. Hence, booting a COFF zImage results in a "DEFAULT CATCH!, code=FFF00700" message from Open Firmware. This fixes the problem by (a) putting the procedure descriptor in the data section and (b) adding a branch to _zimage_start as the first instruction in the program. Fixes: 6975a783d7b4 ("powerpc/boot: Allow building the zImage wrapper as a relocatable ET_DYN") Signed-off-by: Paul Mackerras Signed-off-by: Michael Ellerman --- arch/powerpc/boot/crt0.S | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/boot/crt0.S b/arch/powerpc/boot/crt0.S index 32dfe6d083f3..9b9d17437373 100644 --- a/arch/powerpc/boot/crt0.S +++ b/arch/powerpc/boot/crt0.S @@ -15,7 +15,7 @@ RELA = 7 RELACOUNT = 0x6ffffff9 - .text + .data /* A procedure descriptor used when booting this as a COFF file. * When making COFF, this comes first in the link and we're * linked at 0x500000. @@ -23,6 +23,8 @@ RELACOUNT = 0x6ffffff9 .globl _zimage_start_opd _zimage_start_opd: .long 0x500000, 0, 0, 0 + .text + b _zimage_start #ifdef __powerpc64__ .balign 8 From 462951cd32e1496dc64b00051dfb777efc8ae5d8 Mon Sep 17 00:00:00 2001 From: Michael Ellerman Date: Mon, 26 Nov 2018 12:59:16 +1100 Subject: [PATCH 02/14] powerpc/mm: Fix linux page tables build with some configs For some configs the build fails with: arch/powerpc/mm/dump_linuxpagetables.c: In function 'populate_markers': arch/powerpc/mm/dump_linuxpagetables.c:306:39: error: 'PKMAP_BASE' undeclared (first use in this function) arch/powerpc/mm/dump_linuxpagetables.c:314:50: error: 'LAST_PKMAP' undeclared (first use in this function) These come from highmem.h, including that fixes the build. Signed-off-by: Michael Ellerman --- arch/powerpc/mm/dump_linuxpagetables.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/mm/dump_linuxpagetables.c b/arch/powerpc/mm/dump_linuxpagetables.c index 2b74f8adf4d0..6aa41669ac1a 100644 --- a/arch/powerpc/mm/dump_linuxpagetables.c +++ b/arch/powerpc/mm/dump_linuxpagetables.c @@ -19,6 +19,7 @@ #include #include #include +#include #include #include #include From 78e7b15e17ac175e7eed9e21c6f92d03d3b0a6fa Mon Sep 17 00:00:00 2001 From: Radu Rendec Date: Tue, 27 Nov 2018 22:20:48 -0500 Subject: [PATCH 03/14] powerpc/msi: Fix NULL pointer access in teardown code The arch_teardown_msi_irqs() function assumes that controller ops pointers were already checked in arch_setup_msi_irqs(), but this assumption is wrong: arch_teardown_msi_irqs() can be called even when arch_setup_msi_irqs() returns an error (-ENOSYS). This can happen in the following scenario: - msi_capability_init() calls pci_msi_setup_msi_irqs() - pci_msi_setup_msi_irqs() returns -ENOSYS - msi_capability_init() notices the error and calls free_msi_irqs() - free_msi_irqs() calls pci_msi_teardown_msi_irqs() This is easier to see when CONFIG_PCI_MSI_IRQ_DOMAIN is not set and pci_msi_setup_msi_irqs() and pci_msi_teardown_msi_irqs() are just aliases to arch_setup_msi_irqs() and arch_teardown_msi_irqs(). The call to free_msi_irqs() upon pci_msi_setup_msi_irqs() failure seems legit, as it does additional cleanup; e.g. list_del(&entry->list) and kfree(entry) inside free_msi_irqs() do happen (MSI descriptors are allocated before pci_msi_setup_msi_irqs() is called and need to be cleaned up if that fails). Fixes: 6b2fd7efeb88 ("PCI/MSI/PPC: Remove arch_msi_check_device()") Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Radu Rendec Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/msi.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/msi.c b/arch/powerpc/kernel/msi.c index dab616a33b8d..f2197654be07 100644 --- a/arch/powerpc/kernel/msi.c +++ b/arch/powerpc/kernel/msi.c @@ -34,5 +34,10 @@ void arch_teardown_msi_irqs(struct pci_dev *dev) { struct pci_controller *phb = pci_bus_to_host(dev->bus); - phb->controller_ops.teardown_msi_irqs(dev); + /* + * We can be called even when arch_setup_msi_irqs() returns -ENOSYS, + * so check the pointer again. + */ + if (phb->controller_ops.teardown_msi_irqs) + phb->controller_ops.teardown_msi_irqs(dev); } From bf3d6afbb234156749b640b6c50f714967a85964 Mon Sep 17 00:00:00 2001 From: Benjamin Herrenschmidt Date: Fri, 30 Nov 2018 14:54:09 +1100 Subject: [PATCH 04/14] powerpc: Look for "stdout-path" when setting up legacy consoles Commit 78e5dfea84dc ("powerpc: dts: replace 'linux,stdout-path' with 'stdout-path'") broke the default console on a number of embedded PowerPC systems, because it failed to also update the code in arch/powerpc/kernel/legacy_serial.c to look for that property in addition to the old one. This fixes it. Fixes: 78e5dfea84dc ("powerpc: dts: replace 'linux,stdout-path' with 'stdout-path'") Cc: stable@vger.kernel.org # v4.17+ Signed-off-by: Benjamin Herrenschmidt Reviewed-by: Rob Herring Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/legacy_serial.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/legacy_serial.c b/arch/powerpc/kernel/legacy_serial.c index 33b34a58fc62..5b9dce17f0c9 100644 --- a/arch/powerpc/kernel/legacy_serial.c +++ b/arch/powerpc/kernel/legacy_serial.c @@ -372,6 +372,8 @@ void __init find_legacy_serial_ports(void) /* Now find out if one of these is out firmware console */ path = of_get_property(of_chosen, "linux,stdout-path", NULL); + if (path == NULL) + path = of_get_property(of_chosen, "stdout-path", NULL); if (path != NULL) { stdout = of_find_node_by_path(path); if (stdout) @@ -595,8 +597,10 @@ static int __init check_legacy_serial_console(void) /* We are getting a weird phandle from OF ... */ /* ... So use the full path instead */ name = of_get_property(of_chosen, "linux,stdout-path", NULL); + if (name == NULL) + name = of_get_property(of_chosen, "stdout-path", NULL); if (name == NULL) { - DBG(" no linux,stdout-path !\n"); + DBG(" no stdout-path !\n"); return -ENODEV; } prom_stdout = of_find_node_by_path(name); From e41b93a6be57e26a4a123345f826a6ac3a213551 Mon Sep 17 00:00:00 2001 From: Michael Ellerman Date: Mon, 3 Dec 2018 19:55:55 +1100 Subject: [PATCH 05/14] powerpc/boot: Fix build failures with -j 1 In commit 5e9dcb6188a4 ("powerpc/boot: Expose Kconfig symbols to wrapper") we added a dependency to serial.c on autoconf.h: $(obj)/serial.c: $(obj)/autoconf.h This works when building in-tree (ie. with KBUILD_OUTPUT unset) because the obj tree is the src tree. But when building with eg. O=build and -j 1 the build fails: gcc ... -I../arch/powerpc/boot -c -o arch/powerpc/boot/serial.o arch/powerpc/boot/serial.c gcc: error: arch/powerpc/boot/serial.c: No such file or directory Why this is only happening with -j 1 is not clear, when building with -j greater than 1 somehow we decide to look for serial.c in the src tree (../), eg: gcc -I../arch/powerpc/boot -c -o arch/powerpc/boot/serial.o ../arch/powerpc/boot/serial.c Regardless we shouldn't be specifying a dependency on serial.c in the build tree, we want to add a dependency to the version in $(srctree) so fix the rule to say that. Fixes: 5e9dcb6188a4 ("powerpc/boot: Expose Kconfig symbols to wrapper") Tested-by: Daniel Axtens Signed-off-by: Michael Ellerman --- arch/powerpc/boot/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile index 39354365f54a..ed9883169190 100644 --- a/arch/powerpc/boot/Makefile +++ b/arch/powerpc/boot/Makefile @@ -197,7 +197,7 @@ $(obj)/empty.c: $(obj)/zImage.coff.lds $(obj)/zImage.ps3.lds : $(obj)/%: $(srctree)/$(src)/%.S $(Q)cp $< $@ -$(obj)/serial.c: $(obj)/autoconf.h +$(srctree)/$(src)/serial.c: $(obj)/autoconf.h $(obj)/autoconf.h: $(obj)/%: $(objtree)/include/generated/% $(Q)cp $< $@ From a6460b03f945ee216dbf42a0d9ee78d52fd470c2 Mon Sep 17 00:00:00 2001 From: Sandipan Das Date: Thu, 6 Dec 2018 14:57:01 +0530 Subject: [PATCH 06/14] powerpc/bpf: Fix broken uapi for BPF_PROG_TYPE_PERF_EVENT Now that there are different variants of pt_regs for userspace and kernel, the uapi for the BPF_PROG_TYPE_PERF_EVENT program type must be changed by exporting the user_pt_regs structure instead of the pt_regs structure that is in-kernel only. Fixes: 002af9391bfb ("powerpc: Split user/kernel definitions of struct pt_regs") Signed-off-by: Sandipan Das Signed-off-by: Michael Ellerman --- arch/powerpc/include/asm/perf_event.h | 2 ++ arch/powerpc/include/uapi/asm/Kbuild | 1 - arch/powerpc/include/uapi/asm/bpf_perf_event.h | 9 +++++++++ 3 files changed, 11 insertions(+), 1 deletion(-) create mode 100644 arch/powerpc/include/uapi/asm/bpf_perf_event.h diff --git a/arch/powerpc/include/asm/perf_event.h b/arch/powerpc/include/asm/perf_event.h index 8bf1b6351716..16a49819da9a 100644 --- a/arch/powerpc/include/asm/perf_event.h +++ b/arch/powerpc/include/asm/perf_event.h @@ -26,6 +26,8 @@ #include #include +#define perf_arch_bpf_user_pt_regs(regs) ®s->user_regs + /* * Overload regs->result to specify whether we should use the MSR (result * is zero) or the SIAR (result is non zero). diff --git a/arch/powerpc/include/uapi/asm/Kbuild b/arch/powerpc/include/uapi/asm/Kbuild index a658091a19f9..3712152206f3 100644 --- a/arch/powerpc/include/uapi/asm/Kbuild +++ b/arch/powerpc/include/uapi/asm/Kbuild @@ -1,7 +1,6 @@ # UAPI Header export list include include/uapi/asm-generic/Kbuild.asm -generic-y += bpf_perf_event.h generic-y += param.h generic-y += poll.h generic-y += resource.h diff --git a/arch/powerpc/include/uapi/asm/bpf_perf_event.h b/arch/powerpc/include/uapi/asm/bpf_perf_event.h new file mode 100644 index 000000000000..b551b741653d --- /dev/null +++ b/arch/powerpc/include/uapi/asm/bpf_perf_event.h @@ -0,0 +1,9 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _UAPI__ASM_BPF_PERF_EVENT_H__ +#define _UAPI__ASM_BPF_PERF_EVENT_H__ + +#include + +typedef struct user_pt_regs bpf_user_pt_regs_t; + +#endif /* _UAPI__ASM_BPF_PERF_EVENT_H__ */ From 14ebfec0712f66a4ef037fb7ac0df6a600584356 Mon Sep 17 00:00:00 2001 From: Oliver O'Halloran Date: Fri, 7 Dec 2018 02:17:08 +1100 Subject: [PATCH 07/14] powerpc/papr_scm: Use depend instead of select Making PAPR_SCM select LIBNVDIMM results in circular dependencies in Kconfig when another symbol depends on it. Fix this by replacing the select with a depends. Fixes: b5beae5e224f ("powerpc/pseries: Add driver for PAPR SCM regions") Reported-by: Alastair D'Silva Signed-off-by: Oliver O'Halloran Signed-off-by: Michael Ellerman --- arch/powerpc/platforms/pseries/Kconfig | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig index 2e4bd32154b5..472b784f01eb 100644 --- a/arch/powerpc/platforms/pseries/Kconfig +++ b/arch/powerpc/platforms/pseries/Kconfig @@ -140,8 +140,7 @@ config IBMEBUS Bus device driver for GX bus based adapters. config PAPR_SCM - depends on PPC_PSERIES && MEMORY_HOTPLUG - select LIBNVDIMM + depends on PPC_PSERIES && MEMORY_HOTPLUG && LIBNVDIMM tristate "Support for the PAPR Storage Class Memory interface" help Enable access to hypervisor provided storage class memory. From 59613526117b0595cb7b04835390ecd5175f9cd4 Mon Sep 17 00:00:00 2001 From: Oliver O'Halloran Date: Fri, 7 Dec 2018 02:17:09 +1100 Subject: [PATCH 08/14] powerpc/papr_scm: Fix resource end address Fix an off-by-one error in the memory resource range. This resource is used to determine the address range of the memory to be hot-plugged as ZONE_DEVICE memory. The current end address results in the kernel attempting to map an additional memblock and the hypervisor may reject the mapping resulting in the entire hot-plug failing. Fixes: b5beae5e224f ("powerpc/pseries: Add driver for PAPR SCM regions") Signed-off-by: Oliver O'Halloran Signed-off-by: Michael Ellerman --- arch/powerpc/platforms/pseries/papr_scm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c index ee9372b65ca5..390badd33547 100644 --- a/arch/powerpc/platforms/pseries/papr_scm.c +++ b/arch/powerpc/platforms/pseries/papr_scm.c @@ -296,7 +296,7 @@ static int papr_scm_probe(struct platform_device *pdev) /* setup the resource for the newly bound range */ p->res.start = p->bound_addr; - p->res.end = p->bound_addr + p->blocks * p->block_size; + p->res.end = p->bound_addr + p->blocks * p->block_size - 1; p->res.name = pdev->name; p->res.flags = IORESOURCE_MEM; From 683ec0e04ab7e2d86d2656c71322dfb2ebf063fc Mon Sep 17 00:00:00 2001 From: Oliver O'Halloran Date: Fri, 7 Dec 2018 02:17:10 +1100 Subject: [PATCH 09/14] powerpc/papr_scm: Update DT properties The ibm,unit-sizes property was originally specified as an array of two u32s corresponding to the memory block size, and the number of blocks available in that region. A fairly last-minute change to the SCM DT specification was splitting that into two seperate u64 properties: ibm,block-sizes and ibm,number-of-blocks that convey the same information. No firmware / hypervisor that emitted the ibm,unit-size property ever appeared in the wild. Fixes: b5beae5e224f ("powerpc/pseries: Add driver for PAPR SCM regions") Signed-off-by: Oliver O'Halloran [mpe: Use kernel types (u32/u64)] Signed-off-by: Michael Ellerman --- arch/powerpc/platforms/pseries/papr_scm.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c index 390badd33547..de47e3719c8d 100644 --- a/arch/powerpc/platforms/pseries/papr_scm.c +++ b/arch/powerpc/platforms/pseries/papr_scm.c @@ -257,8 +257,9 @@ err: nvdimm_bus_unregister(p->bus); static int papr_scm_probe(struct platform_device *pdev) { - uint32_t drc_index, metadata_size, unit_cap[2]; struct device_node *dn = pdev->dev.of_node; + u32 drc_index, metadata_size; + u64 blocks, block_size; struct papr_scm_priv *p; int rc; @@ -268,8 +269,13 @@ static int papr_scm_probe(struct platform_device *pdev) return -ENODEV; } - if (of_property_read_u32_array(dn, "ibm,unit-capacity", unit_cap, 2)) { - dev_err(&pdev->dev, "%pOF: missing unit-capacity!\n", dn); + if (of_property_read_u64(dn, "ibm,block-size", &block_size)) { + dev_err(&pdev->dev, "%pOF: missing block-size!\n", dn); + return -ENODEV; + } + + if (of_property_read_u64(dn, "ibm,number-of-blocks", &blocks)) { + dev_err(&pdev->dev, "%pOF: missing number-of-blocks!\n", dn); return -ENODEV; } @@ -282,8 +288,8 @@ static int papr_scm_probe(struct platform_device *pdev) p->dn = dn; p->drc_index = drc_index; - p->block_size = unit_cap[0]; - p->blocks = unit_cap[1]; + p->block_size = block_size; + p->blocks = blocks; /* might be zero */ p->metadata_size = metadata_size; From 409dd7dc83eb54c4bc156aea890cc95bc21dc6f0 Mon Sep 17 00:00:00 2001 From: Oliver O'Halloran Date: Fri, 7 Dec 2018 02:17:11 +1100 Subject: [PATCH 10/14] powerpc/papr_scm: Remove endian conversions The return values of a h-call are returned in the CPU registers and written to the provided buffer by the plpar_hcall() wrapper. As a result the values written to memory are always in the native endian and should not be byte swapped. The inital implementation of the H-Call interface was done in qemu and the returned values were byte swapped unnecessarily in both the hypervisor and in the driver so this was only noticed when bringing up the PowerVM implementation. Fixes: b5beae5e224f ("powerpc/pseries: Add driver for PAPR SCM regions") Signed-off-by: Oliver O'Halloran Signed-off-by: Michael Ellerman --- arch/powerpc/platforms/pseries/papr_scm.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c index de47e3719c8d..8b5153c55da6 100644 --- a/arch/powerpc/platforms/pseries/papr_scm.c +++ b/arch/powerpc/platforms/pseries/papr_scm.c @@ -55,7 +55,7 @@ static int drc_pmem_bind(struct papr_scm_priv *p) do { rc = plpar_hcall(H_SCM_BIND_MEM, ret, p->drc_index, 0, p->blocks, BIND_ANY_ADDR, token); - token = be64_to_cpu(ret[0]); + token = ret[0]; cond_resched(); } while (rc == H_BUSY); @@ -64,7 +64,7 @@ static int drc_pmem_bind(struct papr_scm_priv *p) return -ENXIO; } - p->bound_addr = be64_to_cpu(ret[1]); + p->bound_addr = ret[1]; dev_dbg(&p->pdev->dev, "bound drc %x to %pR\n", p->drc_index, &p->res); @@ -82,7 +82,7 @@ static int drc_pmem_unbind(struct papr_scm_priv *p) do { rc = plpar_hcall(H_SCM_UNBIND_MEM, ret, p->drc_index, p->bound_addr, p->blocks, token); - token = be64_to_cpu(ret); + token = ret[0]; cond_resched(); } while (rc == H_BUSY); From b0d65a8cbcb097d2110885c3660add97b0125867 Mon Sep 17 00:00:00 2001 From: Oliver O'Halloran Date: Fri, 7 Dec 2018 02:17:12 +1100 Subject: [PATCH 11/14] powerpc/papr_scm: Fix DIMM device registration race When a new nvdimm device is registered with libnvdimm via nvdimm_create() it is added as a device on the nvdimm bus. The probe function for the DIMM driver is potentially quite slow so actually registering and probing the device is done in an async domain rather than immediately after device creation. This can result in a race where the region device (created 2nd) is probed first and fails to activate at boot. To fix this we use the same approach as the ACPI/NFIT driver which is to check that all the DIMM devices registered successfully. LibNVDIMM provides the nvdimm_bus_count_dimms() function which synchronises with the async domain and verifies that the dimm was successfully registered with the bus. If either of these does not occur then we bail. Fixes: b5beae5e224f ("powerpc/pseries: Add driver for PAPR SCM regions") Signed-off-by: Oliver O'Halloran Signed-off-by: Michael Ellerman --- arch/powerpc/platforms/pseries/papr_scm.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c index 8b5153c55da6..7dc6a751cb4c 100644 --- a/arch/powerpc/platforms/pseries/papr_scm.c +++ b/arch/powerpc/platforms/pseries/papr_scm.c @@ -223,6 +223,9 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p) goto err; } + if (nvdimm_bus_check_dimm_count(p->bus, 1)) + goto err; + /* now add the region */ memset(&mapping, 0, sizeof(mapping)); From 43001c52b603cac041783cc392094ea560bd9444 Mon Sep 17 00:00:00 2001 From: Oliver O'Halloran Date: Fri, 7 Dec 2018 02:17:13 +1100 Subject: [PATCH 12/14] powerpc/papr_scm: Use ibm,unit-guid as the iset cookie The interleave set cookie is used to determine if a label stored in the metadata space should be applied to the current region. This is important in the case of NVDIMMs since the firmware may change the interleaving configuration of a DIMM which would invalidate the existing labels. In our case the hypervisor hides those details from us so we don't really care, but libnvdimm still requires the interleave set cookie to be non-zero. For our purposes we just need the set cookie to be unique and fixed for a given PAPR SCM region and using the unit-guid (really a UUID) is fine for this purpose. Fixes: b5beae5e224f ("powerpc/pseries: Add driver for PAPR SCM regions") Signed-off-by: Oliver O'Halloran [mpe: Use kernel types (u64)] Signed-off-by: Michael Ellerman --- arch/powerpc/platforms/pseries/papr_scm.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c index 7dc6a751cb4c..7d6457ab5d34 100644 --- a/arch/powerpc/platforms/pseries/papr_scm.c +++ b/arch/powerpc/platforms/pseries/papr_scm.c @@ -264,6 +264,8 @@ static int papr_scm_probe(struct platform_device *pdev) u32 drc_index, metadata_size; u64 blocks, block_size; struct papr_scm_priv *p; + const char *uuid_str; + u64 uuid[2]; int rc; /* check we have all the required DT properties */ @@ -282,6 +284,11 @@ static int papr_scm_probe(struct platform_device *pdev) return -ENODEV; } + if (of_property_read_string(dn, "ibm,unit-guid", &uuid_str)) { + dev_err(&pdev->dev, "%pOF: missing unit-guid!\n", dn); + return -ENODEV; + } + p = kzalloc(sizeof(*p), GFP_KERNEL); if (!p) return -ENOMEM; @@ -294,6 +301,11 @@ static int papr_scm_probe(struct platform_device *pdev) p->block_size = block_size; p->blocks = blocks; + /* We just need to ensure that set cookies are unique across */ + uuid_parse(uuid_str, (uuid_t *) uuid); + p->nd_set.cookie1 = uuid[0]; + p->nd_set.cookie2 = uuid[1]; + /* might be zero */ p->metadata_size = metadata_size; p->pdev = pdev; From 9ef34630a4614ee1cd478f9859ebea55d55f10ec Mon Sep 17 00:00:00 2001 From: Oliver O'Halloran Date: Fri, 7 Dec 2018 02:17:14 +1100 Subject: [PATCH 13/14] powerpc/mm: Fallback to RAM if the altmap is unusable The "altmap" is used to provide a pool of memory that is reserved for the vmemmap backing of hot-plugged memory. This is useful when adding large amount of ZONE_DEVICE memory to a system with a limited amount of normal memory. On ppc64 we use huge pages to map the vmemmap which requires the backing storage to be contigious and aligned to the hugepage size. The altmap implementation allows for the altmap provider to reserve a few PFNs at the start of the range for it's own uses and when this occurs the first chunk of the altmap is not usable for hugepage mappings. On hash there is no sane way to fall back to a normal sized page mapping so we fail the allocation. This results in memory hotplug failing with ENOMEM when the new range doesn't fall into an existing vmemmap block. This patch handles this case by falling back to using system memory rather than failing if we cannot allocate from the altmap. This fallback should only ever be used for the first vmemmap block so it should not cause excess memory consumption. Fixes: 7b73d978a5d0 ("mm: pass the vmem_altmap to vmemmap_populate") Signed-off-by: Oliver O'Halloran Signed-off-by: Michael Ellerman --- arch/powerpc/mm/init_64.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c index 7a9886f98b0c..a5091c034747 100644 --- a/arch/powerpc/mm/init_64.c +++ b/arch/powerpc/mm/init_64.c @@ -188,15 +188,20 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node, pr_debug("vmemmap_populate %lx..%lx, node %d\n", start, end, node); for (; start < end; start += page_size) { - void *p; + void *p = NULL; int rc; if (vmemmap_populated(start, page_size)) continue; + /* + * Allocate from the altmap first if we have one. This may + * fail due to alignment issues when using 16MB hugepages, so + * fall back to system memory if the altmap allocation fail. + */ if (altmap) p = altmap_alloc_block_buf(page_size, altmap); - else + if (!p) p = vmemmap_alloc_block_buf(page_size, node); if (!p) return -ENOMEM; @@ -255,8 +260,15 @@ void __ref vmemmap_free(unsigned long start, unsigned long end, { unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift; unsigned long page_order = get_order(page_size); + unsigned long alt_start = ~0, alt_end = ~0; + unsigned long base_pfn; start = _ALIGN_DOWN(start, page_size); + if (altmap) { + alt_start = altmap->base_pfn; + alt_end = altmap->base_pfn + altmap->reserve + + altmap->free + altmap->alloc + altmap->align; + } pr_debug("vmemmap_free %lx...%lx\n", start, end); @@ -280,8 +292,9 @@ void __ref vmemmap_free(unsigned long start, unsigned long end, page = pfn_to_page(addr >> PAGE_SHIFT); section_base = pfn_to_page(vmemmap_section_start(start)); nr_pages = 1 << page_order; + base_pfn = PHYS_PFN(addr); - if (altmap) { + if (base_pfn >= alt_start && base_pfn < alt_end) { vmem_altmap_free(altmap, nr_pages); } else if (PageReserved(page)) { /* allocated from bootmem */ From a225f1567405558fb5410e9b2b90805819df1c67 Mon Sep 17 00:00:00 2001 From: Elvira Khabirova Date: Fri, 7 Dec 2018 18:56:05 +0300 Subject: [PATCH 14/14] powerpc/ptrace: replace ptrace_report_syscall() with a tracehook call Arch code should use tracehook_*() helpers, as documented in include/linux/tracehook.h, ptrace_report_syscall() is not expected to be used outside that file. The patch does not look very nice, but at least it is correct and opens the way for PTRACE_GET_SYSCALL_INFO API. Co-authored-by: Dmitry V. Levin Fixes: 5521eb4bca2d ("powerpc/ptrace: Add support for PTRACE_SYSEMU") Signed-off-by: Elvira Khabirova Signed-off-by: Dmitry V. Levin [mpe: Take this as a minimal fix for 4.20, we'll rework it later] Signed-off-by: Michael Ellerman --- arch/powerpc/kernel/ptrace.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c index afb819f4ca68..714c3480c52d 100644 --- a/arch/powerpc/kernel/ptrace.c +++ b/arch/powerpc/kernel/ptrace.c @@ -3266,12 +3266,17 @@ long do_syscall_trace_enter(struct pt_regs *regs) user_exit(); if (test_thread_flag(TIF_SYSCALL_EMU)) { - ptrace_report_syscall(regs); /* + * A nonzero return code from tracehook_report_syscall_entry() + * tells us to prevent the syscall execution, but we are not + * going to execute it anyway. + * * Returning -1 will skip the syscall execution. We want to * avoid clobbering any register also, thus, not 'gotoing' * skip label. */ + if (tracehook_report_syscall_entry(regs)) + ; return -1; }