Commit Graph

519238 Commits

Author SHA1 Message Date
Michael Neuling
b12994fbfe cxl: cxl_afu_reset() -> __cxl_afu_reset()
Rename cxl_afu_reset() to __cxl_afu_reset() to we can reuse this function name
in the API.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:18 +10:00
Michael Neuling
eda3693c84 cxl: Rework detach context functions
Rework __detach_context() and cxl_context_detach() so we can reuse them in the
kernel API.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:18 +10:00
Michael Neuling
6428832a7b cxl: Add cookie parameter to afu_release_irqs()
Add cookie parameter to afu_release_irqs() so that we can pass in a different
cookie than the context structure.  This will be useful for other kernel
drivers that want to call this but get their own cookie back in the interrupt
handler.

Update all existing call sites.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:17 +10:00
Michael Neuling
bfcdc8fffe cxl: Dump debug info on the AFU configuration record
Now that we parse the AFU Configuration record, dump some info on it when in
debug mode.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:17 +10:00
Michael Neuling
7f436b534d cxl: Fix error path on probe
When probing we call pci_enable_device() but don't call pci_disable_device() on
fail. This causes refcounting issues in the PCI subsystem if a second driver
tries to bind to the same device.

This patch adds the pci_disable_device() to the probe error path. This error
path is hit when this cxl driver tries to bind to AFUs (on the vPHB) rather
than the physical device.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:17 +10:00
Ian Munsie
bee30c7045 cxl: Re-order card init to check the VSEC earlier
When we expose AFUs as virtual PCI devices, they may look like the physical
CAPI PCI card.  ie they may have the same vendor/device IDs.

We want to avoid these AFUs binding to this driver and any init this driver may
do.

Re-order card init to check the VSEC earlier before assigning BARs or
activating CXL.  Also change the dev used in early prints as the adapter struct
may not be inited at this earlier stage.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:17 +10:00
Michael Neuling
69c3a73c81 cxl: Remove unnecessarily verbose print in cxl_remove()
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:17 +10:00
Michael Neuling
aa70775e9a cxl: Add shutdown hook
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:16 +10:00
Michael Neuling
aee85fb6ba cxl: Document external user of existing API
Now that libcxl is public, let's document it.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:16 +10:00
Michael Neuling
abeeed6d3d powerpc/pci: Add pcibios_disable_device() hook
This adds a hook into the powerpc pci code for pci_disable_device() calls.  The
generic code already provides a weak pcibios_disable_device() symbol, so we
just need to provide our own in powerpc and it'll get picked up.

This is passed directly to the phb controller ops, provided one exists.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:16 +10:00
Michael Neuling
7a8e6bbf85 powerpc/pci: Add shutdown hook to pci_controller_ops
Currently pnv_pci_shutdown() calls the PHB shutdown code for all PHBs in the
system.  It dereferences the private_data assuming it's a powernv PHB, which
won't be the case when we have different PHB in the systems (like when we add
vPHBs for CXL).

This moves the shutdown hook to the pci_controller_ops and fixes the call site
to use that instead.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:16 +10:00
Michael Neuling
f46580a5cf powerpc: Add cxl context to device archdata
Add cxl context pointer to archdata.  We'll want to create one of these for cxl
PCI devices.  Put them here until we can get a pci_dev specific private data.

This location was suggested by benh.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:16 +10:00
Michael Neuling
10e796309a powerpc/pci: Add release_device() hook to phb ops
Add release_device() hook to phb ops so we can clean up for specific phbs.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:15 +10:00
Daniel Axtens
5b64d2cc41 powerpc/pci: Export symbols for CXL
Export pcibios_claim_one_bus, pcibios_scan_phb and pcibios_alloc_controller.

These will be used by the CXL driver.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:15 +10:00
Michael Neuling
85a97da958 powerpc/copro: Fix faulting kernel segments
This fixes calculating the key bits (KP and KS) in the SLB VSID for kernel
mappings.

I'm not CCing this to stable as there are no uses of this currently.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:15 +10:00
Ian Munsie
8ac75b96be cxl: Use call_rcu to reduce latency when releasing the afu fd
The afu fd release path was identified as a significant bottleneck in
the overall performance of cxl. While an optimal AFU design would
minimise the need to close & reopen the AFU fd, it is not always
practical to avoid.

The bottleneck seems to be down to the call to synchronize_rcu(), which
will block until every other thread is guaranteed to be out of an RCU
critical section. Replace it with call_rcu() to free the context
structures later so we can return to the application sooner.

This reduces the time spent in the fd release path from 13356 usec to
13.3 usec - about a 100x speed up.

Reported-by: Fei K Chen <uchen@cn.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:15 +10:00
Vaibhav Jain
e36f6fe1f7 cxl: Export AFU error buffer via sysfs
Export the "AFU Error Buffer" via sysfs attribute (afu_err_buf). AFU
error buffer is used by the AFU to report application specific
errors. The contents of this buffer are AFU specific and are intended to
be interpreted by the application interacting with the afu.

Suggested-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:15 +10:00
Vaibhav Jain
27d4dc7116 cxl: Implement an ioctl to fetch afu card-id, offset-id and mode
Given a file descriptor on an afu device, libcxl currently uses the
major/minor number obtained from fstat on the fd to construct path to
the afu's sysfs directory. However it is possible that rather than using
one of the device in /dev/cxl, a kernel driver creates its own device
which export generic cxl interface to the userspace. This causes
problems with libcxl as it tries to use a wrong major/minor number to
construct the sysfs path and fail.

So this patch introduces a new ioctl called CXL_IOCTL_GET_AFU_ID on the
afu file descriptor to fetch the cxl_afu_id struct that holds the
card/offset-id and mode information. These info is then used by libcxl to
construct the correct path to the afu sysfs directory.

Testing:
	- Build against pseries be/le configs
	- Testing with corresponding libcxl changes to verify that it constructs
	  right sysfs path to the afu.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:15 +10:00
Michael Ellerman
989898b707 selftests/powerpc: Add install support to more powerpc tests
These tests were merged in parallel to the install support, update them
now to use it.

This also adds cross compile support for the VPHN test which was missing
it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 16:54:49 +10:00
Cyril Bur
ea4d1a87e6 powerpc/configs: Replace pseries_le_defconfig with a Makefile target using merge_config
Rather than continuing to maintain a copy of pseries_defconfig with
CONFIG_CPU_LITTLE_ENDIAN enabled, use the generic merge_config script
and use an le.config to enable little endian on top of pseries_defconfig
without the need for a duplicated _defconfig file.

This method will require less maintenance in the future and will ensure
that both 'defconfigs' are always in sync.

It is worth noting that the seemingly more simple approach of:

  pseries_le_defconfig: pseries_defconfig
  	$(Q)$(MAKE) le.config

Will not work when building using O=builddir.

The obvious fix to that:

  pseries_le_defconfig:
  	$(Q)$(MAKE) -f $(srctree)/Makefile pseries_defconfig le.config

Also does not work. This is because if we have for example:

config FOO
	depends on CPU_BIG_ENDIAN
	select BAR

Then BAR will be enabled by the first call to kconfig (via
pseries_defconfig), and then will remain enabled after we merge
le.config, even though FOO will have been turned off.

The solution is to ensure to only invoke the kconfig logic once, after
we have merged all the config fragments. This ensures nothing is
select'ed on that should then be disabled by the later merged configs.
This is done through the explicit call to make olddefconfig

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Reviewed-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
[mpe: Massage change log, fix white space and use ARCH not SRCARCH]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 16:54:49 +10:00
Cyril Bur
a1c97df278 powerpc/configs: Merge pseries_defconfig and pseries_le_defconfig
These two configs should be identical with the exception of big or little
endian.

The big endian version has XMON_DEFAULT turned on while the little endian
has XMON_DEFAULT not set. It makes the most sense for defconfigs not to use
xmon by default, production systems should get back up as quickly as
possible, not sit in xmon.

In the event debugging is required, the option can be enabled or xmon=on
can be specified on commandline.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 16:54:48 +10:00
Jiang Liu
c1231a784a powerpc: Use irq_desc_get_xxx() to avoid redundant lookup of irq_desc
Use irq_desc_get_xxx() to avoid redundant lookup of irq_desc while we
already have a pointer to corresponding irq_desc.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 16:54:44 +10:00
Anton Blanchard
d20be433e6 powerpc: Non relocatable system call doesn't need a trampoline
We need to use a trampoline when using LOAD_HANDLER(), because the
destination needs to be in the first 64kB. An absolute branch has
no such limitations, so just jump there.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 13:26:47 +10:00
Anton Blanchard
05b05f28fb powerpc: Relocatable system call no longer uses the LR
We had some code to restore the LR in the relocatable system call path
back when we used the LR to do an indirect branch.

Commit 6a404806df ("powerpc: Avoid link stack corruption in MMU
on syscall entry path") changed this to use the CTR which is volatile
across system calls so does not need restoring.

Remove the stale comment and the restore of the LR.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 13:26:47 +10:00
Anton Blanchard
72e349f112 powerpc/perf: Fix book3s kernel to userspace backtraces
When we take a PMU exception or a software event we call
perf_read_regs(). This overloads regs->result with a boolean that
describes if we should use the sampled instruction address register
(SIAR) or the regs.

If the exception is in kernel, we start with the kernel regs and
backtrace through the kernel stack. At this point we switch to the
userspace regs and backtrace the user stack with perf_callchain_user().

Unfortunately these regs have not got the perf_read_regs() treatment,
so regs->result could be anything. If it is non zero,
perf_instruction_pointer() decides to use the SIAR, and we get issues
like this:

0.11%  qemu-system-ppc  [kernel.kallsyms]        [k] _raw_spin_lock_irqsave
       |
       ---_raw_spin_lock_irqsave
          |
          |--52.35%-- 0
          |          |
          |          |--46.39%-- __hrtimer_start_range_ns
          |          |          kvmppc_run_core
          |          |          kvmppc_vcpu_run_hv
          |          |          kvmppc_vcpu_run
          |          |          kvm_arch_vcpu_ioctl_run
          |          |          kvm_vcpu_ioctl
          |          |          do_vfs_ioctl
          |          |          sys_ioctl
          |          |          system_call
          |          |          |
          |          |          |--67.08%-- _raw_spin_lock_irqsave <--- hi mum
          |          |          |          |
          |          |          |           --100.00%-- 0x7e714
          |          |          |                     0x7e714

Notice the bogus _raw_spin_irqsave when we transition from kernel
(system_call) to userspace (0x7e714). We inserted what was in the SIAR.

Add a check in regs_use_siar() to check that the regs in question
are from a PMU exception. With this fix the backtrace makes sense:

     0.47%  qemu-system-ppc  [kernel.vmlinux]         [k] _raw_spin_lock_irqsave
            |
            ---_raw_spin_lock_irqsave
               |
               |--53.83%-- 0
               |          |
               |          |--44.73%-- hrtimer_try_to_cancel
               |          |          kvmppc_start_thread
               |          |          kvmppc_run_core
               |          |          kvmppc_vcpu_run_hv
               |          |          kvmppc_vcpu_run
               |          |          kvm_arch_vcpu_ioctl_run
               |          |          kvm_vcpu_ioctl
               |          |          do_vfs_ioctl
               |          |          sys_ioctl
               |          |          system_call
               |          |          __ioctl
               |          |          0x7e714
               |          |          0x7e714

Cc: stable@vger.kernel.org
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 13:26:38 +10:00
Michael Ellerman
09f3f326cd powerpc/mm: Fix build break with STRICT_MM_TYPECHECKS && DEBUG_PAGEALLOC
If both STRICT_MM_TYPECHECKS and DEBUG_PAGEALLOC are enabled, the code
in kernel_map_linear_page() is built, and so we fail with:

  arch/powerpc/mm/hash_utils_64.c:1478:2:
  error: incompatible type for argument 1 of 'htab_convert_pte_flags'

Fix it by using pgprot_val().

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 13:24:48 +10:00
Daniel Axtens
763d2d8df1 powerpc/powernv: Move dma_set_mask() from pnv_phb to pci_controller_ops
Previously, dma_set_mask() on powernv was convoluted:
 0) Call dma_set_mask() (a/p/kernel/dma.c)
 1) In dma_set_mask(), ppc_md.dma_set_mask() exists, so call it.
 2) On powernv, that function pointer is pnv_dma_set_mask().
    In pnv_dma_set_mask(), the device is pci, so call pnv_pci_dma_set_mask().
 3) In pnv_pci_dma_set_mask(), call pnv_phb->set_dma_mask() if it exists.
 4) It only exists in the ioda case, where it points to
    pnv_pci_ioda_dma_set_mask(), which is the final function.

So the call chain is:
 dma_set_mask() ->
  pnv_dma_set_mask() ->
   pnv_pci_dma_set_mask() ->
    pnv_pci_ioda_dma_set_mask()

Both ppc_md and pnv_phb function pointers are used.

Rip out the ppc_md call, pnv_dma_set_mask() and pnv_pci_dma_set_mask().

Instead:
 0) Call dma_set_mask() (a/p/kernel/dma.c)
 1) In dma_set_mask(), the device is pci, and pci_controller_ops.dma_set_mask()
    exists, so call pci_controller_ops.dma_set_mask()
 2) In the ioda case, that points to pnv_pci_ioda_dma_set_mask().

The new call chain is
 dma_set_mask() ->
  pnv_pci_ioda_dma_set_mask()

Now only the pci_controller_ops function pointer is used.

The fallback paths for p5ioc2 are the same.

Previously, pnv_pci_dma_set_mask() would find no pnv_phb->set_dma_mask()
function, to it would call __set_dma_mask().

Now, dma_set_mask() finds no ppc_md call or pci_controller_ops call,
so it calls __set_dma_mask().

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 13:18:49 +10:00
Daniel Axtens
3405c2570f powerpc/pci: add dma_set_mask to pci_controller_ops
Some systems only need to deal with DMA masks for PCI devices.
For these systems, we can avoid the need for a platform hook and
instead use a pci controller based hook.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 13:18:49 +10:00
Daniel Axtens
92ae035326 powerpc/powernv: Specialise pci_controller_ops for each controller type
Remove powernv generic PCI controller operations. Replace it with
controller ops for each of the two supported PHBs.

As an added bonus, make the two new structs const, which will help
guard against bugs such as the one introduced in 65ebf4b63
("powerpc/powernv: Move controller ops from ppc_md to controller_ops")

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 13:18:49 +10:00
Daniel Axtens
1f88d5860e powerpc: Remove MSI-related PCI controller ops from ppc_md
Remove unneeded ppc_md functions. Patch callsites to use pci_controller_ops
functions exclusively.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 11:47:45 +10:00
Daniel Axtens
14f95acda2 powerpc/mpic_u3msi: Move MSI-related ops to pci_controller_ops
Move the u3 MPIC msi subsystem to use the pci_controller_ops structure
rather than ppc_md for MSI related PCI controller operations.

As with fsl_msi, operations are plugged in at the subsys level, after
controller creation. Again, we iterate over all controllers and
populate them with the MSI ops.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 11:47:44 +10:00
Daniel Axtens
8392296697 powerpc/pasemi: Move MSI-related ops to pci_controller_ops
Move the PaSemi MPIC msi subsystem to use the pci_controller_ops
structure rather than ppc_md for MSI related PCI controller
operations.

As with fsl_msi, operations are plugged in at the subsys level, after
controller creation. Again, we iterate over all controllers and
populate them with the MSI ops.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 11:47:43 +10:00
Daniel Axtens
f2c800aace powerpc/ppc4xx_hsta_msi: Move MSI-related ops to pci_controller_ops
Move the ppc4xx hsta msi subsystem to use the pci_controller_ops
structure rather than ppc_md for MSI related PCI controller
operations.

As with fsl_msi, operations are plugged in at the subsys level, after
controller creation. Again, we iterate over all controllers and
populate them with the MSI ops.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 11:47:43 +10:00
Daniel Axtens
b7eba2ffcc powerpc/ppc4xx_msi: Move MSI-related ops to pci_controller_ops
Move the ppc4xx msi subsystem to use the pci_controller_ops structure
rather than ppc_md for MSI related PCI controller operations.

As with fsl_msi, operations are plugged in at the subsys level, after
controller creation. Again, we iterate over all controllers and
populate them with the MSI ops.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 11:47:42 +10:00
Daniel Axtens
00e2539703 powerpc/fsl_msi: Move MSI-related ops to pci_controller_ops
Move the fsl_msi subsystem to use the pci_controller_ops structure
rather than ppc_md for MSI related PCI controller operations.

Previously, MSI ops were added to ppc_md at the subsys level. However,
in fsl_pci.c, PCI controllers are created at the at arch level. So,
unlike in e.g. PowerNV/pSeries/Cell, we can't simply populate a
platform-level controller ops structure and have it copied into the
controllers when they are created.

Instead, walk every phb, and attempt to populate it with the MSI ops.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 11:47:42 +10:00
Daniel Axtens
1d14b8755f powerpc/pseries: Move MSI-related ops to pci_controller_ops
Move the pseries platform to use the pci_controller_ops structure
rather than ppc_md for MSI related PCI controller operations

We need to iterate all PHBs because the MSI setup happens later than
find_and_init_phbs() - mpe.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-02 11:47:10 +10:00
Daniel Axtens
7e3d6c5a4b powerpc/cell: Move MSI-related ops to pci_controller_ops
Move the Cell platform to use the pci_controller_ops structure rather
than ppc_md for MSI related PCI controller operations.

We can be confident that the functions will be added to the platform's
ops struct before any PCI controller's ops struct is populated
because:

1) These ops are added to the struct in a subsys initcall.

We populate the ops in axon_msi_probe, which is the probe call for the
axon-msi driver. However the driver is registered in axon_msi_init,
which is a subsys initcall, so this will happen at the subsys level.

2) The controller recieves the struct later, in a device initcall.

Cell populates the controller in cell_setup_phb, which is hooked up to
ppc_md.pci_setup_phb. ppc_md.pci_setup_phb is only ever called in
of_platform.c, as part of the OpenFirmware PCI driver's probe
routine. That driver is registered in a device initcall, so it will
occur *after* the struct is properly populated.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:50:55 +10:00
Daniel Axtens
d6381119a4 powerpc/powernv: Move MSI-related ops to pci_controller_ops
Move the PowerNV/BML platform to use the pci_controller_ops structure
rather than ppc_md for MSI related PCI controller operations.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:50:55 +10:00
Daniel Axtens
e059b105d1 powerpc: Add MSI operations to pci_controller_ops struct
Add MSI setup and teardown functions to pci_controller_ops.

Patch the callsites (arch_{setup,teardown}_msi_irqs) to prefer the
controller ops version if it's available.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:50:55 +10:00
Alistair Popple
81f2f7ce4c opal: Remove events notifier
All users of the old opal events notifier have been converted over to
the irq domain so remove the event notifier functions.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:14:38 +10:00
Alistair Popple
8034f715f0 powernv/opal-dump: Convert to irq domain
Convert the opal dump driver to the new opal irq domain.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:14:38 +10:00
Alistair Popple
74159a7028 powernv/elog: Convert elog to opal irq domain
This patch converts the elog code to use the opal irq domain instead
of notifier events.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:14:38 +10:00
Alistair Popple
a295af24d0 powernv/opal: Convert opal message events to opal irq domain
This patch converts the opal message event to use the new opal irq
domain.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:14:38 +10:00
Alistair Popple
79231448c9 powernv/eeh: Update the EEH code to use the opal irq domain
The eeh code currently uses the old notifier method to get eeh events
from OPAL. It also contains some logic to filter opal events which has
been moved into the virtual irqchip. This patch converts the eeh code
to the new event interface which simplifies event handling.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:14:38 +10:00
Alistair Popple
2def86a720 hvc: Convert to using interrupts instead of opal events
Convert the opal hvc driver to use the new irqchip to register for
opal events. As older firmware versions may not have device tree
bindings for the interrupt parent we just use a hardcoded hwirq based
on the event number.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:14:38 +10:00
Alistair Popple
dce143c338 ipmi/powernv: Convert to irq event interface
Convert the opal ipmi driver to use the new irq interface for events.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Acked-by: Corey Minyard <cminyard@mvista.com>
Cc: Corey Minyard <minyard@acm.org>
Cc: openipmi-developer@lists.sourceforge.net
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:14:37 +10:00
Alistair Popple
9f0fd0499d powerpc/powernv: Add a virtual irqchip for opal events
Whenever an interrupt is received for opal the linux kernel gets a
bitfield indicating certain events that have occurred and need handling
by the various device drivers. Currently this is handled using a
notifier interface where we call every device driver that has
registered to receive opal events.

This approach has several drawbacks. For example each driver has to do
its own checking to see if the event is relevant as well as event
masking. There is also no easy method of recording the number of times
we receive particular events.

This patch solves these issues by exposing opal events via the
standard interrupt APIs by adding a new interrupt chip and
domain. Drivers can then register for the appropriate events using
standard kernel calls such as irq_of_parse_and_map().

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:14:37 +10:00
Alistair Popple
96e023e753 powerpc/powernv: Reorder OPAL subsystem initialisation
Most of the OPAL subsystems are always compiled in for PowerNV and
many of them need to be initialised before or after other OPAL
subsystems. Rather than trying to control this ordering through
machine initcalls it is clearer and easier to control initialisation
order with explicit calls in opal_init.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Cc: Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:14:37 +10:00
Shreyas B. Prabhu
5703d2f4a1 powerpc/powernv: Introduce sysfs control for fastsleep workaround behavior
Fastsleep is one of the idle state which cpuidle subsystem currently
uses on power8 machines. In this state L2 cache is brought down to a
threshold voltage. Therefore when the core is in fastsleep, the
communication between L2 and L3 needs to be fenced. But there is a bug
in the current power8 chips surrounding this fencing.

OPAL provides a workaround which precludes the possibility of hitting
this bug. But running with this workaround applied causes checkstop
if any correctable error in L2 cache directory is detected. Hence OPAL
also provides a way to undo the workaround.

In the existing implementation, workaround is applied by the last thread
of the core entering fastsleep and undone by the first thread waking up.
But this has a performance cost. These OPAL calls account for roughly
4000 cycles everytime the core has to enter or wakeup from fastsleep.

This patch introduces a sysfs attribute (fastsleep_workaround_applyonce)
to choose the behavior of this workaround.

By default, fastsleep_workaround_applyonce = 0. In this case, workaround
is applied/undone everytime the core enters/exits fastsleep.

fastsleep_workaround_applyonce = 1. In this case the workaround is
applied once on all the cores and never undone. This can be triggered by
echo 1 > /sys/devices/system/cpu/fastsleep_workaround_applyonce

For simplicity this attribute can be modified only once. Implying, once
fastsleep_workaround_applyonce is changed to 1, it cannot be reverted
to the default state.

Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:12:30 +10:00
Shreyas B. Prabhu
d405a98c70 powerpc/powernv: Move cpuidle related code from setup.c to new file
This is a cleanup patch; doesn't change any functionality. Moves
all cpuidle related code from setup.c to a new file.

Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
[mpe: Fix the SMP=n build by including asm/smp.h in idle.c]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-05-22 15:12:30 +10:00