Commit Graph

9678 Commits

Author SHA1 Message Date
Anton Blanchard
a980349725 powerpc: Call dma_debug_add_bus for PCI and VIO buses
The DMA API debug code has hooks to verify all DMA entries have been
freed at time of hot unplug. We need to call dma_debug_add_bus for
this to work.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:57 +10:00
Anton Blanchard
44b372d8a0 powerpc/vio: Separate vio bus probe and device probe
Similar to PCI, separate the bus probe from device probe. This allows
us to attach bus notifiers for DMA debug and IOMMU fault injection
before devices have been probed.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:54 +10:00
Anton Blanchard
62761d1f68 powerpc/vio: Remove dma not supported warnings
During boot we see a number of these warnings:

vio 30000000: Warning: IOMMU dma not supported: mask 0xffffffffffffffff, table unavailable

The reason for this is that we set IOMMU properties for all VIO
devices even if they are not DMA capable.

Only set DMA ops, table and mask for devices with a DMA window.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:51 +10:00
Michael Neuling
9c41ef086e powerpc/pseries: Fix whitespace in eeh
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:49 +10:00
Anton Blanchard
6da7094810 powerpc/perf: Use perf_instruction_pointer in callchains
We use SIAR or regs->nip for the instruction pointer depending on
the PMU configuration, but we always use regs->nip in the callchain.

Use perf_instruction_pointer so the backtrace is consistent.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:46 +10:00
Anton Blanchard
5c093efa6f powerpc/perf: Always use pt_regs for userspace samples
At the moment we always use the SIAR if the PMU supports continuous
sampling. Unfortunately the SIAR and the PMU exception are not
synchronised for non marked events so we can end up with callchains
that dont make sense.

The following patch checks the HV and PR bits for samples coming from
userspace and always uses pt_regs for them. Userspace will never have
interrupts off so there is no real advantage to using the SIAR for
non marked events in userspace.

I had experimented with a patch that did a similar thing for kernel
samples but we lost a significant amount of information. I was
unable to profile any of our early exception code for example.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:43 +10:00
Anton Blanchard
75382aa72f powerpc/perf: Move code to select SIAR or pt_regs into perf_read_regs
The logic to choose whether to use the SIAR or get the information
out of pt_regs is going to get more complicated, so do it once in
perf_read_regs.

We overload regs->result which is gross but we are already doing it
with regs->dsisr.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:41 +10:00
Anton Blanchard
68b30bb9f0 powerpc/perf: Create mmcra_sihv/mmcra_sipv helpers
We want to access the MMCRA_SIHV and MMCRA_SIPR bits elsewhere so
create mmcra_sihv and mmcra_sipr which hide the differences between
the old and new layout of the bits.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:38 +10:00
Michael Neuling
962cffbd8a powerpc: Enforce usage of RA 0-R31 where possible
Some macros use RA where when RA=R0 the values is 0, so make this
the enforced mnemonic in the macro.

Idea suggested by Andreas Schwab.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:35 +10:00
Michael Neuling
f4c015795c powerpc: Add defines for RA 0-R31
R0 is special since it'll be 0.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:33 +10:00
Michael Neuling
0b7673c35e powerpc: Enforce usage of R0-R31 where possible
Enforce the use of R0-R31 in macros where possible now we have all the
fixes in.

R0-R31 macros are removed here so that can't be used anymore.  They
should not be defined anywhere.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:30 +10:00
Michael Neuling
0972def44f powerpc: Introduce new __REG_R macros
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:27 +10:00
Michael Neuling
cdaade7129 powerpc: Start using ___PPC_RA/B/S/T where necessary
Now have ___PPC_RA/B/S/T we can use it in some places.  These are
places where we can't use the existing defines which will soon enforce
R0-R31 usage.

The macros being changed here are being used in inline asm, which
can't convert to enforce the R0-R31 usage.

bpf_jit uses a mix of both generated and non-generated with the same
code, so just convert all these to use the ___PPC_R versions which
won't enforce R usage later.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:25 +10:00
Michael Neuling
55a5db1846 powerpc: Introduce new ___PPC_RA/B/S/T macros
These are currently the same as __PPC_RA/B/S/T but we'll wrap them
soon.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:22 +10:00
Michael Neuling
178f2ae092 powerpc: Fix VSX macros so register names aren't wrapped
We need to do this so we can enforce the name of a and b in called
macros PPC_RA/B later.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:19 +10:00
Michael Neuling
e55174e911 powerpc: Fixes for instructions not using correct register naming
These macros are using integers where they could be using logical
names since they take registers.

We are going to enforce this soon, so fix these up now.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:16 +10:00
Michael Neuling
03a22bfcfd powerpc: Change LOAD_REG_ADDR to use real register names
LOAD_REG_ADDR define is just a wrapper around real instructions so we
can just use real register names here (ie. lower case).

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:14 +10:00
Michael Neuling
86e32fdce7 powerpc: Change mtcrf to use real register names
mtocrf define is just a wrapper around the real instructions so we can
just use real register names here (ie. lower case).

Also remove braces in macro so this is possible.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:11 +10:00
Benjamin Herrenschmidt
b38c77d82e powerpc: Move and fix MTMSR_EERI definition
Move this duplicated definition to ppc_asm.h and remove the
braces which prevent the use of %rN register names

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:08 +10:00
Michael Neuling
d72be892c8 powerpc: Merge VCPU_GPR
Merge the defines of VCPU_GPR from different places.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:06 +10:00
Michael Neuling
44ce6a5ee7 powerpc: Merge STK_REG/PARAM/FRAMESIZE
Merge the defines of STACKFRAMESIZE, STK_REG, STK_PARAM from different
places.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:03 +10:00
Michael Neuling
4404a9f98f powerpc/pasemi: Move lbz/stbciz to ppc-opcode.h
move lbz/stbciz to ppc-opcode.h.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:18:00 +10:00
Michael Neuling
9a13a524ba powerpc: Convert to %r for all GPR usage
Now all the fixes are in place, let's rock-n-roll!

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:17:58 +10:00
Michael Neuling
c75df6f96c powerpc: Fix usage of register macros getting ready for %r0 change
Anything that uses a constructed instruction (ie. from ppc-opcode.h),
need to use the new R0 macro, as %r0 is not going to work.

Also convert usages of macros where we are just determining an offset
(usually for a load/store), like:
	std	r14,STK_REG(r14)(r1)
Can't use STK_REG(r14) as %r14 doesn't work in the STK_REG macro since
it's just calculating an offset.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:17:55 +10:00
Michael Neuling
564aa5cfd3 powerpc: Modify macro ready for %r0 register change
The assembler doesn't take %r0 register arguments in braces, so remove them.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:17:52 +10:00
Michael Neuling
82fff310f1 powerpc: Add defines for R0-R31
We are going to use these later and convert r0 to %r0 etc.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:17:50 +10:00
Benjamin Herrenschmidt
50bba07d6a Merge branch 'merge' into next
We want to bring in the latest IRQ fixes
2012-07-10 19:16:43 +10:00
Benjamin Herrenschmidt
aa709f3bc9 powerpc/numa: Avoid stupid uninitialized warning from gcc
Newer gcc are being a bit blind here (it's pretty obvious we don't
reach the code path using the array if we haven't initialized the
pointer) but none of that is performance critical so let's just
silence it.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-10 19:16:23 +10:00
Benjamin Herrenschmidt
21b2de3412 powerpc: Fix build of some debug irq code
There was a typo, checking for CONFIG_TRACE_IRQFLAG instead of
CONFIG_TRACE_IRQFLAGS causing some useful debug code to not be
built

This in turns causes a build error on BookE 64-bit due to incorrect
semicolons at the end of a couple of macros, so let's fix that too

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: stable@vger.kernel.org [v3.4]
2012-07-10 19:16:20 +10:00
Benjamin Herrenschmidt
be2cf20a5a powerpc: More fixes for lazy IRQ vs. idle
Looks like we still have issues with pSeries and Cell idle code
vs. the lazy irq state. In fact, the reset fixes that went upstream
are exposing the problem more by causing BUG_ON() to trigger (which
this patch turns into a WARN_ON instead).

We need to be careful when using a variant of low power state that
has the side effect of turning interrupts back on, to properly set
all the SW & lazy state to look as if everything is enabled before
we enter the low power state with MSR:EE off as we will return with
MSR:EE on. If not, we have a discrepancy of state which can cause
things to go very wrong later on.

This patch moves the logic into a helper and uses it from the
pseries and cell idle code. The power4/970 idle code already got
things right (in assembly even !) so I'm not touching it. The power7
"bare metal" idle code is subtly different and correct. Remains PA6T
and some hypervisor based Cell platforms which have questionable
code in there, but they are mostly dead platforms so I'll fix them
when I manage to get final answers from the respective maintainers
about how the low power state actually works on them.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: stable@vger.kernel.org [v3.4]
2012-07-10 19:16:07 +10:00
Scott Wood
a8b91e43af powerpc/mm: remove obsolete comment about page size name array
The array of names in hugetlbpage.c no longer exists.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:50 +10:00
Paul Bolle
09d5472a08 powerpc: Kill flatdevtree_env.h too
Commit 430b01e8f5 ("[POWERPC] Kill
flatdevtree.c") killed the two files including flatdevtree_env.h. It was
apparently just an oversight to not kill that header too. Kill it now.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:50 +10:00
Wanpeng Li
f0a875fd3f powerpc: Fix kernel-doc warning
Warning(arch/powerpc/kernel/pci_of_scan.c:210): Excess function parameter 'node' description in 'of_scan_pci_bridge'
Warning(arch/powerpc/kernel/vio.c:636): No description found for parameter 'desired'
Warning(arch/powerpc/kernel/vio.c:636): Excess function parameter 'new_desired' description in 'vio_cmo_set_dev_desired'
Warning(arch/powerpc/kernel/vio.c:1270): No description found for parameter 'viodrv'
Warning(arch/powerpc/kernel/vio.c:1270): Excess function parameter 'drv' description in '__vio_register_driver'
Warning(arch/powerpc/kernel/vio.c:1289): No description found for parameter 'viodrv'
Warning(arch/powerpc/kernel/vio.c:1289): Excess function parameter 'driver' description in 'vio_unregister_driver'

Signed-off-by: Wanpeng Li <liwp@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:49 +10:00
Bharat Bhushan
a5cb82da78 powerpc: Fix assmption of end_of_DRAM() returns end address
memblock_end_of_DRAM() returns end_address + 1, not end address.
While some code assumes that it returns end address.

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:49 +10:00
Anton Blanchard
cf8fb5533f powerpc: Optimise the 64bit optimised __clear_user
I blame Mikey for this. He elevated my slightly dubious testcase:

to benchmark status. And naturally we need to be number 1 at creating
zeros. So lets improve __clear_user some more.

As Paul suggests we can use dcbz for large lengths. This patch gets
the destination cacheline aligned then uses dcbz on whole cachelines.

Before:
10485760000 bytes (10 GB) copied, 0.414744 s, 25.3 GB/s

After:
10485760000 bytes (10 GB) copied, 0.268597 s, 39.0 GB/s

39 GB/s, a new record.

Signed-off-by: Anton Blanchard <anton@samba.org>
Tested-by: Olof Johansson <olof@lixom.net>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:48 +10:00
Anton Blanchard
b4c3a8729a powerpc/iommu: Implement IOMMU pools to improve multiqueue adapter performance
At the moment all queues in a multiqueue adapter will serialise
against the IOMMU table lock. This is proving to be a big issue,
especially with 10Gbit ethernet.

This patch creates 4 pools and tries to spread the load across
them. If the table is under 1GB in size we revert back to the
original behaviour of 1 pool and 1 largealloc pool.

We create a hash to map CPUs to pools. Since we prefer interrupts to
be affinitised to primary CPUs, without some form of hashing we are
very likely to end up using the same pool. As an example, POWER7
has 4 way SMT and with 4 pools all primary threads will map to the
same pool.

The largealloc pool is reduced from 1/2 to 1/4 of the space to
partially offset the overhead of breaking the table up into pools.

Some performance numbers were obtained with a Chelsio T3 adapter on
two POWER7 boxes, running a 100 session TCP round robin test.

Performance improved 69% with this patch applied.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:48 +10:00
Anton Blanchard
d362213722 powerpc/iommu: Push spinlock into iommu_range_alloc and __iommu_free
In preparation for IOMMU pools, push the spinlock into
iommu_range_alloc and __iommu_free.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:47 +10:00
Anton Blanchard
67ca141567 powerpc/iommu: Reduce spinlock coverage in iommu_free
This patch moves tce_free outside of the lock in iommu_free.

Some performance numbers were obtained with a Chelsio T3 adapter on
two POWER7 boxes, running a 100 session TCP round robin test.

Performance improved 25% with this patch applied.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:47 +10:00
Anton Blanchard
0e4bc95d87 powerpc/iommu: Reduce spinlock coverage in iommu_alloc and iommu_free
We currently hold the IOMMU spinlock around tce_build and tce_flush.
This causes our spinlock hold times to be much higher than required
and can impact multiqueue adapters.

This patch moves tce_build and tce_flush outside of the lock in
iommu_alloc, and tce_flush outside of the lock in iommu_free.

Some performance numbers were obtained with a Chelsio T3 adapter on
two POWER7 boxes, running a 100 session TCP round robin test.

Performance improved 32% with this patch applied.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:47 +10:00
Anton Blanchard
c1703e85a7 powerpc/pseries: Disable interrupts around IOMMU percpu data accesses
tce_buildmulti_pSeriesLP uses a per cpu page to communicate with the
hypervisor. We currently rely on the IOMMU table spinlock but
subsequent patches will be removing that so disable interrupts
around all accesses of tce_page.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:46 +10:00
Anton Blanchard
b3f271e86e powerpc: POWER7 optimised memcpy using VMX and enhanced prefetch
Implement a POWER7 optimised memcpy using VMX and enhanced prefetch
instructions.

This is a copy of the POWER7 optimised copy_to_user/copy_from_user
loop. Detailed implementation and performance details can be found in
commit a66086b819 (powerpc: POWER7 optimised
copy_to_user/copy_from_user using VMX).

I noticed memcpy issues when profiling a RAID6 workload:

	.memcpy
	.async_memcpy
	.async_copy_data
	.__raid_run_ops
	.handle_stripe
	.raid5d
	.md_thread

I created a simplified testcase by building a RAID6 array with 4 1GB
ramdisks (booting with brd.rd_size=1048576):

# mdadm -CR -e 1.2 /dev/md0 --level=6 -n4 /dev/ram[0-3]

I then timed how long it took to write to the entire array:

# dd if=/dev/zero of=/dev/md0 bs=1M

Before: 892 MB/s
After:  999 MB/s

A 12% improvement.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:46 +10:00
Anton Blanchard
bce4b4bd91 powerpc: Use enhanced touch instructions in POWER7 copy_to_user/copy_from_user
Version 2.06 of the POWER ISA introduced enhanced touch instructions,
allowing us to specify a number of attributes including the length of
a stream.

This patch adds a software stream for both loads and stores in the
POWER7 copy_tofrom_user loop. Since the setup is quite complicated
and we have to use an eieio to ensure correct ordering of the "GO"
command we only do this for copies above 4kB.

To quantify any performance improvements we need a working set
bigger than the caches so we operate on a 1GB file:

# dd if=/dev/zero of=/tmp/foo bs=1M count=1024

And we compare how fast we can read the file:

# dd if=/tmp/foo of=/dev/null bs=1M

before: 7.7 GB/s
after:  9.6 GB/s

A 25% improvement.

The worst case for this patch will be a completely L1 cache contained
copy of just over 4kB. We can test this with the copy_to_user
testcase we used to tune copy_tofrom_user originally:

http://ozlabs.org/~anton/junkcode/copy_to_user.c

# time ./copy_to_user2 -l 4224 -i 10000000

before: 6.807 s
after:  6.946 s

A 2% slowdown, which seems reasonable considering our data is unlikely
to be completely L1 contained.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:45 +10:00
Gavin Shan
8127e723da powerpc/pci: cleanup on duplicate assignment
While creating the PCI root bus through function pci_create_root_bus()
of PCI core, it should have assigned the secondary bus number for the
newly created PCI root bus. Thus we needn't do the explicit assignment
for the secondary bus number again in pcibios_scan_phb().

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:45 +10:00
Gavin Shan
5b958a7eef powerpc/numa: Fix OF node refcounting bug
The form affinity for NUMA is set to 1 if the firmware supports
OPAL. Otherwise, we have to retrieve that from OF node "/chosen".
For the latter case, OF node "/chosen" reference count was never
decreased.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:44 +10:00
Anton Blanchard
fde69282b7 powerpc: POWER7 optimised copy_page using VMX and enhanced prefetch
Implement a POWER7 optimised copy_page using VMX and enhanced
prefetch instructions. We use enhanced prefetch hints to prefetch
both the load and store side. We copy a cacheline at a time and
fall back to regular loads and stores if we are unable to use VMX
(eg we are in an interrupt).

The following microbenchmark was used to assess the impact of
the patch:

http://ozlabs.org/~anton/junkcode/page_fault_file.c

We test MAP_PRIVATE page faults across a 1GB file, 100 times:

# time ./page_fault_file -p -l 1G -i 100

Before: 22.25s
After:  18.89s

17% faster

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:44 +10:00
Anton Blanchard
6f7839e542 powerpc: Rename copyuser_power7_vmx.c to vmx-helper.c
Subsequent patches will add more VMX library functions and it makes
sense to keep all the c-code helper functions in the one file.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:43 +10:00
Anton Blanchard
ac1dc36558 powerpc: Clear RI and EE at the same time in system call exit
mtmsrd is an expensive instruction, we save a few cycles by
doing it once instead of twice.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:43 +10:00
Anton Blanchard
a9514dc69d powerpc: Use enhanced touch instructions in POWER7 copy_to_user/copy_from_user
Version 2.06 of the POWER ISA introduced enhanced touch instructions,
allowing us to specify a number of attributes including the length of
a stream.

This patch adds a software stream for both loads and stores in the
POWER7 copy_tofrom_user loop. Since the setup is quite complicated
and we have to use an eieio to ensure correct ordering of the "GO"
command we only do this for copies above 4kB.

To quantify any performance improvements we need a working set
bigger than the caches so we operate on a 1GB file:

# dd if=/dev/zero of=/tmp/foo bs=1M count=1024

And we compare how fast we can read the file:

# dd if=/tmp/foo of=/dev/null bs=1M

before: 7.7 GB/s
after:  9.6 GB/s

A 25% improvement.

The worst case for this patch will be a completely L1 cache contained
copy of just over 4kB. We can test this with the copy_to_user
testcase we used to tune copy_tofrom_user originally:

http://ozlabs.org/~anton/junkcode/copy_to_user.c

# time ./copy_to_user2 -l 4224 -i 10000000

before: 6.807 s
after:  6.946 s

A 2% slowdown, which seems reasonable considering our data is unlikely
to be completely L1 contained.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:42 +10:00
Yong Zhang
e250d4bca6 powerpc/smp: remove call to ipi_call_lock()/ipi_call_unlock()
1) call_function.lock used in smp_call_function_many() is just to protect
   call_function.queue and &data->refs, cpu_online_mask is outside of the
   lock. And it's not necessary to protect cpu_online_mask,
   because data->cpumask is pre-calculate and even if a cpu is brougt up
   when calling arch_send_call_function_ipi_mask(), it's harmless because
   validation test in generic_smp_call_function_interrupt() will take care
   of it.

2) For cpu down issue, stop_machine() will guarantee that no concurrent
   smp_call_fuction() is processing.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:42 +10:00
Anton Blanchard
17968fbbd1 powerpc: 64bit optimised __clear_user
I noticed __clear_user high up in a profile of one of my RAID stress
tests. The testcase was doing a dd from /dev/zero which ends up
calling __clear_user.

__clear_user is basically a loop with a single 4 byte store which
is horribly slow. We can do much better by aligning the desination
and doing 32 bytes of 8 byte stores in a loop.

The following testcase was used to verify the patch:

http://ozlabs.org/~anton/junkcode/stress_clear_user.c

To show the improvement in performance I ran a dd from /dev/zero
to /dev/null on a POWER7 box:

Before:

# dd if=/dev/zero of=/dev/null bs=1M count=10000
10485760000 bytes (10 GB) copied, 3.72379 s, 2.8 GB/s

After:

# time dd if=/dev/zero of=/dev/null bs=1M count=10000
10485760000 bytes (10 GB) copied, 0.728318 s, 14.4 GB/s

Over 5x faster.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:41 +10:00