linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-08 21:21:47 +00:00

Author	SHA1	Message	Date
David S. Miller	9f825962ef	sparc64: Niagara-4 bzero/memset, plus use MRU stores in page copy. This adds optimized memset/bzero/page-clear routines for Niagara-4. We basically can do what powerpc has been able to do for a decade (via the "dcbz" instruction), which is use cache line clearing stores for bzero and memsets with a 'c' argument of zero. As long as we make the cache initializing store to each 32-byte subblock of the L2 cache line, it works. As with other Niagara-4 optimized routines, the key is to make sure to avoid any usage of the %asi register, as reads and writes to it cost at least 50 cycles. For the user clear cases, we don't use these new routines, we use the Niagara-1 variants instead. Those have to use %asi in an unavoidable way. A Niagara-4 8K page clear costs just under 600 cycles. Add definitions of the MRU variants of the cache initializing store ASIs. By default, cache initializing stores install the line as Least Recently Used. If we know we're going to use the data immediately (which is true for page copies and clears) we can use the Most Recently Used variant, to decrease the likelihood of the lines being evicted before they get used. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-05 13:45:26 -07:00
David S. Miller	954f9ac43b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux There's a Niagara 2 memcpy fix in this tree and I have a Kconfig fix from Dave Jones which requires the sparc-next changes which went upstream yesterday. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-02 23:02:10 -04:00
David S. Miller	42a4172b6e	sparc64: Fix trailing whitespace in NG4 memcpy. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-28 13:08:22 -07:00
David S. Miller	9019205732	sparc64: Fix comment type in NG4 copy from user. Noticed by Greg Onufer. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 14:26:41 -07:00
David S. Miller	1b62ca7bf5	sparc64: Fix return value of Niagara-2 memcpy. It gets clobbered by the kernel's VISEntryHalf, so we have to save it in a different register than the set clobbered by that macro. The instance in glibc is OK and doesn't have this problem. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 01:06:43 -07:00
David S. Miller	ae2c6ca641	sparc64: Add SPARC-T4 optimized memcpy. bw_tcp: 1288.53 MB/sec before, 1637.77 MB/sec after. bw_pipe: 1517.18 MB/sec before, 2107.61 MB/sec after. bw_unix: 1838.38 MB/sec before, 2640.91 MB/sec after. make -s -j128 allmodconfig: 5min 49sec before, 5min 31sec after. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-27 00:35:11 -07:00
David S. Miller	4ff28d4ca9	sparc64: Add SHA1 driver making use of the 'sha1' instruction. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Herbert Xu <herbert@gondor.apana.org.au>	2012-08-20 15:08:49 -07:00
David S. Miller	6f1d827f29	sparc64: Consistently use fsrc2 rather than fmovd in optimized asm. Because fsrc2, unlike fmovd, does not update the %fsr register. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-27 01:25:23 -07:00
David Miller	2c66f62363	sparc: use the new generic strnlen_user() function This throws away the sparc-specific functions in favor of the generic optimized version. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-26 11:33:54 -07:00
David S. Miller	2922585b93	lib: Sparc's strncpy_from_user is generic enough, move under lib/ To use this, an architecture simply needs to: 1) Provide a user_addr_max() implementation via asm/uaccess.h 2) Add "select GENERIC_STRNCPY_FROM_USER" to their arch Kconfig 3) Remove the existing strncpy_from_user() implementation and symbol exports their architecture had. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: David Howells <dhowells@redhat.com>	2012-05-24 13:12:28 -07:00
David S. Miller	446969084d	kernel: Move REPEAT_BYTE definition into linux/kernel.h And make sure that everything using it explicitly includes that header file. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-24 13:10:05 -07:00
David S. Miller	35c9646062	sparc: Increase portability of strncpy_from_user() implementation. Hide details of maximum user address calculation in a new asm/uaccess.h interface named user_addr_max(). Provide little-endian implementation in find_zero(), which should work but can probably be improved. Abstract alignment check behind IS_UNALIGNED() macro. Kill double-semicolon, noticed by David Howells. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-24 13:04:24 -07:00
David S. Miller	4efcac3a24	sparc: Optimize strncpy_from_user() zero byte search. Compute a mask that will only have 0x80 in the bytes which had a zero in them. The formula is: ~(((x & 0x7f7f7f7f) + 0x7f7f7f7f) | x | 0x7f7f7f7f) In the inner word iteration, we have to compute the "x | 0x7f7f7f7f" part, so we can reuse that in the above calculation. Once we have this mask, we perform divide and conquer to find the highest 0x80 location. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-23 19:20:20 -07:00
David S. Miller	ff06dffbc8	sparc: Add full proper error handling to strncpy_from_user(). Linus removed the end-of-address-space hackery from fs/namei.c:do_getname() so we really have to validate these edge conditions and cannot cheat any more (as x86 used to as well). Move to a common C implementation like x86 did. And if both src and dst are sufficiently aligned we'll do word at a time copies and checks as well. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-22 23:32:27 -07:00
David S. Miller	74c7b28953	sparc32: Add ucmpdi2.o to obj-y instead of lib-y. Otherwise if no references exist in the static kernel image, we won't export the symbol properly to modules. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-19 15:27:01 -07:00
Sam Ravnborg	de36e66d5f	sparc32: add ucmpdi2 Based on copy from microblaze add ucmpdi2 implementation. This fixes build of niu driver which failed with: drivers/built-in.o: In function `niu_get_nfc': niu.c:(.text+0x91494): undefined reference to `__ucmpdi2' This driver will never be used on a sparc32 system, but patch added to fix build breakage with all*config builds. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-19 15:23:57 -07:00
David S. Miller	1b35a57b1c	sparc32: Kill off software 32-bit multiply/divide routines. For the explicit calls to .udiv/.umul in assembler, I made a mechanical (read as: safe) transformation. I didn't attempt to make any simplifications. In particular, __ndelay and __udelay can be simplified significantly. Some of the %y reads are unnecessary and these routines have no need any longer for allocating a register window, they can be leaf functions. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-15 11:23:47 -07:00
David S. Miller	73c1377da9	sparc32: Kill btfixup for xchg()'s 'swap' instruction. We always have this instruction available, so no need to use btfixup for it any more. This also eradicates the whole of atomic_32.S and thus the __atomic_begin and __atomic_end symbols completely. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-13 13:07:16 -07:00
David S. Miller	8695c37d06	sparc: Convert some assembler over to linkage.h's ENTRY/ENDPROC Use those, instead of doing it all by hand. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-11 20:33:22 -07:00
David S. Miller	b55e81b9f8	sparc32: Remove inline strncmp "optimization" for constant counts. Let the compiler do stuff like this. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-11 19:53:29 -07:00
Sam Ravnborg	593fc6ea47	sparc32: drop sun4c specific ___xchg32 implementation Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-11 19:27:47 -07:00
David Miller	c6df4b17c8	lib: Fix multiple definitions of clz_tab Both sparc 32-bit's software divide assembler and MPILIB provide clz_tab[] with identical contents. Break it out into a separate object file and select it when SPARC32 or MPILIB is set. Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: James Morris <jmorris@namei.org>	2012-02-02 10:34:23 +11:00
Linus Torvalds	e343a895a9	lib: use generic pci_iomap on all architectures Many architectures don't want to pull in iomap.c, so they ended up duplicating pci_iomap from that file. That function isn't trivial, and we are going to modify it https://lkml.org/lkml/2011/11/14/183 so the duplication hurts. This reduces the scope of the problem significantly, by moving pci_iomap to a separate file and referencing that from all architectures. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: alpha: drop pci_iomap/pci_iounmap from pci-noop.c mn10300: switch to GENERIC_PCI_IOMAP mn10300: add missing __iomap markers frv: switch to GENERIC_PCI_IOMAP tile: switch to GENERIC_PCI_IOMAP tile: don't panic on iomap sparc: switch to GENERIC_PCI_IOMAP sh: switch to GENERIC_PCI_IOMAP powerpc: switch to GENERIC_PCI_IOMAP parisc: switch to GENERIC_PCI_IOMAP mips: switch to GENERIC_PCI_IOMAP microblaze: switch to GENERIC_PCI_IOMAP arm: switch to GENERIC_PCI_IOMAP alpha: switch to GENERIC_PCI_IOMAP lib: add GENERIC_PCI_IOMAP lib: move GENERIC_IOMAP to lib/Kconfig Fix up trivial conflicts due to changes nearby in arch/{m68k,score}/Kconfig	2012-01-10 18:04:27 -08:00
Sam Ravnborg	348738afe5	sparc32: drop unused atomic24 support atomic24 support was used for semaphores in the past - but is no longer used. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-27 14:11:40 -05:00
Michael S. Tsirkin	a21a2fd403	sparc: switch to GENERIC_PCI_IOMAP sparc copied pci_iomap from generic code, probably to avoid pulling the rest of iomap.c in. Since that's in a separate file now, we can reuse the common implementation. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2011-12-04 15:59:49 +02:00
David S. Miller	a52312b88c	sparc32: Correct the return value of memcpy. Properly return the original destination buffer pointer. Signed-off-by: David S. Miller <davem@davemloft.net> Tested-by: Kjetil Oftedal <oftedal@gmail.com>	2011-10-20 15:17:23 -07:00
David S. Miller	21f74d361d	sparc32: Remove uses of %g7 in memcpy implementation. This is setting things up so that we can correct the return value, so that it properly returns the original destination buffer pointer. Signed-off-by: David S. Miller <davem@davemloft.net> Tested-by: Kjetil Oftedal <oftedal@gmail.com>	2011-10-20 15:17:22 -07:00
David S. Miller	045b7de9ca	sparc32: Remove non-kernel code from memcpy implementation. Signed-off-by: David S. Miller <davem@davemloft.net> Tested-by: Kjetil Oftedal <oftedal@gmail.com>	2011-10-20 15:17:22 -07:00
Josip Rodin	a61b582954	sparc: Fix __atomic_add_unless() return value. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-04 02:47:40 -07:00
David S. Miller	56d205cc5c	sparc: Use popc when possible for ffs/__ffs/ffz. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-02 21:28:53 -07:00
David S. Miller	ef7c4d4675	sparc: Use popc if possible for hweight routines. Just like powerpc, we code patch at boot time. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-02 21:28:50 -07:00
David S. Miller	e95ade0839	sparc: Minor tweaks to Niagara page copy/clear. Don't use floating point on Niagara2, use the traditional plain Niagara code instead. Unroll Niagara loops to 128 bytes for copy, and 256 bytes for clear. Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-02 21:28:32 -07:00
Stephen Rothwell	678624e401	sparc: rename atomic_add_unless Should have been done in commit 1af08a1407f4 ("This is in preparation for more generic atomic"). Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Arun Sharma <asharma@fb.com> Cc: David Miller <davem@davemloft.net> Cc: "Hans-Christian Egtvedt" <hans-christian.egtvedt@atmel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-27 12:53:36 -07:00
Arun Sharma	60063497a9	atomic: use <linux/atomic.h> This allows us to move duplicated code in <asm/atomic.h> (atomic_inc_not_zero() for now) to <linux/atomic.h> Signed-off-by: Arun Sharma <asharma@fb.com> Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-26 16:49:47 -07:00
David S. Miller	9fafbd8061	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6	2011-05-20 12:59:54 -07:00
Tkhai Kirill	b1054282d7	sparc32: Fixed unaligned memory copying in function __csum_partial_copy_sparc_generic When we are in the label cc_dword_align, registers %o0 and %o1 have the same last 2 bits, but it's not guaranteed one of them is zero. So we can get unaligned memory access in label ccte. Example of parameters which lead to this: %o0=0x7ff183e9, %o1=0x8e709e7d, %g1=3 With the parameters I had a memory corruption, when the additional 5 bytes were rewritten. This patch corrects the error. One comment to the patch. We don't care about the third bit in %o1, because cc_end_cruft stores word or less. Signed-off-by: Tkhai Kirill <tkhai@yandex.ru> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-11 21:35:04 -07:00
Daniel Hellstrom	1827237065	sparc32: removed unused code, implemented by generic code Signed-off-by: Daniel Hellstrom <daniel@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-04-21 16:44:44 -07:00
Linus Torvalds	0586bed3e8	Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rtmutex: tester: Remove the remaining BKL leftovers lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause rtmutex: Simplify PI algorithm and make highest prio task get lock rwsem: Remove redundant asmregparm annotation rwsem: Move duplicate function prototypes to linux/rwsem.h rwsem: Unify the duplicate rwsem_is_locked() inlines rwsem: Move duplicate init macros and functions to linux/rwsem.h rwsem: Move duplicate struct rwsem declaration to linux/rwsem.h x86: Cleanup rwsem_count_t typedef rwsem: Cleanup includes locking: Remove deprecated lock initializers cred: Replace deprecated spinlock initialization kthread: Replace deprecated spinlock initialization xtensa: Replace deprecated spinlock initialization um: Replace deprecated spinlock initialization sparc: Replace deprecated spinlock initialization mips: Replace deprecated spinlock initialization cris: Replace deprecated spinlock initialization alpha: Replace deprecated spinlock initialization rtmutex-tester: Remove BKL tests	2011-03-15 18:28:30 -07:00
Akinobu Mita	e637804c33	sparc: use bitmap_set() Use bitmap_set() instead of calling __set_bit() each bit. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2011-02-08 22:52:53 -08:00
Thomas Gleixner	24774fbdea	sparc: Replace deprecated spinlock initialization SPIN_LOCK_UNLOCK is deprecated. Use the lockdep capable variant instead. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David S. Miller <davem@davemloft.net>	2011-01-27 12:30:37 +01:00
David S. Miller	0f58189d4a	sparc64: Make lock backoff really a NOP on UP builds. As noticed by Mikulas Patocka, the backoff macros don't completely nop out for UP builds, we still get a branch always and a delay slot nop. Fix this by making the branch to the backoff spin loop selective, then we can nop out the spin loop completely. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-18 22:53:26 -07:00
Mikulas Patocka	6ec274750c	sparc64: simple microoptimizations for atomic functions Simple microoptimizations for sparc64 atomic functions: Save one instruction by using a delay slot. Use %g1 instead of %g7, because %g1 is written earlier. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-18 22:51:08 -07:00
David S. Miller	9b3bb86aca	sparc64: Make rwsems 64-bit. Basically tip-off the powerpc code, use a 64-bit type and atomic64_t interfaces for the implementation. This gets us off of the by-hand asm code I wrote, which frankly I think probably ruins I-cache hit rates. The idea was to keep the call chains less deep, but anything taking the rw-semaphores probably is also calling other stuff and therefore already has allocated a stack-frame. So no real stack frame savings ever. Ben H. has posted patches to make powerpc use 64-bit too and with some abstractions we can probably use a shared header file somewhere. With suggestions from Sam Ravnborg. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-08-17 22:49:26 -07:00
David S. Miller	035df35d96	sparc64: Allocate sufficient stack space in ftrace stubs. 128 bytes is sufficient for the register window save area, but the calling conventions allow the callee to save up to 6 incoming argument registers into the stack frame after the register window save area. This means a minimal stack frame is 176 bytes (128 + (6 * 8)). This fixes random crashes when using the function tracer. Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-13 18:59:02 -07:00
David S. Miller	9960e9e894	sparc64: Add function graph tracer support. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-12 22:37:26 -07:00
David S. Miller	a71d1d6bb1	sparc64: Give a stack frame to the ftrace call sites. It's the only way we'll be able to implement the function graph tracer properly. A positive is that we no longer have to worry about the linker over-optimizing the tail call, since we don't use a tail call any more. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-12 22:37:15 -07:00
David S. Miller	ddacd0bc70	sparc64: Kill CONFIG_STACK_DEBUG code. The generic stack tracer does this job just as well. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-12 22:36:03 -07:00
David S. Miller	63b7549573	sparc64: Add HAVE_FUNCTION_TRACE_MCOUNT_TEST and tidy up. Check function_trace_stop at ftrace_caller Toss mcount_call and dummy call of ftrace_stub, unnecessary. Document problems we'll have if the final kernel image link ever turns on relaxation. Properly size 'ftrace_call' so it looks right when inspecting instructions under gdb et al. Signed-off-by: David S. Miller <davem@davemloft.net>	2010-04-12 22:35:24 -07:00
David S. Miller	4d14a45985	sparc: Stop trying to be so fancy and use __builtin_{memcpy,memset}() This mirrors commit `ff60fab71b` (x86: Use __builtin_memset and __builtin_memcpy for memset/memcpy) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-10 23:32:10 -08:00
David S. Miller	fb34035e7b	sparc: Use __builtin_object_size() to validate the buffer size for copy_from_user() This mirrors x86 commit `9f0cf4adb6` (x86: Use __builtin_object_size() to validate the buffer size for copy_from_user()) Signed-off-by: David S. Miller <davem@davemloft.net>	2009-12-10 23:05:23 -08:00
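
The zero-byte mask formula quoted in commit 4efcac3a24 above can be tried out in isolation. The following is a minimal sketch in portable C, not the kernel's actual code: the function names `zero_byte_mask` and `highest_zero_byte` are hypothetical, and the second helper is only one possible reading of the "divide and conquer" step, assuming the most significant byte counts as index 0 (as it would for the first byte in memory on big-endian sparc).

```c
#include <stdint.h>

/*
 * Return a word with 0x80 in every byte position where x holds a zero
 * byte, and 0x00 everywhere else. This is the formula from the commit
 * message: ~(((x & 0x7f7f7f7f) + 0x7f7f7f7f) | x | 0x7f7f7f7f).
 *
 * Per byte, adding 0x7f to the low seven bits sets bit 7 exactly when
 * those bits are non-zero, and the sum never carries into the next byte.
 * OR-ing in x covers bytes whose own bit 7 was set.
 */
static inline uint32_t zero_byte_mask(uint32_t x)
{
    const uint32_t low7 = 0x7f7f7f7fu;

    return ~(((x & low7) + low7) | x | low7);
}

/*
 * Hypothetical helper: index of the highest 0x80 marker in mask, where
 * index 0 is the most significant byte. Returns 4 if mask is empty.
 * The search halves the candidate range at each step.
 */
static inline unsigned int highest_zero_byte(uint32_t mask)
{
    unsigned int idx = 0;

    if (mask == 0)
        return 4;
    if (!(mask & 0xffff0000u)) {   /* nothing in the upper half */
        idx += 2;
        mask <<= 16;
    }
    if (!(mask & 0xff000000u))     /* nothing in the top byte */
        idx += 1;
    return idx;
}
```

For a word such as 0x41004242, the mask marks only the second byte, so a word-at-a-time copy loop could stop there instead of testing each byte individually.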

84 Commits