So, forever, we've had this ptrace_signal_deliver implementation
which tries to handle all of the nasties that can occur when the
debugger looks at a process about to take a signal. It's meant
to address all of these issues inside of the kernel so that the
debugger need not be mindful of such things.
Problem is, this doesn't work.
The idea was that we should do the syscall restart business first, so
that the debugger captures that state. Otherwise, if the debugger for
example saves the child's state, makes the child execute something
else, then restores the saved state, we won't handle the syscall
restart properly because we lose the "we're in a syscall" state.
The code here worked for most cases, but if the debugger actually
passes the signal through to the child unaltered, it's possible that
we would do a syscall restart when we shouldn't have.
In particular this breaks the case of debugging a process under a gdb
which is being debugged by yet another gdb. gdb uses sigsuspend
to wait for SIGCHLD of the inferior, but if gdb itself is being
debugged by a top-level gdb we get a ptrace_stop(). The top-level gdb
does a PTRACE_CONT with SIGCHLD to let the inferior gdb see the
signal. But ptrace_signal_deliver() assumed the debugger would cancel
out the signal and therefore did a syscall restart, because the return
error was ERESTARTNOHAND.
Fix this by simply making ptrace_signal_deliver() a nop, and providing
a way for the debugger to control system call restarting properly:
1) Report a "in syscall" software bit in regs->{tstate,psr}.
It is set early on in trap entry to a system call and is fully
visible to the debugger via ptrace() and regsets.
2) Test this bit right before doing a syscall restart. We have
to do a final recheck right after get_signal_to_deliver() in
case the debugger cleared the bit during ptrace_stop().
3) Clear the bit in trap return so we don't accidently try to set
that bit in the real register.
As a result we also get a ptrace_{is,clear}_syscall() for sparc32 just
like sparc64 has.
M68K has this same exact bug, and is now the only other user of the
ptrace_signal_deliver hook. It needs to be fixed in the same exact
way as sparc.
Signed-off-by: David S. Miller <davem@davemloft.net>
Forever we had a PTRACE_SUNOS_DETACH which was unconditionally
recognized, regardless of the personality of the process.
Unfortunately, this value is what ended up in the GLIBC sys/ptrace.h
header file on sparc as PTRACE_DETACH and PT_DETACH.
So continue to recognize this old value. Luckily, it doesn't conflict
with anything we actually care about.
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to be more liberal about the alignment of the buffer given to
us by sigaltstack(). The user should not need to be mindful of all of
the alignment constraints we have for the stack frame.
This mirrors how we handle this situation in clone() as well.
Also, we align the stack even in non-SA_ONSTACK cases so that signals
due to bad stack alignment can be delivered properly. This makes such
errors easier to debug and recover from.
Finally, add the sanity check x86 has to make sure we won't overflow
the signal stack.
This fixes glibc testcases nptl/tst-cancel20.c and
nptl/tst-cancelx20.c
Signed-off-by: David S. Miller <davem@davemloft.net>
We clobber %i1 as well as %i0 for these system calls,
because they give two return values.
Therefore, on error, we have to restore %i1 properly
or else the restart explodes since it uses the wrong
arguments.
This fixes glibc's nptl/tst-eintr1.c testcase.
Signed-off-by: David S. Miller <davem@davemloft.net>
The PROM library function prom_meminit() builds a table,
prom_phys_avail[], just so that probe_memory() in
arch/sparc/mm/fault.c can copy it into sp_banks[].
Just have prom_meminit() fill in the sp_banks[] array directly, and
remove duplicated sort() function.
Signed-off-by: David S. Miller <davem@davemloft.net>
The code in arch/sparc/prom/memory.c computes three tables, the list
of total memory, the list of available memory (total minus what
firmware is using), and the list of firmware taken memory.
Only the available memory list is even used.
Therefore, kill those unused tables and make prom_meminfo() return
just the available memory list.
Signed-off-by: David S. Miller <davem@davemloft.net>
Current limitations:
1) On SMP single stepping has some fundamental issues,
shared with other sw single-step architectures such
as mips and arm.
2) On 32-bit sparc we don't support SMP kgdb yet. That
requires some reworking of the IPI mechanisms and
infrastructure on that platform.
Signed-off-by: David S. Miller <davem@davemloft.net>
Completely unused, and it just makes the SMP message
passing code on 32-bit sparc look more complex than
it is.
Signed-off-by: David S. Miller <davem@davemloft.net>
Back around the same time we were bootstrapping the first 32-bit sparc
Linux kernel with a SunOS userland, we made the signal frame match
that of SunOS.
By the time we even started putting together a native Linux userland
for 32-bit Sparc we realized this layout wasn't sufficient for Linux's
needs.
Therefore we changed the layout, yet kept support for the old style
signal frame layout in there. The detection mechanism is that we had
sys_sigaction() start passing in a negative signal number to indicate
"new style signal frames please".
Anyways, no binaries exist in the world that use the old stuff. In
fact, I bet Jakub Jelinek and myself are the only two people who ever
had such binaries to be honest.
So let's get rid of this stuff.
I added an assertion using WARN_ON_ONCE() that makes sure 32-bit
applications are passing in that negative signal number still.
Signed-off-by: David S. Miller <davem@davemloft.net>
The following cleanups are now possible:
- arch/sparc/kernel/entry.S:ret_sys_call no longer has to be global
- arch/sparc/kernel/signal.c:sys_sigpause() can be removed
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
- mark timer_interrupt() static
- sparc_floppy_request_irq() prototype should use irq_handler_t
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Semaphores are no longer performance-critical, so a generic C
implementation is better for maintainability, debuggability and
extensibility. Thanks to Peter Zijlstra for fixing the lockdep
warning. Thanks to Harvey Harrison for pointing out that the
unlikely() was unnecessary.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
ext4 uses ZERO_PAGE(0) to zero out blocks. We need to export
different symbols in different arches for the usage of ZERO_PAGE
in modules.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Almost all implementations of pci_iomap() in the kernel, including the generic
lib/iomap.c one, copies the content of a struct resource into unsigned long's
which will break on 32 bits platforms with 64 bits resources.
This fixes all definitions of pci_iomap() to use resource_size_t. I also
"fixed" the 64bits arch for consistency.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1) ptrace should pass 'current' to task_user_regset_view()
2) When fetching general registers using a 64-bit view, and
the target is 32-bit, we have to convert.
3) Skip the whole register window get/set code block if
the user isn't asking to access anything in there.
Otherwise we have problems if the user doesn't have
an address space setup. Fetching ptrace register is
still valid at such a time, and ptrace does not try
to access the register window area of the regset.
Signed-off-by: David S. Miller <davem@davemloft.net>
Reported by Adrian Bunk.
Just like in changeset a3f9985843
("[SPARC64]: Move kernel unaligned trap handlers into assembler
file.") we have to move the assembler bits into a seperate
asm file because as far as the compiler is concerned
these inline bits we're doing in unaligned.c are unreachable.
Signed-off-by: David S. Miller <davem@davemloft.net>
__FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CC [M] arch/sparc/kernel/led.o
arch/sparc/kernel/led.c: In function 'led_blink':
arch/sparc/kernel/led.c:35: error: invalid use of undefined type 'struct
timer_list'
arch/sparc/kernel/led.c:35: error: 'jiffies' undeclared (first use in
this function)
arch/sparc/kernel/led.c:35: error: (Each undeclared identifier is
reported only once
arch/sparc/kernel/led.c:35: error: for each function it appears in.)
arch/sparc/kernel/led.c:36: error: 'avenrun' undeclared (first use in
this function)
arch/sparc/kernel/led.c:36: error: 'FSHIFT' undeclared (first use in
this function)
arch/sparc/kernel/led.c:36: error: 'HZ' undeclared (first use in this
function)
arch/sparc/kernel/led.c:37: error: invalid use of undefined type 'struct
timer_list'
arch/sparc/kernel/led.c:39: error: invalid use of undefined type 'struct
timer_list'
arch/sparc/kernel/led.c:40: error: invalid use of undefined type 'struct
timer_list'
arch/sparc/kernel/led.c:42: error: implicit declaration of function
'add_timer'
arch/sparc/kernel/led.c: In function 'led_write_proc':
arch/sparc/kernel/led.c:70: error: implicit declaration of function
'copy_from_user'
arch/sparc/kernel/led.c:84: error: implicit declaration of function
'del_timer_sync'
arch/sparc/kernel/led.c: In function 'led_init':
arch/sparc/kernel/led.c:109: error: implicit declaration of function
'init_timer'
arch/sparc/kernel/led.c:110: error: invalid use of undefined type
'struct timer_list'
make[1]: *** [arch/sparc/kernel/led.o] Error 1
Based upon original patch by Robert Reif.
Signed-off-by: David S. Miller <davem@davemloft.net>
The idea of this thing is we could save/restore the firmware's
palette when breaking in and out of the firmware prompt.
Only one driver implemented this (atyfb) and it's value is
questionable. If you're just debugging you don't really
care that the characters end up being purple or whatever.
And we can provide better debugging and firmware command
facilities with minimal in-kernel console I/O drivers.
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit d3d74453c3 ("hrtimer: fixup the
HRTIMER_CB_IRQSAFE_NO_SOFTIRQ fallback") broke several archs, and since
only Russell bothered to merge the fix, and Greg to ACK his arch, I'm
sending this for merger.
I have confirmation that the Alpha bit results in a booting kernel.
That leaves: blackfin, frv, sh and sparc untested.
The deadlock in question was found by Russell:
IRQ handle
-> timer_tick() - xtime seqlock held for write
-> update_process_times()
-> run_local_timers()
-> hrtimer_run_queues()
-> hrtimer_get_softirq_time() - tries to get a read lock
Now, Thomas assures me the fix is trivial, only do_timer() needs to be
done under the xtime_lock, and update_process_times() can savely be
removed from under it.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Greg Ungerer <gerg@uclinux.org>
CC: Richard Henderson <rth@twiddle.net>
CC: Bryan Wu <bryan.wu@analog.com>
CC: David Howells <dhowells@redhat.com>
CC: Paul Mundt <lethal@linux-sh.org>
CC: William Irwin <wli@holomorphy.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are no callers of this on the Sparc platforms.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
To allow flexible configuration of IDE introduce HAVE_IDE.
All archs except arm, um and s390 unconditionally select it.
For arm the actual configuration determine if IDE is supported.
This is a step towards introducing drivers/Kconfig for arm.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Russell King - ARM Linux <linux@arm.linux.org.uk>
Acked-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Background: I've implemented 1K/2K page tables for s390. These sub-page
page tables are required to properly support the s390 virtualization
instruction with KVM. The SIE instruction requires that the page tables
have 256 page table entries (pte) followed by 256 page status table entries
(pgste). The pgstes are only required if the process is using the SIE
instruction. The pgstes are updated by the hardware and by the hypervisor
for a number of reasons, one of them is dirty and reference bit tracking.
To avoid wasting memory the standard pte table allocation should return
1K/2K (31/64 bit) and 2K/4K if the process is using SIE.
Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means
the s390 version for pte_alloc_one cannot return a pointer to a struct
page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one
cannot return a pointer to a pte either, since that would require more than
32 bit for the return value of pte_alloc_one (and the pte * would not be
accessible since its not kmapped).
Solution: The only solution I found to this dilemma is a new typedef: a
pgtable_t. For s390 pgtable_t will be a (pte *) - to be introduced with a
later patch. For everybody else it will be a (struct page *). The
additional problem with the initialization of the ptl lock and the
NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and
a destructor pgtable_page_dtor. The page table allocation and free
functions need to call these two whenever a page table page is allocated or
freed. pmd_populate will get a pgtable_t instead of a struct page pointer.
To get the pgtable_t back from a pmd entry that has been installed with
pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page
call in free_pte_range and apply_to_pte_range.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the conversion factor between jiffies and milli- or microseconds is
not a single multiply or divide, as for the case of HZ == 300, we currently
do a multiply followed by a divide. The intervening result, however, is
subject to overflows, especially since the fraction is not simplified (for
HZ == 300, we multiply by 300 and divide by 1000).
This is exposed to the user when passing a large timeout to poll(), for
example.
This patch replaces the multiply-divide with a reciprocal multiplication on
32-bit platforms. When the input is an unsigned long, there is no portable
way to do this on 64-bit platforms there is no portable way to do this
since it requires a 128-bit intermediate result (which gcc does support on
64-bit platforms but may generate libgcc calls, e.g. on 64-bit s390), but
since the output is a 32-bit integer in the cases affected, just simplify
the multiply-divide (*3/10 instead of *300/1000).
The reciprocal multiply used can have off-by-one errors in the upper half
of the valid output range. This could be avoided at the expense of having
to deal with a potential 65-bit intermediate result. Since the intent is
to avoid overflow problems and most of the other time conversions are only
semiexact, the off-by-one errors were considered an acceptable tradeoff.
At Ralf Baechle's suggestion, this version uses a Perl script to compute
the necessary constants. We already have dependencies on Perl for kernel
compiles. This does, however, require the Perl module Math::BigInt, which
is included in the standard Perl distribution starting with version 5.8.0.
In order to support older versions of Perl, include a table of canned
constants in the script itself, and structure the script so that
Math::BigInt isn't required if pulling values from said table.
Running the script requires that the HZ value is available from the
Makefile. Thus, this patch also adds the Kconfig variable CONFIG_HZ to the
architectures which didn't already have it (alpha, cris, frv, h8300, m32r,
m68k, m68knommu, sparc, v850, and xtensa.) It does *not* touch the sh or
sh64 architectures, since Paul Mundt has dealt with those separately in the
sh tree.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Ralf Baechle <ralf@linux-mips.org>,
Cc: Sam Ravnborg <sam@ravnborg.org>,
Cc: Paul Mundt <lethal@linux-sh.org>,
Cc: Richard Henderson <rth@twiddle.net>,
Cc: Michael Starvik <starvik@axis.com>,
Cc: David Howells <dhowells@redhat.com>,
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>,
Cc: Hirokazu Takata <takata@linux-m32r.org>,
Cc: Geert Uytterhoeven <geert@linux-m68k.org>,
Cc: Roman Zippel <zippel@linux-m68k.org>,
Cc: William L. Irwin <sparclinux@vger.kernel.org>,
Cc: Chris Zankel <chris@zankel.net>,
Cc: H. Peter Anvin <hpa@zytor.com>,
Cc: Jan Engelhardt <jengelh@computergmbh.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Suppress A.OUT library support if CONFIG_ARCH_SUPPORTS_AOUT is not set.
Not all architectures support the A.OUT binfmt, so the ELF binfmt should not
be permitted to go looking for A.OUT libraries to load in such a case. Not
only that, but under such conditions A.OUT core dumps are not produced either.
To make this work, this patch also does the following:
(1) Makes the existence of the contents of linux/a.out.h contingent on
CONFIG_ARCH_SUPPORTS_AOUT.
(2) Renames dump_thread() to aout_dump_thread() as it's only called by A.OUT
core dumping code.
(3) Moves aout_dump_thread() into asm/a.out-core.h and makes it inline. This
is then included only where needed. This means that this bit of arch
code will be stored in the appropriate A.OUT binfmt module rather than
the core kernel.
(4) Drops A.OUT support for Blackfin (according to Mike Frysinger it's not
needed) and FRV.
This patch depends on the previous patch to move STACK_TOP[_MAX] out of
asm/a.out.h and into asm/processor.h as they're required whether or not A.OUT
format is available.
[jdike@addtoit.com: uml: re-remove accidentally restored code]
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mark arches that support A.OUT format by including the following in their
master Kconfig files:
config ARCH_SUPPORTS_AOUT
def_bool y
This should also be set if the arch provides compatibility A.OUT support for
an older arch, for instance x86_64 for i386 or sparc64 for sparc.
I've guessed at which arches don't, based on comments in the code, however I'm
sure that some of the ones I've marked as 'yes' actually should be 'no'.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC32]: Use regsets in arch_ptrace().
[SPARC64]: Use regsets in arch_ptrace().
[SPARC32]: Use regsets for ELF core dumping.
[SPARC64]: Use regsets for ELF core dumping.
[SPARC64]: Remove unintentional ptrace debugging messages.
[SPARC]: Move over to arch_ptrace().
[SPARC]: Remove PTRACE_SUN* handling.
[SPARC]: Kill DEBUG_PTRACE code.
[SPARC32]: Add user regset support.
[SPARC64]: Add user regsets.
[SPARC64]: Fix booting on non-zero cpu.
This patchset adds a flags variable to reserve_bootmem() and uses the
BOOTMEM_EXCLUSIVE flag in crashkernel reservation code to detect collisions
between crashkernel area and already used memory.
This patch:
Change the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE.
If that flag is set, the function returns with -EBUSY if the memory already
has been reserved in the past. This is to avoid conflicts.
Because that code runs before SMP initialisation, there's no race condition
inside reserve_bootmem_core().
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix powerpc build]
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: <linux-arch@vger.kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Supporting SunOS ptrace() is pretty pointless and these
kinds of quirks keep us from being able to share more
code with other platforms.
Signed-off-by: David S. Miller <davem@davemloft.net>