Annotate __switch_to() so that the function graph tracer does not try to
trace it. Use __notrace_funcgraph, as opposed to notrace, so that other
tracers can continue to trace __switch_to().
The reason that we don't want to trace __switch_to() with the function
graph tracer is because of how the return address stack in task_struct
is implemented. When we enter __switch_to we store the real return
address on prev's ret_stack. When we return from __switch_to() we've
patched the return address on the kernel stack to be
return_to_handler. Calling return_to_handler we do,
-> ftrace_return_to_handler()
-> ftrace_pop_return_ftrace()
Which tries to pop the real return address from current->ret_stack. The
problem being that we stored the return address on prev->ret_stack, but
current now points to next, and next->ret_stack doesn't contain the
correct return address (and is possibly even empty).
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Add both dynamic and static function graph tracer support for sh.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Enable kernel stack checking code in both the dynamic ftrace and mcount
code paths. Check the stack to see if it's overflowing and make sure
that the stack pointer contains an address that's either in init_stack
or after the bss.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
nilfs2: fix disorder in cp count on error during deleting checkpoints
nilfs2: fix lockdep warning between regular file and inode file
nilfs2: fix incorrect KERN_CRIT messages in case of write failures
nilfs2: fix hang problem of log writer which occurs after write failures
nilfs2: remove unlikely directive causing mis-conversion of error code
* 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
dma-debug: Fix the overlap() function to be correct and readable
oprofile: reset bt_lost_no_mapping with other stats
x86/oprofile: rename kernel parameter for architectural perfmon to arch_perfmon
signals: declare sys_rt_tgsigqueueinfo in syscalls.h
rcu: Mark Hierarchical RCU no longer experimental
dma-debug: Put all hash-chain locks into the same lock class
dma-debug: fix off-by-one error in overlap function
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (50 commits)
perf report: Add "Fractal" mode output - support callchains with relative overhead rate
perf_counter tools: callchains: Manage the cumul hits on the fly
perf report: Change default callchain parameters
perf report: Use a modifiable string for default callchain options
perf report: Warn on callchain output request from non-callchain file
x86: atomic64: Inline atomic64_read() again
x86: atomic64: Clean up atomic64_sub_and_test() and atomic64_add_negative()
x86: atomic64: Improve atomic64_xchg()
x86: atomic64: Export APIs to modules
x86: atomic64: Improve atomic64_read()
x86: atomic64: Code atomic(64)_read and atomic(64)_set in C not CPP
x86: atomic64: Fix unclean type use in atomic64_xchg()
x86: atomic64: Make atomic_read() type-safe
x86: atomic64: Reduce size of functions
x86: atomic64: Improve atomic64_add_return()
x86: atomic64: Improve cmpxchg8b()
x86: atomic64: Improve atomic64_read()
x86: atomic64: Move the 32-bit atomic64_t implementation to a .c file
x86: atomic64: The atomic64_t data type should be 8 bytes aligned on 32-bit too
perf report: Annotate variable initialization
...
Optimize cond_resched() by removing one conditional.
Currently cond_resched() checks system_state ==
SYSTEM_RUNNING in order to avoid scheduling before the
scheduler is running.
We can however, as per suggestion of Matt, use
PREEMPT_ACTIVE to accomplish that very same.
Suggested-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull the initial preempt_count value into a single
definition site.
Maintainers for: alpha, ia64 and m68k, please have a look,
your arch code is funny.
The header magic is a bit odd, but similar to the KERNEL_DS
one, CPP waits with expanding these macros until the
INIT_THREAD_INFO macro itself is expanded, which is in
arch/*/kernel/init_task.c where we've already included
sched.h so we're good.
Cc: tony.luck@intel.com
Cc: rth@twiddle.net
Cc: geert@linux-m68k.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus noticed how unclean and buggy the overlap() function is:
- It uses convoluted (and bug-causing) positive checks for
range overlap - instead of using a more natural negative
check.
- Even the positive checks are buggy: a positive intersection
check has four natural cases while we checked only for three,
missing the (addr < start && addr2 == end) case for example.
- The variables are mis-named, making it non-obvious how the
check was done.
- It needlessly uses u64 instead of unsigned long. Since these
are kernel memory pointers and we explicitly exclude highmem
ranges anyway we cannot ever overflow 32 bits, even if we
could. (and on 64-bit it doesnt matter anyway)
All in one, this function needs a total revamp. I used Linus's
suggestions minus the paranoid checks (we cannot overflow really
because if we get totally bad DMA ranges passed far more things
break in the systems than just DMA debugging). I also fixed a
few other small details i noticed.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: Fix trace_print_seq()
kprobes: No need to unlock kprobe_insn_mutex
tracing/fastboot: Document the need of initcall_debug
trace_export: Repair missed fields
tracing: Fix stack tracer sysctl handling
In case memory is scarce, we now default to oom_cfqq. Once memory is
available again, we should allocate a new cfqq and stop using oom_cfqq for
a particular io context.
Once a new request comes in, check if we are using oom_cfqq, and if yes,
try to allocate a new cfqq.
Tested the patch by forcing the use of oom_cfqq and upon next request thread
realized that it was using oom_cfqq and it allocated a new cfqq.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
I overlooked SG_DXFER_TO_FROM_DEV support when I converted sg to use
the block layer mapping API (2.6.28).
Douglas Gilbert explained SG_DXFER_TO_FROM_DEV:
http://www.spinics.net/lists/linux-scsi/msg37135.html
=
The semantics of SG_DXFER_TO_FROM_DEV were:
- copy user space buffer to kernel (LLD) buffer
- do SCSI command which is assumed to be of the DATA_IN
(data from device) variety. This would overwrite
some or all of the kernel buffer
- copy kernel (LLD) buffer back to the user space.
The idea was to detect short reads by filling the original
user space buffer with some marker bytes ("0xec" it would
seem in this report). The "resid" value is a better way
of detecting short reads but that was only added this century
and requires co-operation from the LLD.
=
This patch changes the block layer mapping API to support this
semantics. This simply adds another field to struct rq_map_data and
enables __bio_copy_iov() to copy data from user space even with READ
requests.
It's better to add the flags field and kills null_mapped and the new
from_user fields in struct rq_map_data but that approach makes it
difficult to send this patch to stable trees because st and osst
drivers use struct rq_map_data (they were converted to use the block
layer in 2.6.29 and 2.6.30). Well, I should clean up the block layer
mapping API.
zhou sf reported this regiression and tested this patch:
http://www.spinics.net/lists/linux-scsi/msg37128.htmlhttp://www.spinics.net/lists/linux-scsi/msg37168.html
Reported-by: zhou sf <sxzzsf@gmail.com>
Tested-by: zhou sf <sxzzsf@gmail.com>
Cc: stable@kernel.org
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Currently, blk_scsi_ioctl_init() is not called since it lacks
an initcall marking. This causes the command table to be
unitialized, hence somce commands are block when they should
not have been.
This fixes a regression introduced by commit
018e044689
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Commit 1faa16d228 accidentally broke
the bdi congestion wait queue logic, causing us to wait on congestion
for WRITE (== 1) when we really wanted BLK_RW_ASYNC (== 0) instead.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
The bt_lost_no_mapping is not getting reset at the start of a
profiling run, thus the oprofiled.log shows erroneous values for this
statistic. The attached patch fixes this problem.
Signed-off-by: Maynard Johnson <maynardj@us.ibm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch converts the sh architecture to use the new linker script
macros in include/asm-generic/vmlinux.lds.h.
Signed-off-by: Tim Abbott <tabbott@ksplice.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: linux-sh@vger.kernel.org
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] w83627hf_wdt.c: add support for the W83627EHF support
[WATCHDOG] SA1100 watchdog maximum timeout
[WATCHDOG] w83697ug, fix lock imbalance
[WATCHDOG] drivers/watchdog/bcm47xx_wdt.c: Remove unnecessary semicolons
Remove call to the mx3fb_set_par() and the mx3fb_blank() before the
register_framebuffer().
This fixes a problem with uninitialized the fb_info->mm_lock mutex
introduced by the commit 537a1bf059 " fbdev: add mutex for fb_mmap
locking"
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove call to the fb_set_par() before the register_framebuffer().
This fixes a problem with uninitialized the fb_info->mm_lock mutex
introduced by the commit 537a1bf059 " fbdev: add mutex for fb_mmap
locking"
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: when ATTR_READONLY is set, only clear write bits on non-directories
cifs: remove cifsInodeInfo->inUse counter
cifs: convert cifs_get_inode_info and non-posix readdir to use cifs_iget
[CIFS] update cifs version number
cifs: add and use CIFSSMBUnixSetFileInfo for setattr calls
cifs: make a separate function for filling out FILE_UNIX_BASIC_INFO
cifs: rename CIFSSMBUnixSetInfo to CIFSSMBUnixSetPathInfo
cifs: add pid of initiating process to spnego upcall info
cifs: fix regression with O_EXCL creates and optimize away lookup
cifs: add new cifs_iget function and convert unix codepath to use it
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits)
cxgb3: Fix crash caused by stashing wrong netdev_queue
ixgbe: Fix coexistence of FCoE and Flow Director in 82599
memory barrier: adding smp_mb__after_lock
net: adding memory barrier to the poll and receive callbacks
netpoll: Fix carrier detection for drivers that are using phylib
includecheck fix: include/linux, rfkill.h
p54: tx refused but queue active
Atheros Kconfig needs to be dependent on WLAN_80211
mac80211: fix docbook
mac80211_hwsim: avoid NULL access
ssb: Add support for 4318E
b43: Add support for 4318E
zd1211rw: adding SONY IFU-WLM2 (054c:0257) as a zd1211b device
zd1211rw: 07b8:6001 is a ZD1211B
r6040: bump driver version to 0.24 and date to 08 July 2009
r6040: restore MIER register correctly when IRQ line is shared
ipv4: Fix fib_trie rebalancing, part 4 (root thresholds)
davinci_emac: fix kernel oops when changing MAC address while interface is down
igb: set lan id prior to configuring phy
mac80211: minstrel: avoid accessing negative indices in rix_to_ndx()
...
Looks like the change in ad361c9884
wasn't compile tested.
Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The short name of the achitecture is 'arch_perfmon'. This patch
changes the kernel parameter to use this name.
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit c3a8c5b6 ("cxgb3: move away from LLTX") exposed a bug in how
cxgb3 looks up the netdev_queue it stashes away in a qset during
initialization. For multiport devices, the TX queue index it uses is
offset by the first_qset index of each port. This leads to a crash
once LLTX is removed, since hard_start_xmit is called with one TX
queue lock held, while the TX reclaim timer task grabs a different
(wrong) TX queue lock when it frees skbs.
Fix this by removing the first_qset offset used to look up the TX
queue passed into t3_sge_alloc_qset() from setup_sge_qsets().
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Divy Le Ray <divy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix coexistence of Fiber Channel over Ethernet (FCoE) and Flow Director (FDIR)
in 82599 and remove the disabling of FDIR when FCoE is enabled.
Currently, FDIR is turned off when FCoE is enabled under the assumption that
FCoE is always enabled with DCB being turned on. However, FDIR does not have
to be turned off all the time when FCoE is enabled since FCoE can be enabled
without DCB being turned on, e.g., use link pause only. This patch makes sure
that when DCB is turned on or off, FDIR is turned on or off correspondingly;
and when FCoE is enabled, it does not disable FDIR, rather, it will have FDIR
set up properly so FCoE and FDIR can coexist regardless of DCB being on or off.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding smp_mb__after_lock define to be used as a smp_mb call after
a lock.
Making it nop for x86, since {read|write|spin}_lock() on x86 are
full memory barriers.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding memory barrier after the poll_wait function, paired with
receive callbacks. Adding fuctions sock_poll_wait and sk_has_sleeper
to wrap the memory barrier.
Without the memory barrier, following race can happen.
The race fires, when following code paths meet, and the tp->rcv_nxt
and __add_wait_queue updates stay in CPU caches.
CPU1 CPU2
sys_select receive packet
... ...
__add_wait_queue update tp->rcv_nxt
... ...
tp->rcv_nxt check sock_def_readable
... {
schedule ...
if (sk->sk_sleep && waitqueue_active(sk->sk_sleep))
wake_up_interruptible(sk->sk_sleep)
...
}
If there was no cache the code would work ok, since the wait_queue and
rcv_nxt are opposit to each other.
Meaning that once tp->rcv_nxt is updated by CPU2, the CPU1 either already
passed the tp->rcv_nxt check and sleeps, or will get the new value for
tp->rcv_nxt and will return with new data mask.
In both cases the process (CPU1) is being added to the wait queue, so the
waitqueue_active (CPU2) call cannot miss and will wake up CPU1.
The bad case is when the __add_wait_queue changes done by CPU1 stay in its
cache, and so does the tp->rcv_nxt update on CPU2 side. The CPU1 will then
endup calling schedule and sleep forever if there are no more data on the
socket.
Calls to poll_wait in following modules were ommited:
net/bluetooth/af_bluetooth.c
net/irda/af_irda.c
net/irda/irnet/irnet_ppp.c
net/mac80211/rc80211_pid_debugfs.c
net/phonet/socket.c
net/rds/af_rds.c
net/rfkill/core.c
net/sunrpc/cache.c
net/sunrpc/rpc_pipe.c
net/tipc/socket.c
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cifs: when ATTR_READONLY is set, only clear write bits on non-directories
On windows servers, ATTR_READONLY apparently either has no meaning or
serves as some sort of queue to certain applications for unrelated
behavior. This MS kbase article has details:
http://support.microsoft.com/kb/326549/
Don't clear the write bits directory mode when ATTR_READONLY is set.
Reported-by: pouchat@peewiki.net
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs: remove cifsInodeInfo->inUse counter
It was purported to be a refcounter of some sort, but was never
used that way. It never served any purpose that wasn't served equally well
by the I_NEW flag.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs: convert cifs_get_inode_info and non-posix readdir to use cifs_iget
Rather than allocating an inode and filling it out, have
cifs_get_inode_info fill out a cifs_fattr and call cifs_iget. This means
a pretty hefty reorganization of cifs_get_inode_info.
For the readdir codepath, add a couple of new functions for filling out
cifs_fattr's from different FindFile response infolevels.
Finally, remove cifs_new_inode since there are no more callers.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs: add and use CIFSSMBUnixSetFileInfo for setattr calls
When there's an open filehandle, SET_FILE_INFO is apparently preferred
over SET_PATH_INFO. Add a new variant that sets a FILE_UNIX_INFO_BASIC
infolevel via SET_FILE_INFO and switch cifs_setattr_unix to use the
new call when there's an open filehandle available.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs: make a separate function for filling out FILE_UNIX_BASIC_INFO
The SET_FILE_INFO variant will need to do the same thing here. Break
this code out into a separate function that both variants can call.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs: rename CIFSSMBUnixSetInfo to CIFSSMBUnixSetPathInfo
...in preparation of adding a SET_FILE_INFO variant.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
cifs: add pid of initiating process to spnego upcall info
This will allow the upcall to poke in /proc/<pid>/environ and get
the value of the $KRB5CCNAME env var for the process.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Add support for the W83627EHF/EF and W83627EHG/EG chipsets.
Signed-off-by: Slobodan Tomić <stomic@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
This patch replaces the hardcoded 255 seconds limit for a real limit based on
oscr_freq.
Also, the 'firmware_version' field is changed to '1' to allow the user
space application to easily detect that this driver supports a higher
maximum timeout.
Signed-off-by: Raphael Assenat <raph@8D.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Don't forget to unlock io_lock when w83697ug_select_wd_register fails in
wdt_ctrl.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Using early netconsole and gianfar driver this error pops up:
netconsole: timeout waiting for carrier
It appears that net/core/netpoll.c:netpoll_setup() is using
cond_resched() in a loop waiting for a carrier.
The thing is that cond_resched() is a no-op when system_state !=
SYSTEM_RUNNING, and so drivers/net/phy/phy.c's state_queue is never
scheduled, therefore link detection doesn't work.
I belive that the main problem is in cond_resched()[1], but despite
how the cond_resched() story ends, it might be a good idea to call
msleep(1) instead of cond_resched(), as suggested by Andrew Morton.
[1] http://lkml.org/lkml/2009/7/7/463
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit ad361c98 ("Remove multiple KERN_ prefixes from printk formats")
broke the build for fealnx because it added some "printk(PR_CONT ..."
calls, when PR_CONT doesn't exist; it should be "printk(KERN_CONT ..."
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove not needed locking of the fb_info->mm_lock mutex before a
frambuffer is registered.
This fixes a problem with uninitialized the fb_info->mm_lock mutex
introduced by the commit 537a1bf059 " fbdev: add mutex for fb_mmap
locking"
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove call to the fsl_diu_set_par before the register_framebuffer().
This fixes a problem with uninitialized the fb_info->mm_lock mutex
introduced by the commit 537a1bf059 " fbdev: add mutex for fb_mmap
locking"
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Tested-by: "Kai Jiang" <b18973@freescale.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fix the following 'make includecheck' warning:
include/linux/rfkill.h: linux/types.h is included more than once.
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>