This patch adds an optimised version of skb_cow that avoids the copy if
the header can be modified even if the rest of the payload is cloned.
This can be used in encapsulating paths where we only need to modify the
header. As it is, this can be used in PPPOE and bridging.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The clone argument is only used by one caller and that caller can clone
the packet itself. This patch moves the clone call into the caller and
kills the clone argument.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the hdr variable (which is copied into the skb)
and instead sets the header directly in the skb.
It also uses __skb_push instead of skb_push since we've just checked
using skb_cow for enough head room.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function __pppoe_xmit modifies the skb data and therefore it needs
to copy and skb data if it's cloned.
In fact, it currently allocates a new skb so that it can return 0 in
case of error without freeing the original skb. This is totally wrong
because returning zero is meant to indicate congestion whereupon pppoe
is supposed to wake up the upper layer once the congestion subsides.
This makes sense for ppp_async and ppp_sync but is out-of-place for
pppoe. This patch makes it always return 1 and free the skb.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The skb_unshare_check call needs to be made before pskb_may_pull,
not after.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the sctp_sockaddr_entry is now RCU enabled as part of
the patch to synchronize sctp_localaddr_list, it makes sense to
change all handling of these entries to RCU. This includes the
sctp_bind_addrs structure and it's list of bound addresses.
This list is currently protected by an external rw_lock and that
looks like an overkill. There are only 2 writers to the list:
bind()/bindx() calls, and BH processing of ASCONF-ACK chunks.
These are already seriealized via the socket lock, so they will
not step on each other. These are also relatively rare, so we
should be good with RCU.
The readers are varied and they are easily converted to RCU.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Sridhar Samdurala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sctp_localaddr_list is modified dynamically via NETDEV_UP
and NETDEV_DOWN events, but there is not synchronization
between writer (even handler) and readers. As a result,
the readers can access an entry that has been freed and
crash the sytem.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Sridhar Samdurala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/sched/sch_cbq.c: In function 'cbq_enqueue':
net/sched/sch_cbq.c:383: warning: 'ret' may be used uninitialized in this function
has been verified to be a bogus case. So let's shut it up.
Signed-off-by: Satyam Sharma <satyam@infradead.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit 95c385 broke proper source address selection for cases in which
there is a address which is makred 'deprecated'. The commit mistakenly
changed ifa->flags to ifa_result->flags (probably copy/paste error from a
few lines above) in the 'Rule 3' address selection code.
The patch restores the previous RFC-compliant behavior.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
When NR_CPUS is smaller than the cpu probed, let the user
know that the cpu won't be used.
Suggested by Al Viro.
Signed-off-by: David S. Miller <davem@davemloft.net>
As noted by Al Viro, when we try to call prom_set_trap_table()
in the SMP trampoline code we try to take the PROM call spinlock
which doesn't work because the current thread pointer isn't
valid yet and lockdep depends upon that being correct.
Furthermore, we cannot set the current thread pointer register
because it can't be properly dereferenced until we return from
prom_set_trap_table(). Kernel TLB misses only work after that
call.
So do the PROM call to set the trap table directly instead of
going through the OBP library C code, and thus avoid the lock
altogether.
These calls are guarenteed to be serialized fully.
Since there are now no calls to the prom_set_trap_table{_sun4v}()
library functions, they can be deleted.
Signed-off-by: David S. Miller <davem@davemloft.net>
Taking a cpu offline removes the cpu from the online mask before the
CPU_DEAD notification is done. The clock events layer does the cleanup
of the dead CPU from the CPU_DEAD notifier chain. tick_do_timer_cpu is
used to avoid xtime lock contention by assigning the task of jiffies
xtime updates to one CPU. If a CPU is taken offline, then this
assignment becomes stale. This went unnoticed because most of the time
the offline CPU went dead before the online CPU reached __cpu_die(),
where the CPU_DEAD state is checked. In the case that the offline CPU did
not reach the DEAD state before we reach __cpu_die(), the code in there
goes to sleep for 100ms. Due to the stale time update assignment, the
system is stuck forever.
Take the assignment away when a cpu is not longer in the cpu_online_mask.
We do this in the last call to tick_nohz_stop_sched_tick() when the offline
CPU is on the way to the final play_dead() idle entry.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
When a cpu goes offline it is removed from the broadcast masks. If the
mask becomes empty the code shuts down the broadcast device. This is
wrong, because the broadcast device needs to be ready for the online
cpu going idle (into a c-state, which stops the local apic timer).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The jinxed VAIO refuses to resume without hitting keys on the keyboard
when this is not enforced. It is unclear why the cpu ends up in a lower
C State without notifying the clock events layer, but enforcing the
oneshot broadcast here is safe.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reevaluate C/P/T states when a cpu becomes online. This avoids
the caching of the broadcast information in the clockevents layer.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Len Brown <len.brown@intel.com>
Timekeeping resume adjusts xtime by adding the slept time in seconds and
resets the reference value of the clock source (clock->cycle_last).
clock->cycle last is used to calculate the delta between the last xtime
update and the readout of the clock source in __get_nsec_offset(). xtime
plus the offset is the current time. The resume code ignores the delta
which had already elapsed between the last xtime update and the actual
time of suspend. If the suspend time is short, then we can see time
going backwards on resume.
Suspend:
offs_s = clock->read() - clock->cycle_last;
now = xtime + offs_s;
timekeeping_suspend_time = read_rtc();
Resume:
sleep_time = read_rtc() - timekeeping_suspend_time;
xtime.tv_sec += sleep_time;
clock->cycle_last = clock->read();
offs_r = clock->read() - clock->cycle_last;
now = xtime + offs_r;
if sleep_time_seconds == 0 and offs_r < offs_s, then time goes
backwards.
Fix this by storing the offset from the last xtime update and add it to
xtime during resume, when we reset clock->cycle_last:
sleep_time = read_rtc() - timekeeping_suspend_time;
xtime.tv_sec += sleep_time;
xtime += offs_s; /* Fixup xtime offset at suspend time */
clock->cycle_last = clock->read();
offs_r = clock->read() - clock->cycle_last;
now = xtime + offs_r;
Thanks to Marcelo for tracking this down on the OLPC and providing the
necessary details to analyze the root cause.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Tosatti <marcelo@kvack.org>
Lockdep complains about the access of rtc in timekeeping_suspend
inside the interrupt disabled region of the write locked xtime lock.
Move the access outside.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <johnstul@us.ibm.com>
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
drivers/net/pcmcia/3c589_cs: fix port configuration switcheroo
sk98lin: resurrect driver
ucc_geth: fix compilation
mv643xx_eth: Fix tx_bytes stats calculation
As struct iw_point is bi-directional payload, we should copy back the content
[PATCH] bcm43xx: Fix cancellation of work queue crashes
spidernet: fix interrupt reason recognition
ehea: fix last_rx update
ehea: propagate physical port state
Fix a lock problem in generic phy code
sky2: restore multicast list on resume and other ops
atl1: disable broken 64-bit DMA
This reverts commit e1abecc489.
The driver works on some hardware that skge doesn't handle yet.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Currently qe_bd_t is used in the macro call -- dma_unmap_single,
which is a no-op on PPC32, thus error is hidden today. Starting
with 2.6.24, macro will be replaced by the empty static function,
and erroneous use of qe_bd_t will trigger compilation error.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add Guards around TIOCSLCKTRMIOS and TIOCGLCKTRMIOS.
Several architectures are still broken. Put temporary-for-2.6.23 ifdef guards
around the offending code.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by:: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.linux-xtensa.org/kernel/xtensa-feed:
[patch 1/2] Xtensa: enable arbitary tty speed setting ioctls
[patch 2/2] xtensa console.c: remove duplicate #include
[XTENSA] Add support for cache-aliasing
[XTENSA] Add kernel module support
[XTENSA] Add support for executable/non-executable feature in the mmu
[XTENSA] Use the generic version of get_order
[XTENSA] Initialize semaphore_wake_lock
[XTENSA] Add typecast macro for constants
[XTENSA] Fix timer instabilities.
[XTENSA] Fix fadvise64_64
[XTENSA] Remove extraneous include statement
[XTENSA] Move string-io functions to io.c from pci.c
[XTENSA] Move pre-initialized structures to init_task.c
[XTENSA] Add freestanding option to CFLAGS
[XTENSA] Add getpgrp system-call to unistd.h
[XTENSA] add missing system calls
[XTENSA] fix wrong usage of __init and __initdata in traps.c
(with no apologies to C Heston)
On Mon, 2007-10-09 at 21:00 +0800, Herbert Xu wrote:
On Sun, Sep 02, 2007 at 01:11:29PM +0000, Christian Kujau wrote:
> >
> > after upgrading to 2.6.23-rc5 (and applying davem's fix [0]), lockdep
> > was quite noisy when I tried to shape my external (wireless) interface:
> >
> > [ 6400.534545] FahCore_78.exe/3552 just changed the state of lock:
> > [ 6400.534713] (&dev->ingress_lock){-+..}, at: [<c038d595>]
> > netif_receive_skb+0x2d5/0x3c0
> > [ 6400.534941] but this lock took another, soft-read-irq-unsafe lock in the
> > past:
> > [ 6400.535145] (police_lock){-.--}
>
> This is a genuine dead-lock. The police lock can be taken
> for reading with softirqs on. If a second CPU tries to take
> the police lock for writing, while holding the ingress lock,
> then a softirq on the first CPU can dead-lock when it tries
> to get the ingress lock.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) Comments suggest that setting optlen to zero will unbind
the socket from whatever device it might be attached to. This
hasn't been the case since at least 2.2.x because the first thing
this function does is return -EINVAL if 'optlen' is less than
sizeof(int).
This check also means that passing in a two byte string doesn't
work so well. It's almost as if this code was testing with "eth?"
patterned strings and nothing else :-)
Fix this by breaking the logic of this facility out into a
seperate function which validates optlen more appropriately.
The optlen==0 and small string cases now work properly.
2) We should reset the cached route of the socket after we have made
the device binding changes, not before.
Reported by Ben Greear.
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/cooloney/blackfin-2.6:
Blackfin arch: fix some bugs in lib/string.h functions found by our string testing modules
Blackfin arch: fix the aliased write macros
Blackfin arch: Update/Fix PM support add new pm_ops valid
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 4567/1: Fix 'Oops - undefined instruction' when CONFIG_VFP=y on non VFP device
[ARM] realview: disable second GIC on RevB MPCore platforms
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] 20Kc: Disable use of WAIT instruction.
[MIPS] Workaround for 4Kc machine check exception
[MIPS] Malta: Fix off by one bug in interrupt handler.
[MIPS] No ide_default_io_base() if PCI IDE was not found
[MIPS] Add #include <linux/profile.h> to arch/mips/kernel/time.c
[MIPS] N32 needs to use compat_sys_futimesat
[MIPS] rtlx: Fix build error.
[MIPS] rtlx: fix int vs. long bug.
A guest context switch to an uncached cr3 can require allocation of
shadow pages, but we only recycle shadow pages in kvm_mmu_page_fault().
Move shadow page recycling to mmu_topup_memory_caches(), which is called
from both the page fault handler and from guest cr3 reload.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit aaf68cfbf2 added a bias
to sk_inuse, so this test for an unused socket now fails. So no
sockets get closed because they are old (they might get closed
if the client closed them).
This bug has existed since 2.6.21-rc1.
Thanks to Wolfgang Walter for finding and reporting the bug.
Cc: Wolfgang Walter <wolfgang.walter@studentenwerk.mhn.de>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 02a5e0acb3 ("BLOCK: Hide the
contents of linux/bio.h if CONFIG_BLOCK=n") broke the kernel build for
the CONFIG_COMPAT && !CONFIG_BLOCK case:
CC fs/compat_ioctl.o
In file included from include/linux/raid/md_k.h:19,
from include/linux/raid/md.h:54,
from fs/compat_ioctl.c:25:
include/linux/raid/../../../drivers/md/dm-bio-list.h: In bio_list_:
include/linux/raid/../../../drivers/md/dm-bio-list.h:40: error: dereferencing pointer to incomplete type
include/linux/raid/../../../drivers/md/dm-bio-list.h: In bio_list_:
include/linux/raid/../../../drivers/md/dm-bio-list.h:48: error: dereferencing pointer to incomplete type
include/linux/raid/../../../drivers/md/dm-bio-list.h:51: error: dereferencing pointer to incomplete type
include/linux/raid/../../../drivers/md/dm-bio-list.h: In bio_list_:
include/linux/raid/../../../drivers/md/dm-bio-list.h:64: error: dereferencing pointer to incomplete type
include/linux/raid/../../../drivers/md/dm-bio-list.h: In bio_list_merge_:
include/linux/raid/../../../drivers/md/dm-bio-list.h:78: error: dereferencing pointer to incomplete type
include/linux/raid/../../../drivers/md/dm-bio-list.h: In bio_list_:
include/linux/raid/../../../drivers/md/dm-bio-list.h:90: error: dereferencing pointer to incomplete type
include/linux/raid/../../../drivers/md/dm-bio-list.h:94: error: dereferencing pointer to incomplete type
make[1]: *** [fs/compat_ioctl.o] Error 1
make: *** [fs] Error 2
Signed-off-by: Andreas Herrmann <aherrman@arcor.de>
Acked-By: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Should add some comments for the tag barriers (they won't be so important
if we can switch over to the explicit _lock bitops, but for now we should
make it clear).
Jens' original patch said a barrier after the test_and_clear_bit was also
required. I can't see why (and it would prevent the use of the _lock bitop).
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
--
A crash upon booting that is caused by bcm43xx has been reported [1] and
found to be due to a work queue being reinitialized while work on that
queue is still pending. This fix modifies the shutdown of work queues and
prevents periodic work from being requeued during shutdown. With this patch,
no more crashes on reboot were observed by the original reporter. I do not
get that particular failure on my system; however, when running a large
number of ifdown/ifup sequences, my system would kernel panic with the
'caps lock' light blinking at roughly a 1 Hz rate. In addition, there were
infrequent failures in the firmware that resulted in 'IRQ READY TIMEOUT'
errors. With this patch, no more of the first type of failure occur, and
incidence of the second type is greatly reduced.
[1] http://bugzilla.kernel.org/show_bug.cgi?id=8937
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Acked-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Another issue with 20Kc's WAIT, waiting for more details. With the
2.6.23 release immindent simply disable the use of WAIT instead of a
more fancy workaround.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Fairly cosmetic as it would only affect VSMP / SMTC kernels that don't
use vectored interrupts.
Found by Beth.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Revert b543858209 and add
no_pci_devices() check to avoid crash due to early calling of
pci_get_class().
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
It refer to CPU_PROFILING.
arch/mips/kernel/time.c: In function 'local_timer_interrupt':
arch/mips/kernel/time.c:142: error: implicit declaration of function 'profile_tick'
arch/mips/kernel/time.c:142: error: 'CPU_PROFILING' undeclared (first use in this function)
arch/mips/kernel/time.c:142: error: (Each undeclared identifier is reported only once
arch/mips/kernel/time.c:142: error: for each function it appears in.)
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
CC arch/mips/kernel/rtlx.o
cc1: warnings being treated as errors
arch/mips/kernel/rtlx.c:59: warning: 'irq' defined but not used
arch/mips/kernel/rtlx.c:60: warning: 'irq_num' defined but not used
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
CC arch/mips/kernel/rtlx.o
arch/mips/kernel/rtlx.c: In function 'rtlx_init':
arch/mips/kernel/rtlx.c:114: warning: format '%x' expects type 'unsigned int', but argument 3 has type 'long unsigned int'
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>