Commit Graph

241050 Commits

Author SHA1 Message Date
Linus Torvalds
f23eb2b2b2 tty: stop using "delayed_work" in the tty layer
Using delayed-work for tty flip buffers ends up causing us to wait for
the next tick to complete some actions.  That's usually not all that
noticeable, but for certain latency-critical workloads it ends up being
totally unacceptable.

As an extreme case of this, passing a token back-and-forth over a pty
will take two ticks per iteration, so even just a thousand iterations
will take 8 seconds assuming a common 250Hz configuration.

Avoiding the whole delayed work issue brings that ping-pong test-case
down to 0.009s on my machine.

In more practical terms, this latency has been a performance problem for
things like dive computer simulators (simulating the serial interface
using the ptys) and for other environments (Alan mentions a CP/M emulator).

Reported-by: Jef Driesen <jefdriesen@telenet.be>
Acked-by: Greg KH <gregkh@suse.de>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-22 16:17:32 -07:00
Venkateswararao Jujjuri (JV)
68da9ba4ee [net/9p]: Introduce basic flow-control for VirtIO transport.
Recent zerocopy work in the 9P VirtIO transport maps and pins
user buffers into kernel memory for the server to work on them.
Since the user process can initiate this kind of pinning with a simple
read/write call, thousands of IO threads initiated by the user process can
hog the system resources and could result into denial of service.

This patch introduces flow control to avoid that extreme scenario.

The ceiling limit to avoid denial of service attacks is set to relatively
high (nr_free_pagecache_pages()/4) so that it won't interfere with
regular usage, but can step in extreme cases to limit the total system
hang. Since we don't have a global structure to accommodate this variable,
I choose the virtio_chan as the home for this.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Reviewed-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 16:32:50 -05:00
M. Mohan Kumar
aaf0ef1d2b 9p: use the updated offset given by generic_write_checks
Without this fix, even if a file is opened in O_APPEND mode, data will be
written at current file position instead of end of file.

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 16:32:49 -05:00
Venkateswararao Jujjuri (JV)
316ad5501c [net/9p] Don't re-pin pages on retrying virtqueue_add_buf().
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 16:32:48 -05:00
Venkateswararao Jujjuri (JV)
a01a984035 [net/9p] Set the condition just before waking up.
Given that the sprious wake-ups are common, we need to move the
condition setting right next to the wake_up().  After setting the condition
to req->status = REQ_STATUS_RCVD, sprious wakeups may cause the
virtqueue back on the free list for someone else to use.
This may result in kernel panic while relasing the pinned pages
in p9_release_req_pages().

Also rearranged the while loop in req_done() for better redability.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 16:32:47 -05:00
Venkateswararao Jujjuri (JV)
53bda3e5b4 [net/9p] unconditional wake_up to proc waiting for space on VirtIO ring
Process may wait to get space on VirtIO ring to send a transaction to
VirtFS server. Current code just does a conditional wake_up() which
means only one process will be woken up even if multiple processes
are waiting.

This fix makes the wake_up unconditional. Hence we won't have any
processes waiting for-ever.

Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 16:32:19 -05:00
Aneesh Kumar K.V
42869c8ada fs/9p: Add v9fs_dentry2v9ses
Add the new static inline and use the same

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 15:43:36 -05:00
Aneesh Kumar K.V
7add697a3d fs/9p: Attach writeback_fid on first open with WR flag
We don't need writeback fid if we are only doing O_RDONLY open

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 15:43:36 -05:00
Aneesh Kumar K.V
ea59bb759b fs/9p: Open writeback fid in O_SYNC mode
Older version of protocol don't support tsyncfs operation.
So for them force a O_SYNC flag on the server

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 15:43:36 -05:00
Aneesh Kumar K.V
059c138bc7 fs/9p: Use truncate_setsize instead of vmtruncate
convert vmtruncate usage to truncate_setsize. We also writeback
all dirty pages before doing 9p operations and on success call truncate_setsize.
This ensure that we continue sanely on failed truncate on the server. The
disadvantage is that we are now going to write back the content that get
thrown away later as a part of truncate.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 15:43:35 -05:00
Aneesh Kumar K.V
472e7f9f8b net/9p: Fix compile warning
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 15:43:35 -05:00
Aneesh Kumar K.V
eeff66ef6e net/9p: Convert the in the 9p rpc call path to GFP_NOFS
Without this we can cause reclaim allocation in writepage.

[ 3433.448430] =================================
[ 3433.449117] [ INFO: inconsistent lock state ]
[ 3433.449117] 2.6.38-rc5+ #84
[ 3433.449117] ---------------------------------
[ 3433.449117] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage.
[ 3433.449117] kswapd0/505 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 3433.449117]  (iprune_sem){+++++-}, at: [<ffffffff810ebbab>] shrink_icache_memory+0x45/0x2b1
[ 3433.449117] {RECLAIM_FS-ON-W} state was registered at:
[ 3433.449117]   [<ffffffff8107fe5f>] mark_held_locks+0x52/0x70
[ 3433.449117]   [<ffffffff8107ff02>] lockdep_trace_alloc+0x85/0x9f
[ 3433.449117]   [<ffffffff810d353d>] slab_pre_alloc_hook+0x18/0x3c
[ 3433.449117]   [<ffffffff810d3fd5>] kmem_cache_alloc+0x23/0xa2
[ 3433.449117]   [<ffffffff8127be77>] idr_pre_get+0x2d/0x6f
[ 3433.449117]   [<ffffffff815434eb>] p9_idpool_get+0x30/0xae
[ 3433.449117]   [<ffffffff81540123>] p9_client_rpc+0xd7/0x9b0
[ 3433.449117]   [<ffffffff815427b0>] p9_client_clunk+0x88/0xdb
[ 3433.449117]   [<ffffffff811d56e5>] v9fs_evict_inode+0x3c/0x48
[ 3433.449117]   [<ffffffff810eb511>] evict+0x1f/0x87
[ 3433.449117]   [<ffffffff810eb5c0>] dispose_list+0x47/0xe3
[ 3433.449117]   [<ffffffff810eb8da>] evict_inodes+0x138/0x14f
[ 3433.449117]   [<ffffffff810d90e2>] generic_shutdown_super+0x57/0xe8
[ 3433.449117]   [<ffffffff810d91e8>] kill_anon_super+0x11/0x50
[ 3433.449117]   [<ffffffff811d4951>] v9fs_kill_super+0x49/0xab
[ 3433.449117]   [<ffffffff810d926e>] deactivate_locked_super+0x21/0x46
[ 3433.449117]   [<ffffffff810d9e84>] deactivate_super+0x40/0x44
[ 3433.449117]   [<ffffffff810ef848>] mntput_no_expire+0x100/0x109
[ 3433.449117]   [<ffffffff810f0aeb>] sys_umount+0x2f1/0x31c
[ 3433.449117]   [<ffffffff8102c87b>] system_call_fastpath+0x16/0x1b
[ 3433.449117] irq event stamp: 192941
[ 3433.449117] hardirqs last  enabled at (192941): [<ffffffff81568dcf>] _raw_spin_unlock_irq+0x2b/0x30
[ 3433.449117] hardirqs last disabled at (192940): [<ffffffff810b5f97>] shrink_inactive_list+0x290/0x2f5
[ 3433.449117] softirqs last  enabled at (188470): [<ffffffff8105fd65>] __do_softirq+0x133/0x152
[ 3433.449117] softirqs last disabled at (188455): [<ffffffff8102d7cc>] call_softirq+0x1c/0x28
[ 3433.449117]
[ 3433.449117] other info that might help us debug this:
[ 3433.449117] 1 lock held by kswapd0/505:
[ 3433.449117]  #0:  (shrinker_rwsem){++++..}, at: [<ffffffff810b52e2>] shrink_slab+0x38/0x15f
[ 3433.449117]
[ 3433.449117] stack backtrace:
[ 3433.449117] Pid: 505, comm: kswapd0 Not tainted 2.6.38-rc5+ #84
[ 3433.449117] Call Trace:
[ 3433.449117]  [<ffffffff8107fbce>] ? valid_state+0x17e/0x191
[ 3433.449117]  [<ffffffff81036896>] ? save_stack_trace+0x28/0x45
[ 3433.449117]  [<ffffffff81080426>] ? check_usage_forwards+0x0/0x87
[ 3433.449117]  [<ffffffff8107fcf4>] ? mark_lock+0x113/0x22c
[ 3433.449117]  [<ffffffff8108105f>] ? __lock_acquire+0x37a/0xcf7
[ 3433.449117]  [<ffffffff8107fc0e>] ? mark_lock+0x2d/0x22c
[ 3433.449117]  [<ffffffff81081077>] ? __lock_acquire+0x392/0xcf7
[ 3433.449117]  [<ffffffff810b14d2>] ? determine_dirtyable_memory+0x15/0x28
[ 3433.449117]  [<ffffffff81081a33>] ? lock_acquire+0x57/0x6d
[ 3433.449117]  [<ffffffff810ebbab>] ? shrink_icache_memory+0x45/0x2b1
[ 3433.449117]  [<ffffffff81567d85>] ? down_read+0x47/0x5c
[ 3433.449117]  [<ffffffff810ebbab>] ? shrink_icache_memory+0x45/0x2b1
[ 3433.449117]  [<ffffffff810ebbab>] ? shrink_icache_memory+0x45/0x2b1
[ 3433.449117]  [<ffffffff810b5385>] ? shrink_slab+0xdb/0x15f
[ 3433.449117]  [<ffffffff810b69bc>] ? kswapd+0x574/0x96a
[ 3433.449117]  [<ffffffff810b6448>] ? kswapd+0x0/0x96a
[ 3433.449117]  [<ffffffff810714e2>] ? kthread+0x7d/0x85
[ 3433.449117]  [<ffffffff8102d6d4>] ? kernel_thread_helper+0x4/0x10
[ 3433.449117]  [<ffffffff81569200>] ? restore_args+0x0/0x30
[ 3433.449117]  [<ffffffff81071465>] ? kthread+0x0/0x85
[ 3433.449117]  [<ffffffff8102d6d0>] ? kernel_thread_helper+0x0/0x10

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 15:43:35 -05:00
Aneesh Kumar K.V
5a7e0a8cf5 fs/9p: Fix race in initializing writeback fid
When two process open the same file we can end up with both of them
allocating the writeback_fid. Add a new mutex which can be used
for synchronizing v9fs_inode member values.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2011-03-22 15:43:35 -05:00
Christoph Lameter
4fdccdfbb4 slub: Add statistics for this_cmpxchg_double failures
Add some statistics for debugging.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-03-22 20:48:04 +02:00
Christoph Lameter
2fd66c517d slub: Add missing irq restore for the OOM path
OOM path is missing the irq restore in the CONFIG_CMPXCHG_LOCAL case.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-03-22 20:48:04 +02:00
Yehuda Sadeh
59c2be1e4d rbd: use watch/notify for changes in rbd header
Send notifications when we change the rbd header (e.g. create a snapshot)
and wait for such notifications.  This allows synchronizing the snapshot
creation between different rbd clients/rools.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-22 11:33:56 -07:00
Yehuda Sadeh
a40c4f10e3 libceph: add lingering request and watch/notify event framework
Lingering requests are requests that are sent to the OSD normally but
tracked also after we get a successful request.  This keeps the OSD
connection open and resends the original request if the object moves to
another OSD.  The OSD can then send notification messages back to us
if another client initiates a notify.

This framework will be used by RBD so that the client gets notification
when a snapshot is created by another node or tool.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-22 11:33:55 -07:00
Linus Torvalds
f741a79e98 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: make fuse_dentry_revalidate() RCU aware
  fuse: make fuse_permission() RCU aware
  fuse: wakeup pollers on connection release/abort
  fuse: reduce size of struct fuse_request
2011-03-22 10:42:43 -07:00
Linus Torvalds
73d5a8675f Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  xen: update mask_rw_pte after kernel page tables init changes
  xen: set max_pfn_mapped to the last pfn mapped
  x86: Cleanup highmap after brk is concluded

Fix up trivial onflict (added header file includes) in
arch/x86/mm/init_64.c
2011-03-22 10:41:36 -07:00
Linus Torvalds
e77277dfe2 Merge branch 'next-samsung' of git://git.fluff.org/bjdooks/linux
* 'next-samsung' of git://git.fluff.org/bjdooks/linux:
  ARM: H1940/RX1950: Change default LED triggers
  ARM: S3C2442: RX1950: Add support for LED blinking
  ARM: S3C2442: RX1950: Retain LEDs state in suspend
  ARM: S3C2410: H1940: Fix lcd_power_set function
  ARM: S3C2410: H1940: Add battery support
  ARM: S3C2410: H1940: Use leds-gpio driver for LEDs managing
  ARM: S3C2410: H1940: Make h1940-bluetooth.c compile again
  ARM: S3C2410: H1940: Add keys device
2011-03-22 10:06:54 -07:00
Linus Torvalds
75ea6358bc Merge branch 'for-linus/2639/i2c-2' of git://git.fluff.org/bjdooks/linux
* 'for-linus/2639/i2c-2' of git://git.fluff.org/bjdooks/linux:
  i2c-pxa2xx: Don't clear isr bits too early
  i2c-pxa2xx: Fix register offsets
  i2c-pxa2xx: pass of_node from platform driver to adapter and publish
  i2c-pxa2xx: check timeout correctly
  i2c-pxa2xx: add support for shared IRQ handler
  i2c-pxa2xx: Add PCI support for PXA I2C controller
  ARM: pxa2xx: reorganize I2C files
  i2c-pxa2xx: use dynamic register layout
  i2c-mxs: set controller to pio queue mode after reset
  i2c-eg20t: support new device OKI SEMICONDUCTOR ML7213 IOH
  i2c/busses: Add support for Diolan U2C-12 USB-I2C adapter
2011-03-22 10:05:27 -07:00
Linus Torvalds
14577beb82 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
  slub: Dont define useless label in the !CONFIG_CMPXCHG_LOCAL case
  slab,rcu: don't assume the size of struct rcu_head
  slub,rcu: don't assume the size of struct rcu_head
  slub: automatically reserve bytes at the end of slab
  Lockless (and preemptless) fastpaths for slub
  slub: Get rid of slab_free_hook_irq()
  slub: min_partial needs to be in first cacheline
  slub: fix ksize() build error
  slub: fix kmemcheck calls to match ksize() hints
  Revert "slab: Fix missing DEBUG_SLAB last user"
  mm: Remove support for kmem_cache_name()
2011-03-22 09:36:23 -07:00
Martin K. Petersen
09b9cc44c9 sd: Fail discard requests when logical block provisioning has been disabled
Ensure that we kill discard requests after logical block provisioning
has been disabled in sysfs.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-22 09:35:53 -07:00
Linus Torvalds
c62b389863 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits)
  IPVS: Use global mutex in ip_vs_app.c
  ipvs: fix a typo in __ip_vs_control_init()
  veth: Fix the byte counters
  net ipv6: Fix duplicate /proc/sys/net/ipv6/neigh directory entries.
  macvlan: Fix use after free of struct macvlan_port.
  net: fix incorrect spelling in drop monitor protocol
  can: c_can: Do basic c_can configuration _before_ enabling the interrupts
  net/appletalk: fix atalk_release use after free
  ipx: fix ipx_release()
  snmp: SNMP_UPD_PO_STATS_BH() always called from softirq
  l2tp: fix possible oops on l2tp_eth module unload
  xfrm: Fix initialize repl field of struct xfrm_state
  netfilter: ipt_CLUSTERIP: fix buffer overflow
  netfilter: xtables: fix reentrancy
  netfilter: ipset: fix checking the type revision at create command
  netfilter: ipset: fix address ranges at hash:*port* types
  niu: Rename NIU parent platform device name to fix conflict.
  r8169: fix a bug in rtl8169_init_phy()
  bonding: fix a typo in a comment
  ftmac100: use resource_size()
  ...
2011-03-22 09:25:34 -07:00
Simon Horman
736561a01f IPVS: Use global mutex in ip_vs_app.c
As part of the work to make IPVS network namespace aware
__ip_vs_app_mutex was replaced by a per-namespace lock,
ipvs->app_mutex. ipvs->app_key is also supplied for debugging purposes.

Unfortunately this implementation results in ipvs->app_key residing
in non-static storage which at the very least causes a lockdep warning.

This patch takes the rather heavy-handed approach of reinstating
__ip_vs_app_mutex which will cover access to the ipvs->list_head
of all network namespaces.

[   12.610000] IPVS: Creating netns size=2456 id=0
[   12.630000] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
[   12.640000] BUG: key ffff880003bbf1a0 not in .data!
[   12.640000] ------------[ cut here ]------------
[   12.640000] WARNING: at kernel/lockdep.c:2701 lockdep_init_map+0x37b/0x570()
[   12.640000] Hardware name: Bochs
[   12.640000] Pid: 1, comm: swapper Tainted: G        W 2.6.38-kexec-06330-g69b7efe-dirty #122
[   12.650000] Call Trace:
[   12.650000]  [<ffffffff8102e685>] warn_slowpath_common+0x75/0xb0
[   12.650000]  [<ffffffff8102e6d5>] warn_slowpath_null+0x15/0x20
[   12.650000]  [<ffffffff8105967b>] lockdep_init_map+0x37b/0x570
[   12.650000]  [<ffffffff8105829d>] ? trace_hardirqs_on+0xd/0x10
[   12.650000]  [<ffffffff81055ad8>] debug_mutex_init+0x38/0x50
[   12.650000]  [<ffffffff8104bc4c>] __mutex_init+0x5c/0x70
[   12.650000]  [<ffffffff81685ee7>] __ip_vs_app_init+0x64/0x86
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff811b1c33>] T.620+0x43/0x170
[   12.660000]  [<ffffffff811b1e9a>] ? register_pernet_subsys+0x1a/0x40
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.660000]  [<ffffffff811b1db7>] register_pernet_operations+0x57/0xb0
[   12.660000]  [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff
[   12.670000]  [<ffffffff811b1ea9>] register_pernet_subsys+0x29/0x40
[   12.670000]  [<ffffffff81685f19>] ip_vs_app_init+0x10/0x12
[   12.670000]  [<ffffffff81685a87>] ip_vs_init+0x4c/0xff
[   12.670000]  [<ffffffff8166562c>] do_one_initcall+0x7a/0x12e
[   12.670000]  [<ffffffff8166583e>] kernel_init+0x13e/0x1c2
[   12.670000]  [<ffffffff8128c134>] kernel_thread_helper+0x4/0x10
[   12.670000]  [<ffffffff8128ad40>] ? restore_args+0x0/0x30
[   12.680000]  [<ffffffff81665700>] ? kernel_init+0x0/0x1c2
[   12.680000]  [<ffffffff8128c130>] ? kernel_thread_helper+0x0/0x1global0

Signed-off-by: Simon Horman <horms@verge.net.au>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 20:39:24 -07:00
Eric Dumazet
f40f94fc6c ipvs: fix a typo in __ip_vs_control_init()
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 20:39:24 -07:00
Eric W. Biederman
675071a2ef veth: Fix the byte counters
Commit 44540960 "veth: move loopback logic to common location" introduced
a bug in the packet counters.  I don't understand why that happened as it
is not explained in the comments and the mut check in dev_forward_skb
retains the assumption that skb->len is the total length of the packet.

I just measured this emperically by setting up a veth pair between two
noop network namespaces setting and attempting a telnet connection between
the two.  I saw three packets in each direction and the byte counters were
exactly 14*3 = 42 bytes high in each direction.  I got the actual
packet lengths with tcpdump.

So remove the extra ETH_HLEN from the veth byte count totals.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:24:53 -07:00
Eric W. Biederman
9d2a8fa96a net ipv6: Fix duplicate /proc/sys/net/ipv6/neigh directory entries.
When I was fixing issues with unregisgtering tables under /proc/sys/net/ipv6/neigh
by adding a mount point it appears I missed a critical ordering issue, in the
ipv6 initialization.  I had not realized that ipv6_sysctl_register is called
at the very end of the ipv6 initialization and in particular after we call
neigh_sysctl_register from ndisc_init.

"neigh" needs to be initialized in ipv6_static_sysctl_register which is
the first ipv6 table to initialized, and definitely before ndisc_init.
This removes the weirdness of duplicate tables while still providing a
"neigh" mount point which prevents races in sysctl unregistering.

This was initially reported at https://bugzilla.kernel.org/show_bug.cgi?id=31232
Reported-by: sunkan@zappa.cx
Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:23:34 -07:00
Eric W. Biederman
d5cd92448f macvlan: Fix use after free of struct macvlan_port.
When the macvlan driver was extended to call unregisgter_netdevice_queue
in 23289a37e2, a use after free of struct
macvlan_port was introduced.  The code in dellink relied on unregister_netdevice
actually unregistering the net device so it would be safe to free macvlan_port.

Since unregister_netdevice_queue can just queue up the unregister instead of
performing the unregiser immediately we free the macvlan_port too soon and
then the code in macvlan_stop removes the macaddress for the set of macaddress
to listen for and uses memory that has already been freed.

To fix this add a reference count to track when it is safe to free the macvlan_port
and move the call of macvlan_port_destroy into macvlan_uninit which is guaranteed
to be called after the final macvlan_port_close.

Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:22:22 -07:00
Neil Horman
ac0a121d79 net: fix incorrect spelling in drop monitor protocol
It was pointed out to me recently that my spelling could be better :)

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:20:26 -07:00
Jan Altenberg
4f2d56c45f can: c_can: Do basic c_can configuration _before_ enabling the interrupts
I ran into some trouble while testing the SocketCAN driver for the BOSCH
C_CAN controller. The interface is not correctly initialized, if I put
some CAN traffic on the line, _while_ the interface is being started
(which means: the interface doesn't come up correcty, if there's some RX
traffic while doing 'ifconfig can0 up').

The current implementation enables the controller interrupts _before_
doing the basic c_can configuration. I think, this should be done the
other way round.

The patch below fixes things for me.

Signed-off-by: Jan Altenberg <jan@linutronix.de>
Acked-by: Kurt Van Dijck <kurt.van.dijck@eia.be>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:19:26 -07:00
Arnd Bergmann
b20e7bbfc7 net/appletalk: fix atalk_release use after free
The BKL removal in appletalk introduced a use-after-free problem,
where atalk_destroy_socket frees a sock, but we still release
the socket lock on it.

An easy fix is to take an extra reference on the sock and sock_put
it when returning from atalk_release.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:18:00 -07:00
Eric Dumazet
674f211599 ipx: fix ipx_release()
Commit b0d0d915d1 (remove the BKL) added a regression, because
sock_put() can free memory while we are going to use it later.

Fix is to delay sock_put() _after_ release_sock().

Reported-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:16:39 -07:00
Eric Dumazet
20246a8003 snmp: SNMP_UPD_PO_STATS_BH() always called from softirq
We dont need to test if we run from softirq context, we definitely are.

This saves few instructions in ip_rcv() & ip_rcv_finish()

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:12:54 -07:00
James Chapman
8aa525a934 l2tp: fix possible oops on l2tp_eth module unload
A struct used in the l2tp_eth driver for registering network namespace
ops was incorrectly marked as __net_initdata, leading to oops when
module unloaded.

BUG: unable to handle kernel paging request at ffffffffa00ec098
IP: [<ffffffff8123dbd8>] ops_exit_list+0x7/0x4b
PGD 142d067 PUD 1431063 PMD 195da8067 PTE 0
Oops: 0000 [#1] SMP 
last sysfs file: /sys/module/l2tp_eth/refcnt
Call Trace:
 [<ffffffff8123dc94>] ? unregister_pernet_operations+0x32/0x93
 [<ffffffff8123dd20>] ? unregister_pernet_device+0x2b/0x38
 [<ffffffff81068b6e>] ? sys_delete_module+0x1b8/0x222
 [<ffffffff810c7300>] ? do_munmap+0x254/0x318
 [<ffffffff812c64e5>] ? page_fault+0x25/0x30
 [<ffffffff812c6952>] ? system_call_fastpath+0x16/0x1b

Signed-off-by: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:10:25 -07:00
Wei Yongjun
a454f0ccef xfrm: Fix initialize repl field of struct xfrm_state
Commit 'xfrm: Move IPsec replay detection functions to a separate file'
  (9fdc4883d9)
introduce repl field to struct xfrm_state, and only initialize it
under SA's netlink create path, the other path, such as pf_key,
ipcomp/ipcomp6 etc, the repl field remaining uninitialize. So if
the SA is created by pf_key, any input packet with SA's encryption
algorithm will cause panic.

    int xfrm_input()
    {
        ...
        x->repl->advance(x, seq);
        ...
    }

This patch fixed it by introduce new function __xfrm_init_state().

Pid: 0, comm: swapper Not tainted 2.6.38-next+ #14 Bochs Bochs
EIP: 0060:[<c078e5d5>] EFLAGS: 00010206 CPU: 0
EIP is at xfrm_input+0x31c/0x4cc
EAX: dd839c00 EBX: 00000084 ECX: 00000000 EDX: 01000000
ESI: dd839c00 EDI: de3a0780 EBP: dec1de88 ESP: dec1de64
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process swapper (pid: 0, ti=dec1c000 task=c09c0f20 task.ti=c0992000)
Stack:
 00000000 00000000 00000002 c0ba27c0 00100000 01000000 de3a0798 c0ba27c0
 00000033 dec1de98 c0786848 00000000 de3a0780 dec1dea4 c0786868 00000000
 dec1debc c074ee56 e1da6b8c de3a0780 c074ed44 de3a07a8 dec1decc c074ef32
Call Trace:
 [<c0786848>] xfrm4_rcv_encap+0x22/0x27
 [<c0786868>] xfrm4_rcv+0x1b/0x1d
 [<c074ee56>] ip_local_deliver_finish+0x112/0x1b1
 [<c074ed44>] ? ip_local_deliver_finish+0x0/0x1b1
 [<c074ef32>] NF_HOOK.clone.1+0x3d/0x44
 [<c074ef77>] ip_local_deliver+0x3e/0x44
 [<c074ed44>] ? ip_local_deliver_finish+0x0/0x1b1
 [<c074ec03>] ip_rcv_finish+0x30a/0x332
 [<c074e8f9>] ? ip_rcv_finish+0x0/0x332
 [<c074ef32>] NF_HOOK.clone.1+0x3d/0x44
 [<c074f188>] ip_rcv+0x20b/0x247
 [<c074e8f9>] ? ip_rcv_finish+0x0/0x332
 [<c072797d>] __netif_receive_skb+0x373/0x399
 [<c0727bc1>] netif_receive_skb+0x4b/0x51
 [<e0817e2a>] cp_rx_poll+0x210/0x2c4 [8139cp]
 [<c072818f>] net_rx_action+0x9a/0x17d
 [<c0445b5c>] __do_softirq+0xa1/0x149
 [<c0445abb>] ? __do_softirq+0x0/0x149

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-03-21 18:08:28 -07:00
Vasily Khoruzhick
9edb240696 ARM: H1940/RX1950: Change default LED triggers
Change LED triggers to mimic WinMobile behavior:
red blinking when battery is charging,
orange solid when battery is charged.

Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2011-03-21 23:29:43 +00:00
Vasily Khoruzhick
97491ba3f6 i2c-pxa2xx: Don't clear isr bits too early
isr is passed later into i2c_pxa_irq_txempty and
i2c_pxa_irq_rxfull and they may use some other bits
than irq sources.

Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
2011-03-21 23:00:12 +00:00
Ben Dooks
a0774f4511 Merge branches 'for-2639/i2c/i2c-ce4100-v6', 'for-2639/i2c/i2c-eg20t-v3' and 'for-2639/i2c/i2c-imx' into for-linus/2639/i2c-2 2011-03-21 22:57:25 +00:00
Linus Torvalds
eddecbb601 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  kbuild: Make DEBUG_SECTION_MISMATCH selectable, but not on by default
  genksyms: Regenerate lexer and parser
  genksyms: Track changes to enum constants
  genksyms: simplify usage of find_symbol()
  genksyms: Add helpers for building string lists
  genksyms: Simplify printing of symbol types
  genksyms: Simplify lexer
  genksyms: Do not paste the bison header file to lex.c
  modpost: fix trailing comma
  KBuild: silence "'scripts/unifdef' is up to date."
  kbuild: Add extra gcc checks
  kbuild: reenable section mismatch analysis
  unifdef: update to upstream version 2.5
2011-03-21 15:55:26 -07:00
Jesper Juhl
0bf8c86970 Reduce sequential pointer derefs in scsi_error.c and reduce size as well
This patch reduces the number of sequential pointer derefs in
drivers/scsi/scsi_error.c

This has been submitted a number of times over a couple of years.  I
believe this version adresses all comments it has gathered over time.
Please apply or reject with a reason.

The benefits are:

 - makes the code easier to read.  Lots of sequential derefs of the same
   pointers is not easy on the eye.

 - theoretically at least, just dereferencing the pointers once can
   allow the compiler to generally slightly faster code, so in theory
   this could also be a micro speed optimization.

 - reduces size of object file (tiny effect: on x86-64, in at least one
   configuration, the text size decreased from 9439 bytes to 9400)

 - removes some pointless (mostly trailing) whitespace.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-21 15:54:35 -07:00
Gary Hade
38f7aa23c4 matroxfb: remove incorrect Matrox G200eV support
Remove incorrect Matrox G200eV support that was previously added by
commit e3a1938805

A serious issue with the incorrect G200eV support that reproduces on the
Matrox G200eV equipped IBM x3650 M2 is the total lack of text (login
banner, login prompt, etc) on the console when X is not running and
total lack of text on all of the virtual consoles after X is started.

Any concerns that the incorrect code (upstream since October 2008) has
been successfully used on non-IBM G200eV equipped system(s) appear to be
unwarranted.  In addition to the serious/non-intermittent nature of
issues that have been spotted on IBM systems, complete removal of the
incorrect code is clearly supported by the following Matrox (Yannick
Heneault) provided input:
 "It impossible that this patch should have work on a system.
 The patch only declare the G200eV as a regular G200 which is
 not case. Many registers are different, including at least the
 PLL programming sequence. If the G200eV is programmed like a
 regular G200, it will not display anything."

v1 - Initial patch that removed the incorrect code for _all_
     G200eV equipped systems.
v2 - Darrick Wong provided patch that blacklisted the incorrect
     code on G200eV equipped IBM systems leaving it enabled on
     all G200eV equipped non-IBM systems.
v3 - Same code changes included with v1 plus additional
     justification for complete removal of the incorrect code.

Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Cc: Darrick J. Wong <djwong@us.ibm.com>
Cc: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Petr Vandrovec <vandrove@vc.cvut.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Yannick Heneault <yannick_heneault@matrox.com>
Cc: Christian Toutant <ctoutant@matrox.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-21 15:50:30 -07:00
Sage Weil
55b00bae11 rbd: update email address in Documentation
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-21 15:06:50 -07:00
Linus Torvalds
3155fe6df5 Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs: (23 commits)
  xfs: don't name variables "panic"
  xfs: factor agf counter updates into a helper
  xfs: clean up the xfs_alloc_compute_aligned calling convention
  xfs: kill support/debug.[ch]
  xfs: Convert remaining cmn_err() callers to new API
  xfs: convert the quota debug prints to new API
  xfs: rename xfs_cmn_err_fsblock_zero()
  xfs: convert xfs_fs_cmn_err to new error logging API
  xfs: kill xfs_fs_mount_cmn_err() macro
  xfs: kill xfs_fs_repair_cmn_err() macro
  xfs: convert xfs_cmn_err to xfs_alert_tag
  xfs: Convert xlog_warn to new logging interface
  xfs: Convert linux-2.6/ files to new logging interface
  xfs: introduce new logging API.
  xfs: zero proper structure size for geometry calls
  xfs: enable delaylog by default
  xfs: more sensible inode refcounting for ialloc
  xfs: stop using xfs_trans_iget in the RT allocator
  xfs: check if device support discard in xfs_ioc_trim()
  xfs: prevent leaking uninitialized stack memory in FSGEOMETRY_V1
  ...
2011-03-21 14:24:56 -07:00
Julien Tinnes
da48524eb2 Prevent rt_sigqueueinfo and rt_tgsigqueueinfo from spoofing the signal code
Userland should be able to trust the pid and uid of the sender of a
signal if the si_code is SI_TKILL.

Unfortunately, the kernel has historically allowed sigqueueinfo() to
send any si_code at all (as long as it was negative - to distinguish it
from kernel-generated signals like SIGILL etc), so it could spoof a
SI_TKILL with incorrect siginfo values.

Happily, it looks like glibc has always set si_code to the appropriate
SI_QUEUE, so there are probably no actual user code that ever uses
anything but the appropriate SI_QUEUE flag.

So just tighten the check for si_code (we used to allow any negative
value), and add a (one-time) warning in case there are binaries out
there that might depend on using other si_code values.

Signed-off-by: Julien Tinnes <jln@google.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-21 14:23:43 -07:00
Linus Torvalds
b52307ca14 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-ktest
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-ktest:
  ktest: Add STOP_TEST_AFTER to stop the test after a period of time
  ktest: Monitor kernel while running of user tests
  ktest: Fix bug where the test would not end after failure
  ktest: Add BISECT_FILES to run git bisect on paths
  ktest: Add BISECT_SKIP
  ktest: Add manual bisect
  ktest: Handle kernels before make oldnoconfig
  ktest: Start failure timeout on panic too
  ktest: Print logfile name on failure
2011-03-21 14:13:48 -07:00
Linus Torvalds
afd8c40431 Merge branch 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
* 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  hwmon: (ads1015) Make gain and datarate configurable
  hwmon: (ads1015) Drop dynamic attribute group
  hwmon: Add support for Texas Instruments ADS1015
  hwmon: New driver for SMSC SCH5627
  hwmon: (abituguru*) Update my email address
  hwmon: (lm75) Speed up detection
  hwmon: (lm75) Add detection of the National Semiconductor LM75A
  hp_accel: Fix driver name
  Move lis3lv02d drivers to drivers/misc
  Move hp_accel to drivers/platform/x86
  Let Kconfig handle lis3lv02d dependencies
  hwmon: (sht15) Fix integer overflow in humidity calculation
  hwmon: (sht15) Spelling fix
  hwmon: (w83795) Document pin mapping
2011-03-21 14:02:55 -07:00
Luck, Tony
366f7e7a79 pstore: use mount option instead sysfs to tweak kmsg_bytes
/sys/fs is a somewhat strange way to tweak what could more
obviously be tuned with a mount option.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-21 13:50:05 -07:00
Sage Weil
147851d2dc ceph: rename dentry_release -> d_release, fix comment
Just for consistency's sake.  Fix obsolete comment too.

Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-21 12:24:26 -07:00
Henry C Chang
49bcb93236 ceph: add request to the tail of unsafe write list
In sync_write_wait(), we assume that the newest request is at the
tail of unsafe write list. We should maintain the semantics here.

Signed-off-by: Henry C Chang <henry_c_chang@tcloudcomputing.com>
Signed-off-by: Sage Weil <sage@newdream.net>
2011-03-21 12:24:25 -07:00