Commit Graph

37800 Commits

Author SHA1 Message Date
Mauro Carvalho Chehab
945cdfa2c9 V4L/DVB: ir: use a real device instead of a virtual class
Change the ir-sysfs approach to create irrcv0 as a device, instead of
using class_dev. Also, change the way input is registered, in order
to make its parent to be the irrcv device.

Due to this change, now the event device is created under
/sys/class/ir/irrcv class:

/sys/class/irrcv/irrcv0/
|-- current_protocol
|-- device -> ../../../1-3
|-- input9
|   |-- capabilities
|   |   |-- abs
...

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18 00:47:03 -03:00
Mauro Carvalho Chehab
4f9fb5ed02 V4L/DVB: soc-camera: add runtime pm support for subdevices
To save power soc-camera powers subdevices down, when they are not in use,
if this is supported by the platform. However, the V4L standard dictates,
that video nodes shall preserve configuration between uses. This requires
runtime power management, which is implemented by this patch. It allows
subdevice drivers to specify their runtime power-management methods, by
assigning a type to the video device.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18 00:46:09 -03:00
Márton Németh
e26b314491 V4L/DVB: The first two parameters of soc_camera_limit_side() are usually pointers to struct v4l2_rect elements. They are signed, so adjust the prototype accordingly
This will remove the following sparse warning (see "make C=1"):

 * incorrect type in argument 1 (different signedness)
       expected unsigned int *start
       got signed int *<noident>

as well as a couple more signedness mismatches.

Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18 00:44:09 -03:00
Mauro Carvalho Chehab
9701dc94a1 V4L/DVB (12770): Add tm6000 driver to staging tree
Adds a driver for Trident TV Master tm5600/tm6000 chips.

Those USB devices are usually found with a Xceive xc2028/xc3028
tuner, although the firmware seems to be modified to work with
those chips on some older devices.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2010-05-18 00:39:25 -03:00
Tejun Heo
9f2f72107f libata-sff: reorder SFF/BMDMA functions
Reorder functions such that SFF and BMDMA functions are grouped.
While at it, s/BMDMA/SFF in a few comments where it actually meant
SFF.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2010-05-17 22:49:07 -04:00
Tejun Heo
3e4ec3443f libata: kill ATA_FLAG_DISABLED
ATA_FLAG_DISABLED is only used by drivers which don't use
->error_handler framework and is largely broken.  Its only meaningful
function is to make irq handlers skip processing if the flag is set,
which is largely useless and even harmful as it makes those ports more
likely to cause IRQ storms.

Kill ATA_FLAG_DISABLED and makes the callers disable attached devices
instead.  ata_port_probe() and ata_port_disable() which manipulate the
flag are also killed.

This simplifies condition check in IRQ handlers.  While updating IRQ
handlers, remove ap NULL check as libata guarantees consecutive port
allocation (unoccupied ports are initialized with dummies) and
long-obsolete ATA_QCFLAG_ACTIVE check (checked by ata_qc_from_tag()).

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2010-05-17 22:49:02 -04:00
andrew hendry
37cda78741 X25: Move accept approve flag to bitfield
Moves the x25 accept approve flag from char into bitfield.

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:39:27 -07:00
andrew hendry
b7792e34cb X25: Move interrupt flag to bitfield
Moves the x25 interrupt flag from char into bitfield.

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:39:27 -07:00
andrew hendry
cb863ffd4a X25: Move qbit flag to bitfield
Moves the X25 q bit flag from char into a bitfield to allow BKL cleanup.

Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:39:26 -07:00
Eric Dumazet
407eadd996 net: implements ip_route_input_noref()
ip_route_input() is the version returning a refcounted dst, while
ip_route_input_noref() returns a non refcounted one.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:18:51 -07:00
Eric Dumazet
7fee226ad2 net: add a noref bit on skb dst
Use low order bit of skb->_skb_dst to tell dst is not refcounted.

Change _skb_dst to _skb_refdst to make sure all uses are catched.

skb_dst() returns the dst, regardless of noref bit set or not, but
with a lockdep check to make sure a noref dst is not given if current
user is not rcu protected.

New skb_dst_set_noref() helper to set an notrefcounted dst on a skb.
(with lockdep check)

skb_dst_drop() drops a reference only if skb dst was refcounted.

skb_dst_force() helper is used to force a refcount on dst, when skb
is queued and not anymore RCU protected.

Use skb_dst_force() in __sk_add_backlog(), __dev_xmit_skb() if
!IFF_XMIT_DST_RELEASE or skb enqueued on qdisc queue, in
sock_queue_rcv_skb(), in __nf_queue().

Use skb_dst_force() in dev_requeue_skb().

Note: dst_use_noref() still dirties dst, we might transform it
later to do one dirtying per jiffies.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-17 17:18:50 -07:00
Dan Williams
0b28330e39 Merge branch 'ioat' into dmaengine 2010-05-17 16:30:58 -07:00
Linus Walleij
058276303d DMAENGINE: extend the control command to include an arg
This adds an argument to the DMAengine control function, so that
we can later provide control commands that need some external data
passed in through an argument akin to the ioctl() operation
prototype.

[dan.j.williams@intel.com: fix up some missed conversions]
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-17 16:30:42 -07:00
Philipp Reisner
67c7ddd055 drbd: Four new configuration settings for resync speed control
To reasonably control resync speed over drbd-proxy connections,
drbd has to measure the current delay of packets transmitted over
the (possibly congested) data socket vs the meta-data socket.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:25:00 +02:00
Dan Williams
caa20d974c async_tx: trim dma_async_tx_descriptor in 'no channel switch' case
Saves 24 bytes per descriptor (64-bit) when the channel-switching
capabilities of async_tx are not required.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2010-05-17 16:24:16 -07:00
Philipp Reisner
6495d2c6d0 drbd: Implemented the --assume-clean option for drbdsetup resize
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2010-05-18 01:17:47 +02:00
James Morris
539c99fd7f Merge branch 'next' into for-linus 2010-05-18 08:57:00 +10:00
Linus Torvalds
ba2e1c5f25 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: amiga - Floppy platform device conversion
  m68k: amiga - Sound platform device conversion
  m68k: amiga - Frame buffer platform device conversion
  m68k: amiga - Zorro host bridge platform device conversion
  m68k: amiga - Zorro bus modalias support
  platform: Make platform resource input parameters const
  m68k: invoke oom-killer from page fault
  serial167: Kill unused variables
  m68k: Implement generic_find_next_{zero_,}le_bit()
  m68k: hp300 - Checkpatch cleanup
  m68k: Remove trailing spaces in messages
  m68k: Simplify param.h by using <asm-generic/param.h>
  m68k: Remove BKL from rtc implementations
2010-05-17 13:54:29 -07:00
Geert Uytterhoeven
0d305464ae m68k: amiga - Zorro host bridge platform device conversion
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2010-05-17 21:37:42 +02:00
Geert Uytterhoeven
bf54a2b3c0 m68k: amiga - Zorro bus modalias support
Add Amiga Zorro bus modalias and uevent support

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
2010-05-17 21:37:41 +02:00
Geert Uytterhoeven
0b7f1a7efb platform: Make platform resource input parameters const
Make the platform resource input parameters of platform_device_add_resources()
and platform_device_register_simple() const, as the resources are copied and
never modified.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-17 21:37:40 +02:00
John W. Linville
6fe70aae0d Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2010-05-17 13:57:43 -04:00
Russell King
ac1d426e82 Merge branch 'devel-stable' into devel
Conflicts:
	arch/arm/Kconfig
	arch/arm/include/asm/system.h
	arch/arm/mm/Kconfig
2010-05-17 17:24:04 +01:00
Anton Blanchard
81880d603d atomic_t: Remove volatile from atomic_t definition
When looking at a performance problem on PowerPC, I noticed some awful code
generation:

c00000000051fc98:       3b 60 00 01     li      r27,1
...
c00000000051fca0:       3b 80 00 00     li      r28,0
...
c00000000051fcdc:       93 61 00 70     stw     r27,112(r1)
c00000000051fce0:       93 81 00 74     stw     r28,116(r1)
c00000000051fce4:       81 21 00 70     lwz     r9,112(r1)
c00000000051fce8:       80 01 00 74     lwz     r0,116(r1)
c00000000051fcec:       7d 29 07 b4     extsw   r9,r9
c00000000051fcf0:       7c 00 07 b4     extsw   r0,r0

c00000000051fcf4:       7c 20 04 ac     lwsync
c00000000051fcf8:       7d 60 f8 28     lwarx   r11,0,r31
c00000000051fcfc:       7c 0b 48 00     cmpw    r11,r9
c00000000051fd00:       40 c2 00 10     bne-    c00000000051fd10
c00000000051fd04:       7c 00 f9 2d     stwcx.  r0,0,r31
c00000000051fd08:       40 c2 ff f0     bne+    c00000000051fcf8
c00000000051fd0c:       4c 00 01 2c     isync

We create two constants, write them out to the stack, read them straight back
in and sign extend them. What a mess.

It turns out this bad code is a result of us defining atomic_t as a
volatile int.

We removed the volatile attribute from the powerpc atomic_t definition years
ago, but commit ea43546750 (atomic_t: unify all
arch definitions) added it back in.

To dig up an old quote from Linus:

> The fact is, volatile on data structures is a bug. It's a wart in the C
> language. It shouldn't be used.
>
> Volatile accesses in *code* can be ok, and if we have "atomic_read()"
> expand to a "*(volatile int *)&(x)->value", then I'd be ok with that.
>
> But marking data structures volatile just makes the compiler screw up
> totally, and makes code for initialization sequences etc much worse.

And screw up it does :)

With the volatile removed, we see much more reasonable code generation:

c00000000051f5b8:       3b 60 00 01     li      r27,1
...
c00000000051f5c0:       3b 80 00 00     li      r28,0
...

c00000000051fc7c:       7c 20 04 ac     lwsync
c00000000051fc80:       7c 00 f8 28     lwarx   r0,0,r31
c00000000051fc84:       7c 00 d8 00     cmpw    r0,r27
c00000000051fc88:       40 c2 00 10     bne-    c00000000051fc98
c00000000051fc8c:       7f 80 f9 2d     stwcx.  r28,0,r31
c00000000051fc90:       40 c2 ff f0     bne+    c00000000051fc80
c00000000051fc94:       4c 00 01 2c     isync

Six instructions less.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-17 07:57:27 -07:00
Anton Blanchard
f3d46f9d31 atomic_t: Cast to volatile when accessing atomic variables
In preparation for removing volatile from the atomic_t definition, this
patch adds a volatile cast to all the atomic read functions.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-17 07:57:27 -07:00
Jens Axboe
e913fc825d writeback: fix WB_SYNC_NONE writeback from umount
When umount calls sync_filesystem(), we first do a WB_SYNC_NONE
writeback to kick off writeback of pending dirty inodes, then follow
that up with a WB_SYNC_ALL to wait for it. Since umount already holds
the sb s_umount mutex, WB_SYNC_NONE ends up doing nothing and all
writeback happens as WB_SYNC_ALL. This can greatly slow down umount,
since WB_SYNC_ALL writeback is a data integrity operation and thus
a bigger hammer than simple WB_SYNC_NONE. For barrier aware file systems
it's a lot slower.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-17 12:55:07 +02:00
Avi Kivity
8d3b932309 Merge remote branch 'tip/perf/core'
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:19:15 +03:00
Gui Jianfeng
2a059bf444 KVM: Get rid of dead function gva_to_page()
Nobody use gva_to_page() anymore, get rid of it.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:18:10 +03:00
Lai Jiangshan
90d83dc3d4 KVM: use the correct RCU API for PROVE_RCU=y
The RCU/SRCU API have already changed for proving RCU usage.

I got the following dmesg when PROVE_RCU=y because we used incorrect API.
This patch coverts rcu_deference() to srcu_dereference() or family API.

===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
arch/x86/kvm/mmu.c:3020 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
2 locks held by qemu-system-x86/8550:
 #0:  (&kvm->slots_lock){+.+.+.}, at: [<ffffffffa011a6ac>] kvm_set_memory_region+0x29/0x50 [kvm]
 #1:  (&(&kvm->mmu_lock)->rlock){+.+...}, at: [<ffffffffa012262d>] kvm_arch_commit_memory_region+0xa6/0xe2 [kvm]

stack backtrace:
Pid: 8550, comm: qemu-system-x86 Not tainted 2.6.34-rc4-tip-01028-g939eab1 #27
Call Trace:
 [<ffffffff8106c59e>] lockdep_rcu_dereference+0xaa/0xb3
 [<ffffffffa012f6c1>] kvm_mmu_calculate_mmu_pages+0x44/0x7d [kvm]
 [<ffffffffa012263e>] kvm_arch_commit_memory_region+0xb7/0xe2 [kvm]
 [<ffffffffa011a5d7>] __kvm_set_memory_region+0x636/0x6e2 [kvm]
 [<ffffffffa011a6ba>] kvm_set_memory_region+0x37/0x50 [kvm]
 [<ffffffffa015e956>] vmx_set_tss_addr+0x46/0x5a [kvm_intel]
 [<ffffffffa0126592>] kvm_arch_vm_ioctl+0x17a/0xcf8 [kvm]
 [<ffffffff810a8692>] ? unlock_page+0x27/0x2c
 [<ffffffff810bf879>] ? __do_fault+0x3a9/0x3e1
 [<ffffffffa011b12f>] kvm_vm_ioctl+0x364/0x38d [kvm]
 [<ffffffff81060cfa>] ? up_read+0x23/0x3d
 [<ffffffff810f3587>] vfs_ioctl+0x32/0xa6
 [<ffffffff810f3b19>] do_vfs_ioctl+0x495/0x4db
 [<ffffffff810e6b2f>] ? fget_light+0xc2/0x241
 [<ffffffff810e416c>] ? do_sys_open+0x104/0x116
 [<ffffffff81382d6d>] ? retint_swapgs+0xe/0x13
 [<ffffffff810f3ba6>] sys_ioctl+0x47/0x6a
 [<ffffffff810021db>] system_call_fastpath+0x16/0x1b

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:18:01 +03:00
Avi Kivity
9beeaa2d68 Merge branch 'perf'
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:58 +03:00
Takuya Yoshikawa
660c22c425 KVM: limit the number of pages per memory slot
This patch limits the number of pages per memory slot to make
us free from extra care about type issues.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-05-17 12:17:41 +03:00
Alexander Graf
ad0a048b09 KVM: PPC: Add OSI hypercall interface
MOL uses its own hypercall interface to call back into userspace when
the guest wants to do something.

So let's implement that as an exit reason, specify it with a CAP and
only really use it when userspace wants us to.

The only user of it so far is MOL.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:10 +03:00
Alexander Graf
71fbfd5f38 KVM: Add support for enabling capabilities per-vcpu
Some times we don't want all capabilities to be available to all
our vcpus. One example for that is the OSI interface, implemented
in the next patch.

In order to have a generic mechanism in how to enable capabilities
individually, this patch introduces a new ioctl that can be used
for this purpose. That way features we don't want in all guests or
userspace configurations can just not be enabled and we're good.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:17:09 +03:00
Alexander Graf
18978768d8 KVM: PPC: Allow userspace to unset the IRQ line
Userspace can tell us that it wants to trigger an interrupt. But
so far it can't tell us that it wants to stop triggering one.

So let's interpret the parameter to the ioctl that we have anyways
to tell us if we want to raise or lower the interrupt line.

Signed-off-by: Alexander Graf <agraf@suse.de>

v2 -> v3:

 - Add CAP for unset irq
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:16:51 +03:00
Xiao Guangrong
2ed152afc7 KVM: cleanup kvm trace
This patch does:

 - no need call tracepoint_synchronize_unregister() when kvm module
   is unloaded since ftrace can handle it

 - cleanup ftrace's macro

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-05-17 12:15:22 +03:00
Martin Schwidefsky
86f2552bbd [S390] add breaking event address for user space
Copy the last breaking event address from the lowcore to a new
field in the thread_struct on each system entry. Add a new
ptrace request PTRACE_GET_LAST_BREAK and a new utrace regset
REGSET_LAST_BREAK to query the last breaking event.

This is useful for debugging wild branches in user space code.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-05-17 10:00:15 +02:00
Li Zefan
f084db932e tracing: Convert more ext4 events to DEFINE_EVENT
Use DECLARE_EVENT_CLASS, and save ~2.7K:

   text    data     bss     dec     hex filename
 274441    7200     260  281901   44d2d fs/ext4/ext4.o.orig
 271881    7040     256  279177   44289 fs/ext4/ext4.o

4 events are converted:

  ext4__mb_new_pa: ext4_mb_new_inode_pa, ext4_mb_new_group_pa
  ext4__mballoc:   ext4_mballoc_discard, ext4_mballoc_free

No change in functionality.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-05-17 04:00:00 -04:00
Theodore Ts'o
f307333e14 ext4: Add new tracepoints to track mballoc's buddy bitmap loads
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-05-17 03:00:00 -04:00
David S. Miller
6811d58fc1 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	include/linux/if_link.h
2010-05-16 22:26:58 -07:00
John Kacur
93d84b6d99 ncpfs: BKL ioctl pushdown
Convert ncp_ioctl to an unlocked_ioctl and push down the bkl into it.

Signed-off-by: John Kacur <jkacur@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Petr Vandrovec <vandrove@vc.cvut.cz>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-05-17 05:27:42 +02:00
Robert Love
7b2787ec15 [SCSI] libfc: Move the port_id into lport
This patch creates a port_id member in struct fc_lport.
This allows libfc to just deal with fc_lport instances
instead of calling into the fc_host to get the port_id.

This change helps in only using symbols necessary for
operation from the libfc structures. libfc still needs
to change the fc_host_port_id() if the port_id changes
so the presentation layer (scsi_transport_fc) can provide
the user with the correct value, but libfc shouldn't
rely on the presentation layer for operational values.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-05-16 22:22:34 -04:00
Robert Love
1b80e0f91c [SCSI] libfc: Remove unused fc_get_host_port_type
Remove this unused routine.

Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-05-16 22:22:29 -04:00
Linus Torvalds
b5dbc85871 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  rtnetlink: make SR-IOV VF interface symmetric
  sctp: delete active ICMP proto unreachable timer when free transport
  tcp: fix MD5 (RFC2385) support
2010-05-16 11:11:53 -07:00
Eric Sandeen
0e05842bc1 quota: add the option to not fail with EDQUOT in block
To simplify metadata tracking for delalloc writes, ext4
will simply claim metadata blocks at allocation time, without
first speculatively reserving the worst case and then freeing
what was not used.

To do this, we need a mechanism to track allocations in
the quota subsystem, but potentially allow that allocation
to actually go over quota.

This patch adds a DQUOT_SPACE_NOFAIL flag and function
variants for this purpose.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-05-16 10:00:00 -04:00
Eric Sandeen
56246f9ae4 quota: use flags interface for dquot alloc/free space
Switch __dquot_alloc_space and __dquot_free_space to take flags
to indicate whether to warn and/or to reserve (or free reserve).

This is slightly more readable at the callpoints, and makes it
cleaner to add a "nofail" option in the next patch.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-05-16 09:00:00 -04:00
Chris Wright
c02db8c629 rtnetlink: make SR-IOV VF interface symmetric
Now we have a set of nested attributes:

  IFLA_VFINFO_LIST (NESTED)
    IFLA_VF_INFO (NESTED)
      IFLA_VF_MAC
      IFLA_VF_VLAN
      IFLA_VF_TX_RATE

This allows a single set to operate on multiple attributes if desired.
Among other things, it means a dump can be replayed to set state.

The current interface has yet to be released, so this seems like
something to consider for 2.6.34.

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-16 01:05:45 -07:00
Eric Dumazet
a465419b1f net: Introduce sk_route_nocaps
TCP-MD5 sessions have intermittent failures, when route cache is
invalidated. ip_queue_xmit() has to find a new route, calls
sk_setup_caps(sk, &rt->u.dst), destroying the 

sk->sk_route_caps &= ~NETIF_F_GSO_MASK

that MD5 desperately try to make all over its way (from
tcp_transmit_skb() for example)

So we send few bad packets, and everything is fine when
tcp_transmit_skb() is called again for this socket.

Since ip_queue_xmit() is at a lower level than TCP-MD5, I chose to use a
socket field, sk_route_nocaps, containing bits to mask on sk_route_caps.

Reported-by: Bhaskar Dutta <bhaskie@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-16 00:36:33 -07:00
Eric Dumazet
35790c0421 tcp: fix MD5 (RFC2385) support
TCP MD5 support uses percpu data for temporary storage. It currently
disables preemption so that same storage cannot be reclaimed by another
thread on same cpu.

We also have to make sure a softirq handler wont try to use also same
context. Various bug reports demonstrated corruptions.

Fix is to disable preemption and BH.

Reported-by: Bhaskar Dutta <bhaskie@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-16 00:34:04 -07:00
Eric Dumazet
3b098e2d7c net: Consistent skb timestamping
With RPS inclusion, skb timestamping is not consistent in RX path.

If netif_receive_skb() is used, its deferred after RPS dispatch.

If netif_rx() is used, its done before RPS dispatch.

This can give strange tcpdump timestamps results.

I think timestamping should be done as soon as possible in the receive
path, to get meaningful values (ie timestamps taken at the time packet
was delivered by NIC driver to our stack), even if NAPI already can
defer timestamping a bit (RPS can help to reduce the gap)

Tom Herbert prefer to sample timestamps after RPS dispatch. In case
sampling is expensive (HPET/acpi_pm on x86), this makes sense.

Let admins switch from one mode to another, using a new
sysctl, /proc/sys/net/core/netdev_tstamp_prequeue

Its default value (1), means timestamps are taken as soon as possible,
before backlog queueing, giving accurate timestamps.

Setting a 0 value permits to sample timestamps when processing backlog,
after RPS dispatch, to lower the load of the pre-RPS cpu.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15 23:57:10 -07:00
Jiri Pirko
a14462f1bd net: adjust handle_macvlan to pass port struct to hook
Now there's null check here and also again in the hook. Looking at bridge bits
which are simmilar, port structure is rcu_dereferenced right away in
handle_bridge and passed to hook. Looks nicer.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-05-15 23:48:02 -07:00