Convert to netlink helpers by using netlink policy validation.
As a side effect fixes a leak.
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are functions to refer to the value of dst->metric[THE_METRIC-1]
directly without use of a inline function "dst_metric" defined in
net/dst.h.
The following patch changes them to use the inline function
consistently.
Signed-off-by: Satoru SATOH <satoru.satoh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The list_del happens under read-locked devs_lock.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both br2684_push and br2684_exit do so, but unregister_netdev()
releases the device itself.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The error path in ieee80211_register_hw() may call the unregister_netdev()
and right after it - the free_netdev(), which is wrong, since the
unregister releases the device itself.
So the proposed fix is to NULL the local->mdev after unregister is done
and check this before calling free_netdev().
I checked - no code uses the local->mdev after unregister in this error
path (but even if some did this would be a BUG).
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This actually had to be merged with the patch #1, but I decided not to
mix two changes in one patch.
There are already two calls to free_netdev() in there, so merge them
into one.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case the register_netdevice() call fails the device is leaked,
since the out: label is just rtnl_unlock()+return.
Free the device.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
include/linux/skbuff.h says:
/* These elements must be at the end, see alloc_skb() for details. */
net/core/skbuff.c says:
* See comment in sk_buff definition, just before the 'tail' member
This patch contains my guess as to the actual reason rather than a
dead comment reference loop.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is lack of removing a class from the event queue while changing
from parent to leaf which can cause corruption of this rb tree. This
patch fixes a bug introduced by my patch: "sch_htb: turn intermediate
classes into leaves" commit: 160d5e10f8.
Many thanks to Jan 'yanek' Bortl for finding a way to reproduce this
rare bug and narrowing the test case, which made possible proper
diagnosing.
This patch is recommended for all kernels starting from 2.6.20.
Reported-and-tested-by: Jan 'yanek' Bortl <yanek@ya.bofh.cz>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
rose: Wrong list_lock argument in rose_node seqops
netns: Fix reassembly timer to use the right namespace
netns: Fix device renaming for sysfs
bnx2: Update version to 1.7.5.
bnx2: Update RV2P firmware for 5709.
bnx2: Zero out context memory for 5709.
bnx2: Fix register test on 5709.
bnx2: Fix remote PHY initial link state.
bnx2: Refine remote PHY locking.
bridge: forwarding table information for >256 devices
tg3: Update version to 3.92
tg3: Add link state reporting to UMP firmware
tg3: Fix ethtool loopback test for 5761 BX devices
tg3: Fix 5761 NVRAM sizes
tg3: Use constant 500KHz MI clock on adapters with a CPMU
hci_usb.h: fix hard-to-trigger race
dccp: ccid2.c, ccid3.c use clamp(), clamp_t()
net: remove NR_CPUS arrays in net/core/dev.c
net: use get/put_unaligned_* helpers
bluetooth: use get/put_unaligned_* helpers
...
The helper function hrtimer_callback_running() is used in
kernel/hrtimer.c as well as in the updated net/can/bcm.c which now
supports hrtimers. Moving the helper function to hrtimer.h removes the
duplicate definition in the C-files.
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In rose_node_start() as well as in rose_node_stop() __acquires() and
spin_lock_bh() were wrongly passing rose_neigh_list_lock instead of
rose_node_list_lock arguments.
Signed-off-by: Bernard Pidoux <f6bvp@amsat.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This trivial fix retrieves the network namespace from frag queue
and use it to get the network device in the right namespace.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a netdev is moved across namespaces with the
'dev_change_net_namespace' function, the 'device_rename' function is
used to fixup kobject and refresh the sysfs tree. The device_rename
function will call kobject_rename and this one will check if there is
an object with the same name and this is the case because we are
renaming the object with the same name.
The use of 'device_rename' seems for me wrong because we usually don't
rename it but just move it across namespaces. As we just want to do a
mini "netdev_[un]register", IMO the functions
'netdev_[un]register_kobject' should be used instead, like an usual
network device [un]registering.
This patch replace device_rename by netdev_unregister_kobject,
followed by netdev_register_kobject.
The netdev_register_kobject will call device_initialize and will raise
a warning indicating the device was already initialized. In order to
fix that, I split the device initialization into a separate function
and use it together with 'netdev_register_kobject' into
register_netdevice. So we can safely call 'netdev_register_kobject' in
'dev_change_net_namespace'.
This fix will allow to properly use the sysfs per namespace which is
coming from -mm tree.
Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com>
Acked-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The forwarding table binary interface (my bad choice), only exposes
the port number of the first 8 bits. The bridge code was limited to
256 ports at the time, but now the kernel supports up 1024 ports, so
the upper bits are lost when doing:
brctl showmacs
The fix is to squeeze the extra bits into small hole left in data
structure, to maintain binary compatiablity.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Makes the intention of the nested min/max clear.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove the fixed size channels[NR_CPUS] array in net/core/dev.c and
dynamically allocate array based on nr_cpu_ids.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
WARN_ON_ONCE() gives a stack trace including the full module list.
Having this in the kernel dump for the timeout case in the
generic netdev watchdog will help us see quicker which driver
is involved. It also allows us to collect statistics
and patterns in terms of which drivers have this event occuring.
Suggested by Andrew Morton
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
One finds all kinds of crazy things with some shell pipelining.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace proc_net_fops_create with proc_create_data.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace create_proc_entry with specially created for this purpose proc_create.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The check for PDE->data != NULL becomes useless after the replacement
of proc_net_fops_create with proc_create_data.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simply replace proc_create and further data assigned with proc_create_data.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simply replace proc_create and further data assigned with proc_create_data.
proc_atm_dev_ops holds proper referrence.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simply replace proc_create and further data assigned with proc_create_data.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simply replace proc_create and further data assigned with proc_create_data.
Additionally, there is no need to assign NULL to PDE->data after creation,
/proc generic has already done this for us.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simply replace proc_create and further data assigned with proc_create_data.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Simply replace proc_create and further data assigned with proc_create_data.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename div64_64 to div64_u64 to make it consistent with the other divide
functions, so it clearly includes the type of the divide. Move its definition
to math64.h as currently no architecture overrides the generic implementation.
They can still override it of course, but the duplicated declarations are
avoided.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Looks like 5d2cdcd4e8 ("mac80211: get a
TKIP phase key from skb") got the shifts wrong.
Noticed by sparse:
net/mac80211/tkip.c:234:25: warning: right shift by bigger than source value
net/mac80211/tkip.c:235:25: warning: right shift by bigger than source value
net/mac80211/tkip.c:236:25: warning: right shift by bigger than source value
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This reorders the open code so that WDS peer STA info entries
are added after the corresponding interface is added to the
driver so that driver callbacks aren't invoked out of order.
Also make any master device startup fatal.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Rather than just disallowing the zero address, disallow all
invalid ones.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Drivers can rightfully assume that they get a beacon_control
if the beacon is set.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The last hunk from the commit dae50295 (ipv4/ipv6 compat: Fix SSM
applications on 64bit kernels.) escaped from the compat_ipv6_setsockopt
to the ipv6_getsockopt (I guess due to patch smartness wrt searching
for context) thus breaking 32-bit and 64-bit-without-compat compilation.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (53 commits)
tcp: Overflow bug in Vegas
[IPv4] UFO: prevent generation of chained skb destined to UFO device
iwlwifi: move the selects to the tristate drivers
ipv4: annotate a few functions __init in ipconfig.c
atm: ambassador: vcc_sf semaphore to mutex
MAINTAINERS: The socketcan-core list is subscribers-only.
netfilter: nf_conntrack: padding breaks conntrack hash on ARM
ipv4: Update MTU to all related cache entries in ip_rt_frag_needed()
sch_sfq: use del_timer_sync() in sfq_destroy()
net: Add compat support for getsockopt (MCAST_MSFILTER)
net: Several cleanups for the setsockopt compat support.
ipvs: fix oops in backup for fwmark conn templates
bridge: kernel panic when unloading bridge module
bridge: fix error handling in br_add_if()
netfilter: {nfnetlink,ip,ip6}_queue: fix skb_over_panic when enlarging packets
netfilter: x_tables: fix net namespace leak when reading /proc/net/xxx_tables_names
netfilter: xt_TCPOPTSTRIP: signed tcphoff for ipv6_skip_exthdr() retval
tcp: Limit cwnd growth when deferring for GSO
tcp: Allow send-limited cwnd to grow up to max_burst when gso disabled
[netdrvr] gianfar: Determine TBIPA value dynamically
...
- Operations are now a shared const function block as with most other Linux
objects
- Introduce wrappers for some optional functions to get consistent behaviour
- Wrap put_char which used to be patched by the tty layer
- Document which functions are needed/optional
- Make put_char report success/fail
- Cache the driver->ops pointer in the tty as tty->ops
- Remove various surplus lock calls we no longer need
- Remove proc_write method as noted by Alexey Dobriyan
- Introduce some missing sanity checks where certain driver/ldisc
combinations would oops as they didn't check needed methods were present
[akpm@linux-foundation.org: fix fs/compat_ioctl.c build]
[akpm@linux-foundation.org: fix isicom]
[akpm@linux-foundation.org: fix arch/ia64/hp/sim/simserial.c build]
[akpm@linux-foundation.org: fix kgdb]
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
From: Lachlan Andrew <lachlan.andrew@gmail.com>
There is an overflow bug in net/ipv4/tcp_vegas.c for large BDPs
(e.g. 400Mbit/s, 400ms). The multiplication (old_wnd *
vegas->baseRTT) << V_PARAM_SHIFT overflows a u32.
[ Fix tcp_veno.c too, it has similar calculations. -DaveM ]
Signed-off-by: David S. Miller <davem@davemloft.net>
Problem: ip_append_data() could wrongly generate a chained skb for
devices which support UFO. When sk_write_queue is not empty
(e.g. MSG_MORE), __instead__ of appending data into the next nr_frag
of the queued skb, a new chained skb is created.
I would normally assume UFO device should get data in nr_frags and not
in frag_list. Later the udp4_hwcsum_outgoing() resets csum to NONE
and skb_gso_segment() has oops.
Proposal:
1. Even length is less than mtu, employ ip_ufo_append_data()
and append data to the __existed__ skb in the sk_write_queue.
2. ip_ufo_append_data() is fixed due to a wrong manipulation of
peek-ing and later enqueue-ing of the same skb. Now, enqueuing is
always performed, because on error the further
ip_flush_pending_frames() would release the queued skb.
Signed-off-by: Kostya B <bkostya@hotmail.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
A few functions are only used from __init context.
So annotate these with __init for consistency and silence
the following warnings:
WARNING: net/ipv4/built-in.o(.text+0x2a876): Section mismatch
in reference from the function ic_bootp_init() to
the variable .init.data:bootp_packet_type
WARNING: net/ipv4/built-in.o(.text+0x2a907): Section mismatch
in reference from the function ic_bootp_cleanup() to
the variable .init.data:bootp_packet_type
Note: The warnings only appear with CONFIG_DEBUG_SECTION_MISMATCH=y
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'audit.b50' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
[PATCH] new predicate - AUDIT_FILETYPE
[patch 2/2] Use find_task_by_vpid in audit code
[patch 1/2] audit: let userspace fully control TTY input auditing
[PATCH 2/2] audit: fix sparse shadowed variable warnings
[PATCH 1/2] audit: move extern declarations to audit.h
Audit: MAINTAINERS update
Audit: increase the maximum length of the key field
Audit: standardize string audit interfaces
Audit: stop deadlock from signals under load
Audit: save audit_backlog_limit audit messages in case auditd comes back
Audit: collect sessionid in netlink messages
Audit: end printk with newline
Some drivers have duplicated unlikely() macros. IS_ERR() already has
unlikely() in itself.
This patch cleans up such pointless code.
Signed-off-by: Hirofumi Nakagawa <hnakagawa@miraclelinux.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jeff Garzik <jeff@garzik.org>
Cc: Paul Clements <paul.clements@steeleye.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: David Brownell <david-b@pacbell.net>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Carsten Otte <cotte@de.ibm.com>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Probably interface misuse, because of the way iterating over hashbin is done.
However! Printing of socket number ("IrNET socket %d - ", i++") made conversion
to proper ->start/->next difficult enough to do blindly without hardware.
Said that, please apply.
Remove useless comment while I am it.
Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Cc: Samuel Ortiz <samuel@sortiz.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
commit 0794935e "[NETFILTER]: nf_conntrack: optimize hash_conntrack()"
results in ARM platforms hashing uninitialised padding. This padding
doesn't exist on other architectures.
Fix this by replacing NF_CT_TUPLE_U_BLANK() with memset() to ensure
everything is initialised. There were only 4 bytes that
NF_CT_TUPLE_U_BLANK() wasn't clearing anyway (or 12 bytes on ARM).
Signed-off-by: Philip Craig <philipc@snapgear.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add struct net_device parameter to ip_rt_frag_needed() and update MTU to
cache entries where ifindex is specified. This is similar to what is
already done in ip_rt_redirect().
Signed-off-by: Timo Teras <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for getsockopt for MCAST_MSFILTER for
both IPv4 and IPv6. It depends on the previous setsockopt patch,
and uses the same method.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) added missing "__user" for kgsr and kgf pointers
2) verify read for only GROUP_FILTER_SIZE(0). The group_filter
structure definition (via RFC) includes space for one source
in the source list array, but that source need not be present.
So, sizeof(group_filter) > GROUP_FILTER_SIZE(0). Fixed
the user read-check for minimum length to use the smaller size.
3) remove unneeded "&" for gf_slist addresses
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes bug http://bugzilla.kernel.org/show_bug.cgi?id=10556
where conn templates with protocol=IPPROTO_IP can oops backup box.
Result from ip_vs_proto_get() should be checked because
protocol value can be invalid or unsupported in backup. But
for valid message we should not fail for templates which use
IPPROTO_IP. Also, add checks to validate message limits and
connection state. Show state NONE for templates using IPPROTO_IP.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a race condition when unloading bridge and netfilter.
The problem happens if __fake_rtable is in use by a skb
coming in, while someone starts to unload bridge.ko.
br_netfilter_fini() is called at the beginning of unload
in br_deinit() while skbs still are being forwarded and
transferred to local ip stack. Thus there is a possibility
of the __fake_rtable pointer not being removed in a skb that
goes up to ip stack. This results in a kernel panic, as
ip_rcv() calls the input-function of __fake_rtable, which
is NULL.
Moving the call of br_netfilter_fini() to the end of
br_deinit() solves the problem.
Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When device is added to bridge its refcnt is incremented (in new_nbp()), but if
error occurs during further br_add_if() operations this counter is not
decremented back. Fix it by adding dev_put() call in the error path.
Signed-off-by: Volodymyr G Lukiianyk <volodymyrgl@gmail.com>
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While reinjecting *bigger* modified versions of IPv6 packets using
libnetfilter_queue, things work fine on a 2.6.24 kernel (2.6.22 too)
but I get the following on recents kernels (2.6.25, trace below is
against today's net-2.6 git tree):
skb_over_panic: text:c04fddb0 len:696 put:632 head:f7592c00 data:f7592c00 tail:0xf7592eb8 end:0xf7592e80 dev:eth0
------------[ cut here ]------------
invalid opcode: 0000 [#1] PREEMPT
Process sendd (pid: 3657, ti=f6014000 task=f77c31d0 task.ti=f6014000)
Stack: c071e638 c04fddb0 000002b8 00000278 f7592c00 f7592c00 f7592eb8 f7592e80
f763c000 f6bc5200 f7592c40 f6015c34 c04cdbfc f6bc5200 00000278 f6015c60
c04fddb0 00000020 f72a10c0 f751b420 00000001 0000000a 000002b8 c065582c
Call Trace:
[<c04fddb0>] ? nfqnl_recv_verdict+0x1c0/0x2e0
[<c04cdbfc>] ? skb_put+0x3c/0x40
[<c04fddb0>] ? nfqnl_recv_verdict+0x1c0/0x2e0
[<c04fd115>] ? nfnetlink_rcv_msg+0xf5/0x160
[<c04fd03e>] ? nfnetlink_rcv_msg+0x1e/0x160
[<c04fd020>] ? nfnetlink_rcv_msg+0x0/0x160
[<c04f8ed7>] ? netlink_rcv_skb+0x77/0xa0
[<c04fcefc>] ? nfnetlink_rcv+0x1c/0x30
[<c04f8c73>] ? netlink_unicast+0x243/0x2b0
[<c04cfaba>] ? memcpy_fromiovec+0x4a/0x70
[<c04f9406>] ? netlink_sendmsg+0x1c6/0x270
[<c04c8244>] ? sock_sendmsg+0xc4/0xf0
[<c011970d>] ? set_next_entity+0x1d/0x50
[<c0133a80>] ? autoremove_wake_function+0x0/0x40
[<c0118f9e>] ? __wake_up_common+0x3e/0x70
[<c0342fbf>] ? n_tty_receive_buf+0x34f/0x1280
[<c011d308>] ? __wake_up+0x68/0x70
[<c02cea47>] ? copy_from_user+0x37/0x70
[<c04cfd7c>] ? verify_iovec+0x2c/0x90
[<c04c837a>] ? sys_sendmsg+0x10a/0x230
[<c011967a>] ? __dequeue_entity+0x2a/0xa0
[<c011970d>] ? set_next_entity+0x1d/0x50
[<c0345397>] ? pty_write+0x47/0x60
[<c033d59b>] ? tty_default_put_char+0x1b/0x20
[<c011d2e9>] ? __wake_up+0x49/0x70
[<c033df99>] ? tty_ldisc_deref+0x39/0x90
[<c033ff20>] ? tty_write+0x1a0/0x1b0
[<c04c93af>] ? sys_socketcall+0x7f/0x260
[<c0102ff9>] ? sysenter_past_esp+0x6a/0x91
[<c05f0000>] ? snd_intel8x0m_probe+0x270/0x6e0
=======================
Code: 00 00 89 5c 24 14 8b 98 9c 00 00 00 89 54 24 0c 89 5c 24 10 8b 40 50 89 4c 24 04 c7 04 24 38 e6 71 c0 89 44 24 08 e8 c4 46 c5 ff <0f> 0b eb fe 55 89 e5 56 89 d6 53 89 c3 83 ec 0c 8b 40 50 39 d0
EIP: [<c04ccdfc>] skb_over_panic+0x5c/0x60 SS:ESP 0068:f6015bf8
Looking at the code, I ended up in nfq_mangle() function (called by
nfqnl_recv_verdict()) which performs a call to skb_copy_expand() due to
the increased size of data passed to the function. AFAICT, it should ask
for 'diff' instead of 'diff - skb_tailroom(e->skb)'. Because the
resulting sk_buff has not enough space to support the skb_put(skb, diff)
call a few lines later, this results in the call to skb_over_panic().
The patch below asks for allocation of a copy with enough space for
mangled packet and the same amount of headroom as old sk_buff. While
looking at how the regression appeared (e2b58a67), I noticed the same
pattern in ipq_mangle_ipv6() and ipq_mangle_ipv4(). The patch corrects
those locations too.
Tested with bigger reinjected IPv6 packets (nfqnl_mangle() path), things
are ok (2.6.25 and today's net-2.6 git tree).
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The seq_open_net() call should be accompanied with seq_release_net() one.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
if tcphoff remains unsigned, a negative ipv6_skip_exthdr() return value will
go unnoticed,
Signed-off-by: Roel Kluin <12o3l@tiscali.nl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes inappropriately large cwnd growth on sender-limited flows
when GSO is enabled, limiting cwnd growth to 64k.
Signed-off-by: John Heffner <johnwheffner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This changes the logic in tcp_is_cwnd_limited() so that cwnd may grow
up to tcp_max_burst() even when sk_can_gso() is false, or when
sysctl_tcp_tso_win_divisor != 0.
Signed-off-by: John Heffner <johnwheffner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
iwlwifi: Allow building iwl3945 without iwl4965.
wireless: Fix compile error with wifi & leds
tcp: Fix slab corruption with ipv6 and tcp6fuzz
ipv4/ipv6 compat: Fix SSM applications on 64bit kernels.
[IPSEC]: Use digest_null directly for auth
sunrpc: fix missing kernel-doc
can: Fix copy_from_user() results interpretation
Revert "ipv6: Fix typo in net/ipv6/Kconfig"
tipc: endianness annotations
ipv6: result of csum_fold() is already 16bit, no need to cast
[XFRM] AUDIT: Fix flowlabel text format ambibuity.
Previously I added sessionid output to all audit messages where it was
available but we still didn't know the sessionid of the sender of
netlink messages. This patch adds that information to netlink messages
so we can audit who sent netlink messages.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Fix build error caused by commit
e82404ad61 ("iwlwifi: Select
LEDS_CLASS.") from David Miller:
Since MAC80211_LEDS is selected by wireless drivers it must select its
own dependencies otherwise a build error may occur (kbuild will select
the symbol regardless of "depends" constraints).
Signed-off-By: Luca Tettamanti <kronos.it@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
This fixes a regression added by ec3c0982a2
("[TCP]: TCP_DEFER_ACCEPT updates - process as established")
tcp_v6_do_rcv()->tcp_rcv_established(), the latter goes to step5, where
eventually skb can be freed via tcp_data_queue() (drop: label), then if
check for tcp_defer_accept_check() returns true and thus
tcp_rcv_established() returns -1, which forces tcp_v6_do_rcv() to jump
to reset: label, which in turn will pass through discard: label and free
the same skb again.
Tested by Eric Sesterhenn.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-By: Patrick McManus <mcmanus@ducksong.com>
Add support on 64-bit kernels for seting 32-bit compatible MCAST*
socket options.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously digest_null had no setkey function which meant that
we used hmac(digest_null) for IPsec since IPsec always calls
setkey. Now that digest_null has a setkey we no longer need to
do that.
In fact when only confidentiality is specified for ESP we already
use digest_null directly. However, when the null algorithm is
explicitly specified by the user we still opt for hmac(digest_null).
This patch removes this discrepancy. I have not added a new compat
name for it because by chance it wasn't actualy possible for the user
to specify the name hmac(digest_null) due to a key length check in
xfrm_user (which I found out when testing that compat name :)
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix missing sunrpc kernel-doc:
Warning(linux-2.6.25-git7//net/sunrpc/xprt.c:451): No description found for parameter 'action'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both copy_to_ and _from_user return the number of bytes, that failed to
reach their destination, not the 0/-EXXX values.
Based on patch from Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Oliver Hartkopp <oliver.hartkopp@volkswagen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Flowlabel text format was not correct and thus ambiguous.
For example, 0x00123 or 0x01203 are formatted as 0x123.
This is not what audit tools want.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
I found some places, that erroneously return the value obtained from
the copy_to_user() call: if some amount of bytes were not able to get
to the user (this is what this one returns) the proper behavior is to
return the -EFAULT error, not that number itself.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two is used in the wrong context here, as you are connecting to an
IPv6 network over IPv4; not connecting two IPv6 networks to an IPv4
one.
Signed-off-by: Michael Beasley <youvegotmoxie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFC3542 tells that IPV6_CHECKSUM socket option in the IPPROTO_IPV6
level is not allowed on ICMPv6 sockets. IPPROTO_RAW level
IPV6_CHECKSUM socket option (a Linux extension) is still allowed.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_probe has a bounds-checking bug that causes many programs (less,
python) to crash reading /proc/net/tcp_probe. When it outputs a log
line to the reader, it only checks if that line alone will fit in the
reader's buffer, rather than that line and all the previous lines it
has already written.
tcpprobe_read also returns the wrong value if copy_to_user fails--it
just passes on the return value of copy_to_user (number of bytes not
copied), which makes a failure look like a success.
This patch fixes the buffer overflow and sets the return value to
-EFAULT if copy_to_user fails.
Patch is against latest net-2.6; tested briefly and seems to fix the
crashes in less and python.
Signed-off-by: Tom Quetchenbach <virtualphtn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the ethtool user-space application, tg3 and natsemi over-ride the
default implementation of dump_eeprom(). In both tg3_dump_eeprom() and
natsemi_dump_eeprom(), there is a magic number check which is not
present in the default implementation.
Commit b131dd5d ("[ETHTOOL]: Add support for large eeproms") snipped
the code which copied the ethtool_eeprom structure back to
user-space. tg3 and natsemi are over-writing the magic number field
and then checking it in user-space. With the ethtool_eeprom copy
removed, the check is failing.
The fix is simple. Add the ethtool_eeprom copy back.
Signed-off-by: Mandeep Singh Baines <msb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/key/af_key.c: In function ‘pfkey_spddelete’:
net/key/af_key.c:2359: warning: ‘pol_ctx’ may be used uninitialized in
this function
When CONFIG_SECURITY_NETWORK_XFRM isn't set,
security_xfrm_policy_alloc() is an inline that doesn't set pol_ctx, so
this seemed like the easiest fix short of using *uninitialized_var(pol_ctx).
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a regression in the RXKAD security module introduced in:
commit 91e916cffe
Author: Al Viro <viro@ftp.linux.org.uk>
Date: Sat Mar 29 03:08:38 2008 +0000
net/rxrpc trivial annotations
A variable was declared as a 16-bit type rather than a 32-bit type.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-with-apologies-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-of-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (80 commits)
SUNRPC: Invalidate the RPCSEC_GSS session if the server dropped the request
make nfs_automount_list static
NFS: remove duplicate flags assignment from nfs_validate_mount_data
NFS - fix potential NULL pointer dereference v2
SUNRPC: Don't change the RPCSEC_GSS context on a credential that is in use
SUNRPC: Fix a race in gss_refresh_upcall()
SUNRPC: Don't disconnect more than once if retransmitting NFSv4 requests
SUNRPC: Remove the unused export of xprt_force_disconnect
SUNRPC: remove XS_SENDMSG_RETRY
SUNRPC: Protect creds against early garbage collection
NFSv4: Attempt to use machine credentials in SETCLIENTID calls
NFSv4: Reintroduce machine creds
NFSv4: Don't use cred->cr_ops->cr_name in nfs4_proc_setclientid()
nfs: fix printout of multiword bitfields
nfs: return negative error value from nfs{,4}_stat_to_errno
NLM/lockd: Ensure client locking calls use correct credentials
NFS: Remove the buggy lock-if-signalled case from do_setlk()
NLM/lockd: Fix a race when cancelling a blocking lock
NLM/lockd: Ensure that nlmclnt_cancel() returns results of the CANCEL call
NLM: Remove the signal masking in nlmclnt_proc/nlmclnt_cancel
...
* 'for-linus' of git://linux-nfs.org/~bfields/linux: (52 commits)
knfsd: clear both setuid and setgid whenever a chown is done
knfsd: get rid of imode variable in nfsd_setattr
SUNRPC: Use unsigned loop and array index in svc_init_buffer()
SUNRPC: Use unsigned index when looping over arrays
SUNRPC: Update RPC server's TCP record marker decoder
SUNRPC: RPC server still uses 2.4 method for disabling TCP Nagle
NLM: don't let lockd exit on unexpected svc_recv errors (try #2)
NFS: don't let nfs_callback_svc exit on unexpected svc_recv errors (try #2)
Use a zero sized array for raw field in struct fid
nfsd: use static memory for callback program and stats
SUNRPC: remove svc_create_thread()
nfsd: fix comment
lockd: Fix stale nlmsvc_unlink_block comment
NFSD: Strip __KERNEL__ testing from unexported header files.
sunrpc: make token header values less confusing
gss_krb5: consistently use unsigned for seqnum
NFSD: Remove NFSv4 dependency on NFSv3
SUNRPC: Remove PROC_FS dependency
NFSD: Use "depends on" for PROC_FS dependency
nfsd: move most of fh_verify to separate function
...
RFC 2203 requires the server to drop the request if it believes the
RPCSEC_GSS context is out of sequence. The problem is that we have no way
on the client to know why the server dropped the request. In order to avoid
spinning forever trying to resend the request, the safe approach is
therefore to always invalidate the RPCSEC_GSS context on every major
timeout.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits)
tun: Multicast handling in tun_chr_ioctl() needs proper locking.
[NET]: Fix heavy stack usage in seq_file output routines.
[AF_UNIX] Initialise UNIX sockets before general device initcalls
[RTNETLINK]: Fix bogus ASSERT_RTNL warning
iwlwifi: Fix built-in compilation of iwlcore (part 2)
tun: Fix minor race in TUNSETLINK ioctl handling.
ppp_generic: use stats from net_device structure
iwlwifi: Don't unlock priv->mutex if it isn't locked
wireless: rndis_wlan: modparam_workaround_interval is never below 0.
prism54: prism54_get_encode() test below 0 on unsigned index
mac80211: update mesh EID values
b43: Workaround DMA quirks
mac80211: fix use before check of Qdisc length
net/mac80211/rx.c: fix off-by-one
mac80211: Fix race between ieee80211_rx_bss_put and lookup routines.
ath5k: Fix radio identification on AR5424/2424
ssb: Fix all-ones boardflags
b43: Add more btcoexist workarounds
b43: Fix HostFlags data types
b43: Workaround invalid bluetooth settings
...
Plan C: we can follow the Al Viro's proposal about %n like in this patch.
The same applies to udp, fib (the /proc/net/route file), rt_cache and
sctp debug. This is minus ~150-200 bytes for each.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When drivers call request_module(), it tries to do something with UNIX
sockets and triggers a 'runaway loop modprobe net-pf-1' warning. Avoid
this by initialising AF_UNIX support earlier.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ASSERT_RTNL uses mutex_trylock to test whether the rtnl_mutex is
held. This bogus warnings when running in atomic context, which
f.e. happens when adding secondary unicast addresses through
macvlan or vlan or when synchronizing multicast addresses from
wireless devices.
Mid-term we might want to consider moving all address updates
to process context since the locking seems overly complicated,
for now just fix the bogus warning by changing ASSERT_RTNL to
use mutex_is_locked().
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes use of Qdisc length in requeue function, before we checked
the reference is valid. (Adrian Bunk's catch)
Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch fixes an off-by-one in net/mac80211/rx.c introduced by
commit 8318d78a44
(cfg80211 API for channels/bitrates, mac80211 and driver conversion)
and spotted by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The put routine first decrements the users counter and then
(if it is zero) locks the sta_bss_lock and removes one from
the list and the hash.
Thus, any of ieee80211_sta_config_auth, ieee80211_rx_bss_get
or ieee80211_rx_mesh_bss_get can race with it by finding a
bss that is about to get kfree-ed.
Using atomic_dec_and_lock in ieee80211_rx_bss_put takes care
of this race.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There are two structures named wmm_info and wmm_param, they are used while
parsing the beacon frame. (Check the function ieee802_11_parse_elems).
Certain APs like D-link does not set the fifth bit in WMM IE.
While sending the association request to n-only ap it checks for wmm_ie.
If it is set then only ieee80211_ht_cap is sent during association request.
So n-only association fails.
And this patch fixes this problem by copying the wmm_info to wmm_ie,
which enables the "wmm" flag in iee80211_send_assoc.
Signed-off-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com>
Acked-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Clean up: Suppress a harmless compiler warning.
Index rq_pages[] with an unsigned type. Make "pages" unsigned as well,
as it never represents a value less than zero.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Clean up: Suppress a harmless compiler warning in the RPC server related
to array indices.
ARRAY_SIZE() returns a size_t, so use unsigned type for a loop index when
looping over arrays.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Clean up: Update the RPC server's TCP record marker decoder to match the
constructs used by the RPC client's TCP socket transport.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Use the 2.6 method for disabling TCP Nagle in the kernel's RPC server.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Now that the nfs4 callback thread uses the kthread API, there are no
more users of svc_create_thread(). Remove it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
g_make_token_header() and g_token_size() add two too many, and
therefore their callers pass in "(logical_value - 2)" rather
than "logical_value" as hard-coded values which causes confusion.
This dates back to the original g_make_token_header which took an
optional token type (token_id) value and added it to the token.
This was removed, but the routine always adds room for the token_id
rather than not.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Consistently use unsigned (u32 vs. s32) for seqnum.
In get_mic function, send the local copy of seq_send,
rather than the context version.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
net/sunrpc/svc.c: In function '__svc_create_thread':
net/sunrpc/svc.c:587: warning: 'oldmask.bits[0u]' may be used uninitialized in this function
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Tom Tucker <tom@opengridcomputing.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
cleanup: When adding new encryption types, the checksum length
can be different for each enctype. Face the fact that the
current code only supports DES which has a checksum length of 8.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
cleanup: Fix grammer/typos to use "too" instead of "to"
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
SVCRDMA: Add check for XPT_CLOSE in svc_rdma_send
The svcrdma transport can crash if a send is waiting for an
empty SQ slot and the connection is closed due to an asynchronous error.
The crash is caused when svc_rdma_send attempts to send on a deleted
QP.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
In function svcauth_gss_accept() (net/sunrpc/auth_gss/svcauth_gss.c) the
code that handles GSS integrity and decryption failures should be
returning GARBAGE_ARGS as specified in RFC 2203, sections 5.3.3.4.2 and
5.3.3.4.3.
Reviewed-by: Greg Banks <gnb@sgi.com>
Signed-off-by: Harshula Jayasuriya <harshula@sgi.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
svc_recv() calls alloc_page(), and if it fails it does a 500ms
uninterruptible sleep and then reattempts. There doesn't seem to be any
real reason for this to be uninterruptible, so change it to an
interruptible sleep. Also check for kthread_stop() and signalled() after
setting the task state to avoid races that might lead to sleeping after
kthread_stop() wakes up the task.
I've done some very basic smoke testing with this, but obviously it's
hard to test the actual changes since this all depends on an
alloc_page() call failing.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
This field is set once and never used; probably some artifact of an
earlier implementation idea.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
This adds IPv6 support to the interfaces that are used to express nfsd
exports. All addressed are stored internally as IPv6; backwards
compatibility is maintained using mapped addresses.
Thanks to Bruce Fields, Brian Haley, Neil Brown and Hideaki Joshifuji
for comments
Signed-off-by: Aurelien Charbon <aurelien.charbon@bull.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Brian Haley <brian.haley@hp.com>
Cc: YOSHIFUJI Hideaki / 吉藤英明 <yoshfuji@linux-ipv6.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
When using kthreads that call into svc_recv, we want to make sure that
they do not block there for a long time when we're trying to take down
the kthread.
This patch changes svc_recv() to check kthread_should_stop() at the same
places that it checks to see if it's signalled(). Also check just before
svc_recv() tries to schedule(). By making sure that we check it just
after setting the task state we can avoid having to use any locking or
signalling to ensure it doesn't block for a long time.
There's still a chance of a 500ms sleep if alloc_page() fails, but
that should be a rare occurrence and isn't a terribly long time in
the context of a kthread being taken down.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Needed since the plan is to not have a svc_create_thread helper and to
have current users of that function just call kthread_run directly.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
iwlwifi: Fix built-in compilation of iwlcore
net: Unexport move_addr_to_{kernel,user}
rt2x00: Select LEDS_CLASS.
iwlwifi: Select LEDS_CLASS.
leds: Do not guard NEW_LEDS with HAS_IOMEM
[IPSEC]: Fix catch-22 with algorithm IDs above 31
time: Export set_normalized_timespec.
tcp: Make use of before macro in tcp_input.c
hamradio: Remove unneeded and deprecated cli()/sti() calls in dmascc.c
[NETNS]: Remove empty ->init callback.
[DCCP]: Convert do_gettimeofday() to getnstimeofday().
[NETNS]: Don't initialize err variable twice.
[NETNS]: The ip6_fib_timer can work with garbage on net namespace stop.
[IPV4]: Convert do_gettimeofday() to getnstimeofday().
[IPV4]: Make icmp_sk_init() static.
[IPV6]: Make struct ip6_prohibit_entry_template static.
tcp: Trivial fix to correct function name in a comment in net/ipv4/tcp.c
[NET]: Expose netdevice dev_id through sysfs
skbuff: fix missing kernel-doc notation
[ROSE]: Fix soft lockup wrt. rose_node_list_lock
After the removal of the Solaris binary emulation the exports of
move_addr_to_{kernel,user} are no longer used.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As it stands it's impossible to use any authentication algorithms
with an ID above 31 portably. It just happens to work on x86 but
fails miserably on ppc64.
The reason is that we're using a bit mask to check the algorithm
ID but the mask is only 32 bits wide.
After looking at how this is used in the field, I have concluded
that in the long term we should phase out state matching by IDs
because this is made superfluous by the reqid feature. For current
applications, the best solution IMHO is to allow all algorithms when
the bit masks are all ~0.
The following patch does exactly that.
This bug was identified by IBM when testing on the ppc64 platform
using the NULL authentication algorithm which has an ID of 251.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/juhl/trivial: (24 commits)
DOC: A couple corrections and clarifications in USB doc.
Generate a slightly more informative error msg for bad HZ
fix typo "is" -> "if" in Makefile
ext*: spelling fix prefered -> preferred
DOCUMENTATION: Use newer DEFINE_SPINLOCK macro in docs.
KEYS: Fix the comment to match the file name in rxrpc-type.h.
RAID: remove trailing space from printk line
DMA engine: typo fixes
Remove unused MAX_NODES_SHIFT
MAINTAINERS: Clarify access to OCFS2 development mailing list.
V4L: Storage class should be before const qualifier (sn9c102)
V4L: Storage class should be before const qualifier
sonypi: Storage class should be before const qualifier
intel_menlow: Storage class should be before const qualifier
DVB: Storage class should be before const qualifier
arm: Storage class should be before const qualifier
ALSA: Storage class should be before const qualifier
acpi: Storage class should be before const qualifier
firmware_sample_driver.c: fix coding style
MAINTAINERS: Add ati_remote2 driver
...
Fixed up trivial conflicts in firmware_sample_driver.c
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
rose: Socket lock was not released before returning to user space
hci_usb: remove code obfuscation
drivers/net/appletalk: use time_before, time_before_eq, etc
drivers/atm: use time_before, time_before_eq, etc
hci_usb: do not initialize static variables to 0
tg3: 5701 DMA corruption fix
atm nicstar: Removal of debug code containing deprecated calls to cli()/sti()
iwlwifi: Fix unconditional access to station->tidp[].agg.
netfilter: Fix SIP conntrack build with NAT disabled.
netfilter: Fix SCTP nat build.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel: (62 commits)
sched: build fix
sched: better rt-group documentation
sched: features fix
sched: /debug/sched_features
sched: add SCHED_FEAT_DEADLINE
sched: debug: show a weight tree
sched: fair: weight calculations
sched: fair-group: de-couple load-balancing from the rb-trees
sched: fair-group scheduling vs latency
sched: rt-group: optimize dequeue_rt_stack
sched: debug: add some debug code to handle the full hierarchy
sched: fair-group: SMP-nice for group scheduling
sched, cpuset: customize sched domains, core
sched, cpuset: customize sched domains, docs
sched: prepatory code movement
sched: rt: multi level group constraints
sched: task_group hierarchy
sched: fix the task_group hierarchy for UID grouping
sched: allow the group scheduler to have multiple levels
sched: mix tasks and groups
...
As you can see, there's no zero_it arg (in fact code always uses __GFP_ZERO).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
The netns start-stop engine can happily live with any of
init or exit callbacks set to NULL.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
What do_gettimeofday() does is to call getnstimeofday() and
to convert the result from timespec{} to timeval{}.
We do not always need timeval{} and we can convert timespec{}
when we really need (to print).
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ip6_route_net_init() performs some unneeded actions.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The del_timer() function doesn't guarantee, that the timer callback
is not active by the time it exits.
Thus, the fib6_net_exit() may kfree() all the data, that is required
by the fib6_run_gc(). The race window is tiny, but slab poisoning can
trigger this bug.
Using del_timer_sync() will cure this.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
What do_gettimeofday() does is to call getnstimeofday() and
to convert the result from timespec{} to timeval{}.
After that, these callers convert the result again to msec.
Use getnstimeofday() and convert the units at once.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes the needlessly global icmp_sk_init() static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes the needlessly global struct
ip6_prohibit_entry_template static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a trivial fix to correct function name in a comment in
net/ipv4/tcp.c.
Signed-off-by: Satoru SATOH <satoru.satoh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Expose dev_id to userspace, because it helps to disambiguate between
interfaces where the MAC address is unique.
This should allow us to simplify the handling of persistent naming for
S390 network devices in udev -- because it can depend on a simple
attribute of the device like the other match criteria, rather than
having a special case for SUBSYSTEMS=="ccwgroup".
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reported by Ingo Molnar.
The SIP helper is also useful without NAT. This patch adds an ifdef
around the RTP call optimization for NATed clients.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a server rejects our credential with an AUTH_REJECTEDCRED or similar,
we need to refresh the credential and then retry the request.
However, we do want to allow any requests that are in flight to finish
executing, so that we can at least attempt to process the replies that
depend on this instance of the credential.
The solution is to ensure that gss_refresh() looks up an entirely new
RPCSEC_GSS credential instead of attempting to create a context for the
existing invalid credential.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If the downcall completes before we get the spin_lock then we currently
fail to refresh the credential.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NFSv4 requires us to ensure that we break the TCP connection before we're
allowed to retransmit a request. However in the case where we're
retransmitting several requests that have been sent on the same
connection, we need to ensure that we don't interfere with the attempt to
reconnect and/or break the connection again once it has been established.
We therefore introduce a 'connection' cookie that is bumped every time a
connection is broken. This allows requests to track if they need to force a
disconnection.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The condition for exiting from the loop in xs_tcp_send_request() should be
that we find we're not making progress (i.e. number of bytes sent is 0).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We need to try to ensure that we always use the same credentials whenever
we re-establish the clientid on the server. If not, the server won't
recognise that we're the same client, and so may not allow us to recover
state.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
With the recent change to generic creds, we can no longer use
cred->cr_ops->cr_name to distinguish between RPCSEC_GSS principals and
AUTH_SYS/AUTH_NULL identities. Replace it with the rpc_authops->au_name
instead...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We want to ensure that req->rq_private_buf.len is updated before
req->rq_received, so that call_decode() doesn't use an old value for
req->rq_rcv_buf.len.
In 'call_decode()' itself, instead of using task->tk_status (which is set
using req->rq_received) must use the actual value of
req->rq_private_buf.len when deciding whether or not the received RPC reply
is too short.
Finally ensure that we set req->rq_rcv_buf.len to zero when retrying a
request. A typo meant that we were resetting req->rq_private_buf.len in
call_decode(), and then clobbering that value with the old rq_rcv_buf.len
again in xprt_transmit().
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
..and always destroy using a 'soft' RPC call. Destroying GSS credentials
isn't mandatory; the server can always cope with a few credentials not
getting destroyed in a timely fashion.
This actually fixes a hang situation. Basically, some servers will decide
that the client is crazy if it tries to destroy an RPC context for which
they have sent an RPCSEC_GSS_CREDPROBLEM, and so will refuse to talk to it
for a while.
The regression therefor probably was introduced by commit
0df7fb74fb.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The rest of the networking layer uses SOCK_ASYNC_NOSPACE to signal whether
or not we have someone waiting for buffer memory. Convert the SUNRPC layer
to use the same idiom.
Remove the unlikely()s in xs_udp_write_space and xs_tcp_write_space. In
fact, the most common case will be that there is nobody waiting for buffer
space.
SOCK_NOSPACE is there to tell the TCP layer whether or not the cwnd was
limited by the application window. Ensure that we follow the same idiom as
the rest of the networking layer here too.
Finally, ensure that we clear SOCK_ASYNC_NOSPACE once we wake up, so that
write_space() doesn't keep waking things up on xprt->pending.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
call_verify() can, under certain circumstances, free the RPC slot. In that
case, our cached pointer 'req = task->tk_rqstp' is invalid. Bug was
introduced in commit 220bcc2afd (SUNRPC:
Don't call xprt_release in call refresh).
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>