The current IPSEC rule resolution behavior we have does not work for a
lot of people, even though technically it's an improvement from the
-EAGAIN buisness we had before.
Right now we'll block until the key manager resolves the route. That
works for simple cases, but many folks would rather packets get
silently dropped until the key manager resolves the IPSEC rules.
We can't tell these folks to "set the socket non-blocking" because
they don't have control over the non-block setting of things like the
sockets used to resolve DNS deep inside of the resolver libraries in
libc.
With that in mind I coded up the patch below with some help from
Herbert Xu which provides packet-drop behavior during larval state
resolution, controllable via sysctl and off by default.
This lays the framework to either:
1) Make this default at some point or...
2) Move this logic into xfrm{4,6}_policy.c and implement the
ARP-like resolution queue we've all been dreaming of.
The idea would be to queue packets to the policy, then
once the larval state is resolved by the key manager we
re-resolve the route and push the packets out. The
packets would timeout if the rule didn't get resolved
in a certain amount of time.
Signed-off-by: David S. Miller <davem@davemloft.net>
sys_setsockopt() do not check properly timeout values for
SO_RCVTIMEO/SO_SNDTIMEO, for example it's possible to set negative timeout
values. POSIX do not defines behaviour for sys_setsockopt in case negative
timeouts, but requires that setsockopt() shall fail with -EDOM if the send and
receive timeout values are too big to fit into the timeout fields in the socket
structure.
In current implementation negative timeout can lead to error messages like
"schedule_timeout: wrong timeout value".
Proposed patch:
- checks tv_usec and returns -EDOM if it is wrong
- do not allows to set negative timeout values (sets 0 instead) and outputs
ratelimited information message about such attempts.
Signed-off-By: Vasily Averin <vvs@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
They're the same.
Signed-off-by: Jing Min Zhao <zhaojingmin@vivecode.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing process of T.120 address in OpenLogicalChannelAck signal.
Signed-off-by: Jing Min Zhao <zhaojingmin@vivecode.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to the implementation of H.323, it's not necessary to check
the addresses in Information signals.
Signed-off-by: Jing Min Zhao <zhaojingmin@vivecode.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update get_h225_addr() to meet the changes in ASN.1 types. It was using
field ip6 to access IPv6 TransportAddress, it should be ip according the
ASN.1 definition.
Signed-off-by: Jing Min Zhao <zhaojingmin@vivecode.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
1. Add support for decoding IPv6 address. I know it was manually added in
the header file, but not in the template file. That wouldn't work.
2. Add missing support for decoding T.120 address in OLCA.
3. Remove unnecessary decoding of Information signal.
Signed-off-by: Jing Min Zhao <zhaojingmin@vivecode.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the packet size is changed by the FTP NAT helper, the connection
tracking helper adjusts the sequence number of the newline character
by the size difference. This is wrong because NAT sequence number
adjustment happens after helpers are called, so the unadjusted number
is compared to the already adjusted one.
Based on report by YU, Haitao <yuhaitao@tsinghua.org.cn>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When trying to locate the oldest entry in the history of newline character
sequence numbers, the sequence number of the current entry is incorrectly
compared with the index of the oldest sequence number instead of the number
itself.
Additionally it is not made sure that the current sequence number really
is after the oldest known one.
Based on report by YU, Haitao <yuhaitao@tsinghua.org.cn>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The event cache time must be an absolute value, when no event exists
it is incorrectly set to 1s instead of 1s in the future.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When you replace route via ip r r command the netlink multicast message is
not send. This patch corrects it. NL message is sent with NLM_F_REPLACE
flag.
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=8320
Signed-off-by: Milan Kocian <milon@wq.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use menuconfigs instead of menus, so the whole menu can be disabled at
once instead of going through all options.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use menuconfigs instead of menus, so the whole menu can be disabled at
once instead of going through all options.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use menuconfigs instead of menus, so the whole menu can be disabled at once
instead of going through all options.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use menuconfigs instead of menus, so the whole menu can be disabled at
once instead of going through all options.
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Urs Thuermann <urs@isnogud.escape.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
My previous patch that changed the return value of qdisc_restart
incorrectly made the case where dequeue returns empty continue
processing packets.
This patch is based on diagnosis and fix by Patrick McHardy.
Reported-and-debugged-by: Anant Nitya <kernel@prachanda.info>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The L2CAP configuration parameter handling was missing the support
for rejecting unknown options. The capability to reject unknown
options is mandatory since the Bluetooth 1.2 specification. This
patch implements its and also simplifies the parameter parsing.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Remove some unused variables and function arguments related to the
recently removed wireless extensions over rtnetlink.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
rtnl_setlink doesn't allow to change subsets of the flags, just to override
the set entirely by a new one. This means that for simply setting a device
up or down userspace first needs to query the current flags, change it and
send the changed flags back, which is racy and needlessly complicated.
Mask the flags using ifi_change since this is what it is intended for.
For backwards compatibility treat ifi_change == 0 as ~0 (even though it
seems quite unlikely that anyone has been using this so far).
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make the call state names array available even if CONFIG_PROC_FS is
disabled as it's used in other places (such as debugging statements)
too.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a dependency for CONFIG_AF_RXRPC on CONFIG_INET. This fixes this
error:
net/built-in.o: In function `rxrpc_get_peer':
(.text+0x42824): undefined reference to `ip_route_output_key'
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds some casts to shut up the warnings introduced by my
last patch that added a common interator function for xfrm algorightms.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kenji Kaneshige found this race between device removal and
registration. On unregister it is possible for the old device to
exist, because sysfs file is still open. A new device with 'eth%d'
will select the same name, but sysfs kobject register will fial.
The following changes the shutdown order slightly. It hold a removes
the sysfs entries earlier (on unregister_netdevice), but holds a
kobject reference. Then when todo runs the actual last put free
happens.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When icmp_send is called on the local output path before the
packet hits ip_output, skb->dev is not set, causing a crash
when sysctl_icmp_errors_use_inbound_ifaddr is set. This can
happen with the netfilter REJECT target or IPsec tunnels.
Let routing decide the ICMP source address in that case, since the
packet is locally generated there is no inbound interface and
the sysctl should not apply.
The option actually seems to be unfixable broken, on the path
after ip_output() skb->dev points to the outgoing device and
we don't know the incoming device anymore, so its going to do
the absolute wrong thing and pick the address of the outgoing
interface. Add a comment about this.
Reported by Curtis Doty <Curtis@GreenKey.net>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The option is named CONFIG_NF_NAT not CONFIG_IP_NF_NAT. Remove the ifdef
completely since helpers also expect defragmented packet even without
NAT.
Noticed by Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the helper module is removed for a master connection that has a
fulfilled expectation, but has already timed out and got removed from
the hash tables, nf_conntrack_helper_unregister can't find the master
connection to unset the helper, causing a use-after-free when the
expected connection is destroyed and releases the last reference to
the master.
The helper destroy callback was introduced for the PPtP helper to clean
up expectations and expected connections when the master connection
times out, but doing this from destroy_conntrack only works for
unfulfilled expectations since expected connections hold a reference
to the master, preventing its destruction. Move the destroy callback to
the timeout function, which fixes both problems.
Reported/tested by Gabor Burjan <buga@buvoshetes.hu>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a natural extension of the changeset
[XFRM]: Probe selected algorithm only.
which only removed the probe call for xfrm_user. This patch does exactly
the same thing for af_key. In other words, we load the algorithm requested
by the user rather than everything when adding xfrm states in af_key.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
State could become inconsistent in two cases:
1) Userspace disabled FRTO by tuning sysctl when one of the TCP
flows was in the middle of FRTO algorithm (and then RTO is
again triggered)
2) SACK reneging occurs during FRTO algorithm
A simple solution is just to abort the previous FRTO when such
obscure condition occurs...
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
The conservative spurious RTO response did not queue CWR even
though the sending rate was lowered. Whenever reduction happens
regardless of reason, CWR should be sent (forgetting to send it
is not very fatal though).
A better approach would be to queue CWR when one of the sending
rate reducing responses (rate-halving one or this conservative
response) is used already at RTO. Doing that would allow CWR to
be sent along with the two new data segments that are sent
during FRTO. However, it's a bit "racy" because userland could
tune the response sysctl to a more aggressive one in between.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Compiling 2.6.22-rc1 with gcc-3.2.3 for i486 fails with:
gcc -m32 -Wp,-MD,net/core/.skbuff.o.d -nostdinc -isystem /home/mikpe/pkgs/linux-x86/gnu/lib/gcc-lib/i486-pc-linux-gnu/3.2.3/include -D__KERNEL__ -Iinclude -include include/linux/autoconf.h -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -O2 -pipe -msoft-float -mregparm=3 -freg-struct-return -mpreferred-stack-boundary=4 -march=i486 -ffreestanding -maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -Iinclude/asm-i386/mach-default -fomit-frame-pointer -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(skbuff)" -D"KBUILD_MODNAME=KBUILD_STR(skbuff)" -c -o net/core/skbuff.o net/core/skbuff.c
net/core/skbuff.c:648:1: directives may not be used inside a macro argument
net/core/skbuff.c:647:39: unterminated argument list invoking macro "memcpy"
net/core/skbuff.c: In function `pskb_expand_head':
net/core/skbuff.c:651: `memcpy' undeclared (first use in this function)
net/core/skbuff.c:651: (Each undeclared identifier is reported only once
net/core/skbuff.c:651: for each function it appears in.)
net/core/skbuff.c:651: syntax error before "skb"
make[2]: *** [net/core/skbuff.o] Error 1
make[1]: *** [net/core] Error 2
make: *** [net] Error 2
The patch below implements a simple workaround which is to
clone the offending memcpy() call and specialise it for the
two different scenarios.
Other workarounds are of course possible: e.g. bind the varying
parameter in a local variable, or use a macro or inline function
to perform the varying computation.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
coverity has spotted a bug in rfkill.c (bug id #1627),
in rfkill_allocate() NULL was returns if the kzalloc() works,
and deref the NULL pointer if it fails,
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Revert: 2d771cd86d
This is dangerous if enabled and a better solution to the
problem is being worked on.
Signed-off-by: David S. Miller <davem@davemloft.net>
As mentioned in http://bugzilla.kernel.org/show_bug.cgi?id=5015
The helptext implies that this is on by default.
This may be true on some distros (Fedora/RHEL have it enabled
in /etc/sysctl.conf), but the kernel defaults to it off.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add more comments to describe our version of tcp_slow_start().
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We presently use lock_sock() to acquire a lock on a socket in
hci_sock_dev_event(), but this goes BUG because lock_sock()
can sleep and we're already holding a read-write spinlock at
that point. So, we must use the non-sleeping BH version,
bh_lock_sock().
However, hci_sock_dev_event() is called from user context and
hence using simply bh_lock_sock() will deadlock against a
concurrent softirq that tries to acquire a lock on the same
socket. Hence, disabling BH's before acquiring the socket lock
and enable them afterwards, is the proper solution to fix
socket locking in hci_sock_dev_event().
Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
After initializing dev->_xmit_lock register_netdevice()
sets lockdep class according to dev->type.
Idea of this patch - by David Miller.
Reported & tested by: "Yuriy N. Shkandybin" <jura@netams.com>
Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function ipxrtr_route_packet() takes a 'len' argument of type
size_t. However, its prototype in af_ipx.c incorrectly suggests that the
corresponding argument is of type 'int' instead.
Discovered by building with --combine and letting the compiler see it
all at once.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Steven French <sfrench@us.ibm.com>
Cc: Michael Halcrow <mhalcrow@us.ibm.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Anton Altaparmakov <aia21@cantab.net>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@ucw.cz>
Cc: David Chinner <dgc@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- net/sunrpc/xprtsock.c:1635:5: warning: symbol 'init_socket_xprt' was not
declared. Should it be static?
- net/sunrpc/xprtsock.c:1649:6: warning: symbol 'cleanup_socket_xprt' was
not declared. Should it be static?
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
rpciod_running is not used at all, but due to the way DECLARE_MUTEX_LOCKED
works we don't get a warning for it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This displays the statistics specified in the updated IP-MIB RFC
(RFC4293) in /proc/net/netstat. The reason why these are not displayed
in /proc/net/snmp is that some existing utilities are developed under
the assumption which ipstat items in /proc/net/snmp is unchanged.
Signed-off-by: Mitsuru Chinen <mitch@linux.vnet.ibm.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reverse the sense of the promiscuous-mode tests in ip6_mc_input().
Signed-off-by: Corey Mutter <crm-netdev@mutternet.com>
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes an out-of-boundary condition when the classified
band equals q->bands. Caught by Alexey
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Multi-page allocations are always likely to fail. Since such failures
are expected and non-critical in xfrm_hash_alloc, we shouldn't warn about
them.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function xfrm_policy_byid takes a dir argument but finds the policy
using the index instead. We only use the dir argument to update the
policy count for that direction. Since the user can supply any value
for dir, this can corrupt our policy count.
I know this is the problem because a few days ago I was deleting
policies by hand using indicies and accidentally typed in the wrong
direction. It still deleted the policy and at the time I thought
that was cool. In retrospect it isn't such a good idea :)
I decided against letting it delete the policy anyway just in case
we ever remove the connection between indicies and direction.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'upstream-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jikos/hid:
USB HID: hiddev - fix race between hiddev_send_event() and hiddev_release()
HID: add hooks for getkeycode() and setkeycode() methods
HID: switch to using input_dev->dev.parent
USB HID: Logitech wheel 0x046d/0xc294 needs HID_QUIRK_NOGET quirk
USB HID: usb_buffer_free() cleanup
USB HID: report descriptor of Cypress USB barcode readers needs fixup
Bluetooth HID: HIDP - don't initialize force feedback
USB HID: update CONFIG_USB_HIDINPUT_POWERBOOK description
HID: add input mappings for non-working keys on Logitech S510 remote
iptables matches and targets expect packets to have at least a full
IP header and a valid header length. Ignore packets sent through
raw sockets for which this isn't true as in the other tables.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some helpers (eg. ftp) assume that private area in conntrack is
filled with zero. It should be cleared when helper is changed.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch
- Clears private area for helper even if no helper is assigned to
conntrack. It might be used by old helper.
- Unchanges if the same helper as the used one is specified.
- Does not find helper if no helper is specified. And it does not
require private area for helper in that case.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
nf_nat_rule_find, alloc_null_binding and alloc_null_binding_confirmed
do not use the argument 'info', which is actually ct->nat.info.
If they are necessary to access it again, we can use the argument 'ct'
instead.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
- move arp_tables initial table structure definitions to arp_tables.h
similar to ip_tables and ip6_tables
- use C99 initializers
- use initializer macros where possible
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we relinquish queue_lock in qdisc_restart and then retake it for
requeueing, we might race against dev_deactivate and end up requeueing
onto noop_qdisc. This causes a warning to be printed.
This patch fixes this by checking this before we requeue. As an added
bonus, we can remove the same check in __qdisc_run which was added to
prevent dev->gso_skb from being requeued when we're shutting down.
Even though we've had to add a new conditional in its place, it's better
because it only happens on requeues rather than every single time that
qdisc_run is called.
For this to work we also need to move the clearing of gso_skb up in
dev_deactivate as now qdisc_restart can occur even after we wait for
__LINK_STATE_QDISC_RUNNING to clear (but it won't do anything as long
as the queue and gso_skb is already clear).
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that we return the queue length after NETDEV_TX_OK we better
make sure that we have the right queue. Otherwise we can cause a
stall after a really quick dev_deactive/dev_activate.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current return value scheme and associated comment was invented
back in the 20th century when we still had that tbusy flag. Things
have changed quite a bit since then (even Tony Blair is moving on
now, not to mention the new French president).
All we need to indicate now is whether the caller should continue
processing the queue. Therefore it's sufficient if we return 0 if
we want to stop and non-zero otherwise.
This is based on a patch by Krishna Kumar.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
When transmit fails with NETDEV_TX_LOCKED the skb is requeued
to dev->qdisc again. The dev->qdisc pointer is protected by
the queue lock which needs to be dropped when attempting to
transmit and acquired again before requeing. The problem is
that qdisc_restart() fetches the dev->qdisc pointer once and
stores it in the `q' variable which is invalidated when
dropping the queue_lock, therefore the variable needs to be
refreshed before requeueing.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
__udp_lib_port_inuse() cannot make direct references to
inet_sk(sk)->rcv_saddr as that is ipv4 specific state and
this code is used by ipv6 too.
Use an operations vector to solve this, and this also paves
the way for ipv6 support for non-wild saddr hashing in UDP.
Signed-off-by: David S. Miller <davem@davemloft.net>
I think this is less critical, but is also suitable for -stable
release.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because skb->dst is assigned in ip6_route_input(), it is really
bad to use it in hop-by-hop option handler(s).
Closes: Bug #8450 (Eric Sesterhenn <snakebyte@gmx.de>)
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When an IPv6 router is forwarding a packet with a link-local scope source
address off-link, RFC 4007 requires it to send an ICMPv6 destination
unreachable with code 2 ("not neighbor"), but Linux doesn't. Fix below.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The socket API draft is unclear about whether to include the
chunk header or not. Recent discussion on the sctp implementors
mailing list clarified that the chunk header shouldn't be included,
but the error parameter header still needs to be there.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I broke the non-wildcard case recently. This is to fixes it.
Now, explictitly bound addresses can ge retrieved using the API.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SCTP was checking for NULL when trying to detect hmac
allocation failure where it should have been using IS_ERR.
Also, print a rate limited warning to the log telling the
user what happend.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Urgent events may be delayed if we already have a non-urgent event
queued for that device. This patch changes this by making sure that
an urgent event is always looked at immediately.
I've replaced the LW_RUNNING flag by LW_URGENT since whether work
is scheduled is already kept track by the work queue system.
The only complication is that we have to provide some exclusion for
the setting linkwatch_nextevent which is available in the actual
work function.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the jiffies wrap around or when the system boots up for the first
time, down events can be delayed indefinitely since we no longer
update linkwatch_nextevent when only urgent events are processed.
This patch fixes this by setting linkwatch_nextevent when a
wrap-around occurs.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Optimize teql_enqueue so that it first checks limits before enqueing.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| CC net/mac80211/ieee80211_sta.o
| In file included from linux/net/mac80211/ieee80211_sta.c:31:
| include2/asm/delay.h: In function '__const_udelay':
| include2/asm/delay.h:33: error: 'loops_per_jiffy' undeclared (first use in this function)
| include2/asm/delay.h:33: error: (Each undeclared identifier is reported only once
| include2/asm/delay.h:33: error: for each function it appears in.)
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently all link carrier events are delayed by up to a second
before they're processed to prevent link storms. This causes
unnecessary packet loss during that interval.
In fact, we can achieve the same effect in preventing storms by
only delaying down events and unnecssary up events. The latter
is defined as up events when we're already up.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
These days the link watch mechanism is an integral part of the
network subsystem as it manages the carrier status. So it now
makes sense to allocate some memory for it in net_device rather
than allocating it on demand.
In fact, this is necessary because we can't tolerate a memory
allocation failure since that means we'd have to potentially
throw a link up event away.
It also simplifies the code greatly.
In doing so I discovered a subtle race condition in the use
of singleevent. This race condition still exists (and is
somewhat magnified) without singleevent but it's now plugged
thanks to an smp_mb__before_clear_bit.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for struct class_device -> struct device input core
conversion, switch to using input_dev->dev.parent when specifying
device position in sysfs tree.
Also, do not access input_dev->private directly, use helpers and
do not use kfree() on input device, use input_free_device() instead.
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
[S390] update default configuration.
[S390] Kconfig: no wireless on s390.
[S390] Kconfig: use common Kconfig files for s390.
[S390] Kconfig: common config options for s390.
[S390] Kconfig: unwanted menus for s390.
[S390] Kconfig: menus with depends on HAS_IOMEM.
[S390] Kconfig: refine depends statements.
[S390] Avoid compile warning.
[S390] qdio: re-add lost perf_stats.tl_runs change in qdio_handle_pci
[S390] Avoid sparse warnings.
[S390] dasd: Fix modular build.
[S390] monreader inlining cleanup.
[S390] cio: Make some structures and a function static.
[S390] cio: Get rid of _ccw_device_get_device_number().
[S390] fix subsystem removal fallout
Disable some more menus in the configuration files that are of no
interest to a s390 machine.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
While the comment says:
* To prevent rpciod from hanging, this allocator never sleeps,
* returning NULL if the request cannot be serviced immediately.
The function does not actually check for NULL pointers being returned.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Use a cleaner method to find the size of an rpc_buffer. This actually
works on x86-64!
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits)
sound: convert "sound" subdirectory to UTF-8
MAINTAINERS: Add cxacru website/mailing list
include files: convert "include" subdirectory to UTF-8
general: convert "kernel" subdirectory to UTF-8
documentation: convert the Documentation directory to UTF-8
Convert the toplevel files CREDITS and MAINTAINERS to UTF-8.
remove broken URLs from net drivers' output
Magic number prefix consistency change to Documentation/magic-number.txt
trivial: s/i_sem /i_mutex/
fix file specification in comments
drivers/base/platform.c: fix small typo in doc
misc doc and kconfig typos
Remove obsolete fat_cvf help text
Fix occurrences of "the the "
Fix minor typoes in kernel/module.c
Kconfig: Remove reference to external mqueue library
Kconfig: A couple of grammatical fixes in arch/i386/Kconfig
Correct comments in genrtc.c to refer to correct /proc file.
Fix more "deprecated" spellos.
Fix "deprecated" typoes.
...
Fix trivial comment conflict in kernel/relay.c.
Since nonboot CPUs are now disabled after tasks and devices have been
frozen and the CPU hotplug infrastructure is used for this purpose, we need
special CPU hotplug notifications that will help the CPU-hotplug-aware
subsystems distinguish normal CPU hotplug events from CPU hotplug events
related to a system-wide suspend or resume operation in progress. This
patch introduces such notifications and causes them to be used during
suspend and resume transitions. It also changes all of the
CPU-hotplug-aware subsystems to take these notifications into consideration
(for now they are handled in the same way as the corresponding "normal"
ones).
[oleg@tv-sign.ru: cleanups]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This while loop has an overly complex condition, which performs a couple of
assignments. This hurts readability.
We don't really need a loop at all. We can just return -EAGAIN and (providing
we set SK_DATA), the function will be called again.
So discard the loop, make the complex conditional become a few clear function
calls, and hopefully improve readability.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If I send a RPC_GSS_PROC_DESTROY message to NFSv4 server, it will reply with a
bad rpc reply which lacks an authentication verifier. Maybe this patch is
needed.
Send/recv packets as following:
send:
RemoteProcedureCall
xid
rpcvers = 2
prog = 100003
vers = 4
proc = 0
cred = AUTH_GSS
version = 1
gss_proc = 3 (RPCSEC_GSS_DESTROY)
service = 1 (RPC_GSS_SVC_NONE)
verf = AUTH_GSS
checksum
reply:
RemoteProcedureReply
xid
msg_type
reply_stat
accepted_reply
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I have been investigating a module reference count leak on the server for
rpcsec_gss_krb5.ko. It turns out the problem is a reference count leak for
the security context in net/sunrpc/auth_gss/svcauth_gss.c.
The problem is that gss_write_init_verf() calls gss_svc_searchbyctx() which
does a rsc_lookup() but never releases the reference to the context. There is
another issue that rpc.svcgssd sets an "end of time" expiration for the
context
By adding a cache_put() call in gss_svc_searchbyctx(), and setting an
expiration timeout in the downcall, cache_clean() does clean up the context
and the module reference count now goes to zero after unmount.
I also verified that if the context expires and then the client makes a new
request, a new context is established.
Here is the patch to fix the kernel, I will start a separate thread to discuss
what expiration time should be set by rpc.svcgssd.
Acked-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's not necessarily correct to assume that the xdr_buf used to hold the
server's reply must have page data whenever it has tail data.
And there's no need for us to deal with that case separately anyway.
Acked-by: "J. Bruce Fields" <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
register_rpc_pipefs() needs to clean up rpc_inode_cache
by kmem_cache_destroy() on register_filesystem() failure.
init_sunrpc() needs to unregister rpc_pipe_fs by unregister_rpc_pipefs()
when rpc_init_mempool() returns error.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the kernel calls svc_reserve to downsize the expected size of an RPC
reply, it fails to account for the possibility of a checksum at the end of
the packet. If a client mounts a NFSv2/3 with sec=krb5i/p, and does I/O
then you'll generally see messages similar to this in the server's ring
buffer:
RPC request reserved 164 but used 208
While I was never able to verify it, I suspect that this problem is also
the root cause of some oopses I've seen under these conditions:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=227726
This is probably also a problem for other sec= types and for NFSv4. The
large reserved size for NFSv4 compound packets seems to generally paper
over the problem, however.
This patch adds a wrapper for svc_reserve that accounts for the possibility
of a checksum. It also fixes up the appropriate callers of svc_reserve to
call the wrapper. For now, it just uses a hardcoded value that I
determined via testing. That value may need to be revised upward as things
change, or we may want to eventually add a new auth_op that attempts to
calculate this somehow.
Unfortunately, there doesn't seem to be a good way to reliably determine
the expected checksum length prior to actually calculating it, particularly
with schemes like spkm3.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Acked-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that sk_defer_lock protects two different things, make the name more
generic.
Also don't bother with disabling _bh as the lock is only ever taken from
process context.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
flush_work(wq, work) doesn't need the first parameter, we can use cwq->wq
(this was possible from the very beginnig, I missed this). So we can unify
flush_work_keventd and flush_work.
Also, rename flush_work() to cancel_work_sync() and fix all callers.
Perhaps this is not the best name, but "flush_work" is really bad.
(akpm: this is why the earlier patches bypassed maintainers)
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Auke Kok <auke-jan.h.kok@intel.com>,
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
net/ipv4/ipvs/ip_vs_core.c
module_exit
ip_vs_cleanup
ip_vs_control_cleanup
cancel_rearming_delayed_work
// done
This is unsafe. The module may be unloaded and the memory may be freed
while defense_work's handler is still running/preempted.
Do flush_work(&defense_work.work) after cancel_rearming_delayed_work().
Alternatively, we could add flush_work() to cancel_rearming_delayed_work(),
but note that we can't change cancel_delayed_work() in the same manner
because it may be called from atomic context.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current implementation of force feedback for HID devices is
USB-transport only and therefore calling hid_ff_init() from hidp code is
not going to work (plus it creates unwanted dependency of hidp on usbhid).
Remove the hid_ff_init() until either the hid-ff is made
transport-independent, or at least support for bluetooth transport is
added.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>