Conntrack creation through ctnetlink has two races:
- the timer may expire and free the conntrack concurrently, causing an
invalid memory access when attempting to put it in the hash tables
- an identical conntrack entry may be created in the packet processing
path in the time between the lookup and hash insertion
Hold the conntrack lock between the lookup and insertion to avoid this.
Reported-by: Zoltan Borbely <bozo@andrews.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The svc_addsock function adds transport instances without taking a
reference on the sunrpc.ko module, however, the generic transport
destruction code drops a reference when a transport instance
is destroyed.
Add a try_module_get call to the svc_addsock function for transport
instances added by this function.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Tested-by: Jeff Moyer <jmoyer@redhat.com>
If the slub allocator is used, kmem_cache_create() may merge two or more
kmem_cache's into one but the cache name pointer is not updated and
kmem_cache_name() is no longer guaranteed to return the pointer passed
to the former function. This patch stores the kmalloc'ed pointers in the
corresponding request_sock_ops and timewait_sock_ops structures.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=12014
Since most (if not all) implementations of TSO and even the in-kernel
software GSO do not update the urgent pointer when splitting a large
segment, it is necessary to turn off TSO/GSO for all outgoing traffic
with the URG pointer set.
Looking at tcp_current_mss (and the preceding comment) I even think
this was the original intention. However, this approach is insufficient,
because TSO/GSO is turned off only for newly created frames, not for
frames which were already pending at the arrival of a message with
MSG_OOB set. These frames were created when TSO/GSO was enabled,
so they may be large, and they will have the urgent pointer set
in tcp_transmit_skb().
With this patch, such large packets will be fragmented again before
going to the transmit routine.
As a side note, at least the following NICs are known to screw up
the urgent pointer in the TCP header when doing TSO:
Intel 82566MM (PCI ID 8086:1049)
Intel 82566DC (PCI ID 8086:104b)
Intel 82541GI (PCI ID 8086:1076)
Broadcom NetXtreme II BCM5708 (PCI ID 14e4:164c)
Signed-off-by: Petr Tesarik <ptesarik@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a regression reported by Max Kellermann whereby kernel profiling
showed that his clients were spending 45% of their time in
rpcauth_lookup_credcache.
It turns out that although his processes had identical uid/gid/groups,
generic_match() was failing to detect this, because the task->group_info
pointers were not shared. This again lead to the creation of a huge number
of identical credentials at the RPC layer.
The regression is fixed by comparing the contents of task->group_info
if the actual pointers are not identical.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (23 commits)
net: fix tiny output corruption of /proc/net/snmp6
atl2: don't request irq on resume if netif running
ipv6: use seq_release_private for ip6mr.c /proc entries
pkt_sched: fix missing check for packet overrun in qdisc_dump_stab()
smc911x: Fix printf format typo in smc911x driver.
asix: Fix asix-based cards connecting to 10/100Mbs LAN.
mv643xx_eth: fix recycle check bound
mv643xx_eth: fix the order of mdiobus_{unregister, free}() calls
sh: sh_eth: Update to change of mii_bus
TPROXY: supply a struct flowi->flags argument in inet_sk_rebuild_header()
TPROXY: fill struct flowi->flags in udp_sendmsg()
net: ipg.c fix bracing on endian swapping
phylib: Fix auto-negotiation restart avoidance
net: jme.c rxdesc.flags is __le16, other missing endian swaps
phylib: fix phy name example in documentation
net: Do not fire linkwatch events until the device is registered.
phonet: fix compilation with gcc-3.4
ixgbe: fix compilation with gcc-3.4
pktgen: fix multiple queue warning
net: fix ip_mr_init() error path
...
Because "name" is static, it can be occasionally be filled with
somewhat garbage if two processes read /proc/net/snmp6.
Also, remove useless casts and "-1" -- snprintf() correctly terminates it's
output.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In ip6mr.c, /proc entries /proc/net/ip6_mr_cache and /proc/net/ip6_mr_vif
are opened with seq_open_private(), thus seq_release_private() should be
used to release them.
Should fix a small memory leak.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
nla_nest_start() might return NULL, causing a NULL pointer dereference.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
inet_sk_rebuild_header() does a new route lookup if the dst_entry
associated with a socket becomes stale. However inet_sk_rebuild_header()
didn't use struct flowi->flags, causing the route lookup to
fail for foreign-bound IP_TRANSPARENT sockets, causing an error
state to be set for the sockets in question.
Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
udp_sendmsg() didn't fill struct flowi->flags, which means that
the route lookup would fail for non-local IPs even if the
IP_TRANSPARENT sockopt was set.
This prevents sendto() to work properly for UDP sockets, whereas
bind(foreign-ip) + connect() + send() worked fine.
Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce a new accept4() system call. The addition of this system call
matches analogous changes in 2.6.27 (dup3(), evenfd2(), signalfd4(),
inotify_init1(), epoll_create1(), pipe2()) which added new system calls
that differed from analogous traditional system calls in adding a flags
argument that can be used to access additional functionality.
The accept4() system call is exactly the same as accept(), except that
it adds a flags bit-mask argument. Two flags are initially implemented.
(Most of the new system calls in 2.6.27 also had both of these flags.)
SOCK_CLOEXEC causes the close-on-exec (FD_CLOEXEC) flag to be enabled
for the new file descriptor returned by accept4(). This is a useful
security feature to avoid leaking information in a multithreaded
program where one thread is doing an accept() at the same time as
another thread is doing a fork() plus exec(). More details here:
http://udrepper.livejournal.com/20407.html "Secure File Descriptor Handling",
Ulrich Drepper).
The other flag is SOCK_NONBLOCK, which causes the O_NONBLOCK flag
to be enabled on the new open file description created by accept4().
(This flag is merely a convenience, saving the use of additional calls
fcntl(F_GETFL) and fcntl (F_SETFL) to achieve the same result.
Here's a test program. Works on x86-32. Should work on x86-64, but
I (mtk) don't have a system to hand to test with.
It tests accept4() with each of the four possible combinations of
SOCK_CLOEXEC and SOCK_NONBLOCK set/clear in 'flags', and verifies
that the appropriate flags are set on the file descriptor/open file
description returned by accept4().
I tested Ulrich's patch in this thread by applying against 2.6.28-rc2,
and it passes according to my test program.
/* test_accept4.c
Copyright (C) 2008, Linux Foundation, written by Michael Kerrisk
<mtk.manpages@gmail.com>
Licensed under the GNU GPLv2 or later.
*/
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#define PORT_NUM 33333
#define die(msg) do { perror(msg); exit(EXIT_FAILURE); } while (0)
/**********************************************************************/
/* The following is what we need until glibc gets a wrapper for
accept4() */
/* Flags for socket(), socketpair(), accept4() */
#ifndef SOCK_CLOEXEC
#define SOCK_CLOEXEC O_CLOEXEC
#endif
#ifndef SOCK_NONBLOCK
#define SOCK_NONBLOCK O_NONBLOCK
#endif
#ifdef __x86_64__
#define SYS_accept4 288
#elif __i386__
#define USE_SOCKETCALL 1
#define SYS_ACCEPT4 18
#else
#error "Sorry -- don't know the syscall # on this architecture"
#endif
static int
accept4(int fd, struct sockaddr *sockaddr, socklen_t *addrlen, int flags)
{
printf("Calling accept4(): flags = %x", flags);
if (flags != 0) {
printf(" (");
if (flags & SOCK_CLOEXEC)
printf("SOCK_CLOEXEC");
if ((flags & SOCK_CLOEXEC) && (flags & SOCK_NONBLOCK))
printf(" ");
if (flags & SOCK_NONBLOCK)
printf("SOCK_NONBLOCK");
printf(")");
}
printf("\n");
#if USE_SOCKETCALL
long args[6];
args[0] = fd;
args[1] = (long) sockaddr;
args[2] = (long) addrlen;
args[3] = flags;
return syscall(SYS_socketcall, SYS_ACCEPT4, args);
#else
return syscall(SYS_accept4, fd, sockaddr, addrlen, flags);
#endif
}
/**********************************************************************/
static int
do_test(int lfd, struct sockaddr_in *conn_addr,
int closeonexec_flag, int nonblock_flag)
{
int connfd, acceptfd;
int fdf, flf, fdf_pass, flf_pass;
struct sockaddr_in claddr;
socklen_t addrlen;
printf("=======================================\n");
connfd = socket(AF_INET, SOCK_STREAM, 0);
if (connfd == -1)
die("socket");
if (connect(connfd, (struct sockaddr *) conn_addr,
sizeof(struct sockaddr_in)) == -1)
die("connect");
addrlen = sizeof(struct sockaddr_in);
acceptfd = accept4(lfd, (struct sockaddr *) &claddr, &addrlen,
closeonexec_flag | nonblock_flag);
if (acceptfd == -1) {
perror("accept4()");
close(connfd);
return 0;
}
fdf = fcntl(acceptfd, F_GETFD);
if (fdf == -1)
die("fcntl:F_GETFD");
fdf_pass = ((fdf & FD_CLOEXEC) != 0) ==
((closeonexec_flag & SOCK_CLOEXEC) != 0);
printf("Close-on-exec flag is %sset (%s); ",
(fdf & FD_CLOEXEC) ? "" : "not ",
fdf_pass ? "OK" : "failed");
flf = fcntl(acceptfd, F_GETFL);
if (flf == -1)
die("fcntl:F_GETFD");
flf_pass = ((flf & O_NONBLOCK) != 0) ==
((nonblock_flag & SOCK_NONBLOCK) !=0);
printf("nonblock flag is %sset (%s)\n",
(flf & O_NONBLOCK) ? "" : "not ",
flf_pass ? "OK" : "failed");
close(acceptfd);
close(connfd);
printf("Test result: %s\n", (fdf_pass && flf_pass) ? "PASS" : "FAIL");
return fdf_pass && flf_pass;
}
static int
create_listening_socket(int port_num)
{
struct sockaddr_in svaddr;
int lfd;
int optval;
memset(&svaddr, 0, sizeof(struct sockaddr_in));
svaddr.sin_family = AF_INET;
svaddr.sin_addr.s_addr = htonl(INADDR_ANY);
svaddr.sin_port = htons(port_num);
lfd = socket(AF_INET, SOCK_STREAM, 0);
if (lfd == -1)
die("socket");
optval = 1;
if (setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &optval,
sizeof(optval)) == -1)
die("setsockopt");
if (bind(lfd, (struct sockaddr *) &svaddr,
sizeof(struct sockaddr_in)) == -1)
die("bind");
if (listen(lfd, 5) == -1)
die("listen");
return lfd;
}
int
main(int argc, char *argv[])
{
struct sockaddr_in conn_addr;
int lfd;
int port_num;
int passed;
passed = 1;
port_num = (argc > 1) ? atoi(argv[1]) : PORT_NUM;
memset(&conn_addr, 0, sizeof(struct sockaddr_in));
conn_addr.sin_family = AF_INET;
conn_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
conn_addr.sin_port = htons(port_num);
lfd = create_listening_socket(port_num);
if (!do_test(lfd, &conn_addr, 0, 0))
passed = 0;
if (!do_test(lfd, &conn_addr, SOCK_CLOEXEC, 0))
passed = 0;
if (!do_test(lfd, &conn_addr, 0, SOCK_NONBLOCK))
passed = 0;
if (!do_test(lfd, &conn_addr, SOCK_CLOEXEC, SOCK_NONBLOCK))
passed = 0;
close(lfd);
exit(passed ? EXIT_SUCCESS : EXIT_FAILURE);
}
[mtk.manpages@gmail.com: rewrote changelog, updated test program]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Tested-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: <linux-api@vger.kernel.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Several device drivers try to do things like netif_carrier_off()
before register_netdev() is invoked. This is bogus, but too many
drivers do this to fix them all up in one go.
Reported-by: Folkert van Heusden <folkert@vanheusden.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CC [M] net/phonet/af_phonet.o
net/phonet/af_phonet.c: In function `pn_socket_create':
net/phonet/af_phonet.c:38: sorry, unimplemented: inlining failed in call to 'phonet_proto_put': function body not available
net/phonet/af_phonet.c:99: sorry, unimplemented: called from here
make[3]: *** [net/phonet/af_phonet.o] Error 1
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As number of TX queues in unrelated to number of CPU's we remove this test
and just make sure nxtq never gets exceeded.
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similarly to IPv6 ip6_mr_init() (fixed last week), the order of cleanup
operations in the error/exit section of ip_mr_init() is completely
inversed. It should be the other way around.
Also a del_timer() is missing in the error path.
I should have guessed last week that this same error existed in ipmr.c
too, as ip6mr.c is largely inspired by ipmr.c.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before ieee80211_notify_mac() was added, it was presented with the
use case of using it to tell mac80211 that the association may
have been lost because the firmware crashed/reset.
Since then, it has also been used by iwlwifi to (slightly) speed
up re-association after resume, a workaround around the fact that
mac80211 has no suspend/resume handling yet. It is also not used
by any other drivers, so clearly it cannot be necessary for "good
enough" suspend/resume.
Unfortunately, the callback suffers from a severe problem: It only
works for station mode. If suspend/resume happens while in IBSS or
any other mode (but station), then the callback is pointless.
Recently, it has created a number of locking issues, first because
it required rtnl locking rather than RCU due to calling sleeping
functions within the critical section, and now because it's called
by iwlwifi from the mac80211 workqueue that may not use the rtnl
because it is flushed under rtnl.
(cf. http://bugzilla.kernel.org/show_bug.cgi?id=12046)
I think, therefore, that we should take a step back, remove it
entirely for now and add the small feature it provided properly.
For suspend and resume we will need to introduce new hooks, and for
the case where the firmware was reset the driver will probably
simply just pretend it has done a suspend/resume cycle to get
mac80211 to reprogram the hardware completely, not just try to
connect to the current AP again in station mode. When doing so, we
will need to take into account locking issues and possibly defer
to schedule_work from within mac80211 for the resume operation,
while the suspend operation must be done directly.
Proper suspend/resume should also not necessarily try to reconnect
to the current AP, the time spent in suspend may have been short
enough to not be disconnected from the AP, mac80211 will detect
that the AP went out of range quickly if it did, and if the
association is lost then the AP will disassoc as soon as a data
frame is sent. We might also take into account WWOL then, and
have mac80211 program the hardware into such a mode where it is
available and requested.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Unlike ifconfig, iproute doesn't report an error when setting
an interface up fails:
(example: put wireless network mac80211 interface into repeater mode
with iwconfig but do not set a peer MAC address, it should fail with
-ENOLINK)
without patch:
# ip link set wlan0 up ; echo $?
0
#
with patch:
# ip link set wlan0 up ; echo $?
RTNETLINK answers: Link has been severed
2
#
Propagate the return value from dev_change_flags() to fix this.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Tested-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the next page of the scm recursion story (the commit
f8d570a4 net: Fix recursive descent in __scm_destroy()).
In function scm_fp_dup(), the INIT_LIST_HEAD(&fpl->list) of newly
created fpl is done *before* the subsequent memcpy from the old
structure and thus the freshly initialized list is overwritten.
But that's OK, since this initialization is not required at all,
since the fpl->list is list_add-ed at the destruction time in any
case (and is unused in other code), so I propose to drop both
initializations, rather than moving it after the memcpy.
Please, correct me if I miss something significant.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
fix this warning:
net/bluetooth/af_bluetooth.c:60: warning: ‘bt_key_strings’ defined but not used
net/bluetooth/af_bluetooth.c:71: warning: ‘bt_slock_key_strings’ defined but not used
this is a lockdep macro problem in the !LOCKDEP case.
We cannot convert it to an inline because the macro works on multiple types,
but we can mark the parameter used.
[ also clean up a misaligned tab in sock_lock_init_class_and_name() ]
[ also remove #ifdefs from around af_family_clock_key strings - which
were certainly added to get rid of the ugly build warnings. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make 9p's RDMA option depend on INET since it uses Infiniband rdma_*
functions and that code depends on INET. Otherwise 9p can try to
use symbols which don't exist.
ERROR: "rdma_destroy_id" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_connect" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_create_id" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_create_qp" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_resolve_route" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_disconnect" [net/9p/9pnet_rdma.ko] undefined!
ERROR: "rdma_resolve_addr" [net/9p/9pnet_rdma.ko] undefined!
I used an if/endif block so that the menu items would remain
presented together.
Also correct an article adjective.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Failure to pass netns_ok check is SILENT, except some MIB counter is
incremented somewhere.
And adding "netns_ok = 1" (after long head-scratching session) is
usually the last step in making some protocol netns-ready...
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes two bugs:
1. setsockopt() of anything but a Type 2 routing header should return
EINVAL instead of EPERM. Noticed by Shan Wei
(shanwei@cn.fujitsu.com).
2. setsockopt()/sendmsg() of a Type 2 routing header with invalid
length or segments should return EINVAL. These values are statically
fixed in RFC 3775, unlike the variable Type 0 was.
Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ieee80211_notify_mac() function uses ieee80211_sta_req_auth() which
in turn calls ieee80211_set_disassoc() which calls a few functions that
need to be able to sleep, so ieee80211_notify_mac() cannot use RCU
locking for the interface list and must use rtnl locking instead.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In __sock_recv_timestamp() the additional SCM_TIMESTAMP[NS] is used. This
has the same value as SO_TIMESTAMP[NS], so this is a purely cosmetic change.
Signed-off-by: Patrick Ohly <patrick.ohly@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a minor bug in tcp_htcp.c which has been
highlighted by Lachlan Andrew and Lawrence Stewart. Currently, the
time since the last congestion event, which is stored in variable
last_cong, is reset whenever there is a state change into
TCP_CA_Open. This includes transitions of the type
TCP_CA_Open->TCP_CA_Disorder->TCP_CA_Open which are not associated
with backoff of cwnd. The patch changes last_cong to be updated
only on transitions into TCP_CA_Open that occur after experiencing
the congestion-related states TCP_CA_Loss, TCP_CA_Recovery,
TCP_CA_CWR.
Signed-off-by: Doug Leith <doug.leith@nuim.ie>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
dsa: fix master interface allmulti/promisc handling
dsa: fix skb->pkt_type when mac address of slave interface differs
net: fix setting of skb->tail in skb_recycle_check()
net: fix /proc/net/snmp as memory corruptor
mac80211: fix a buffer overrun in station debug code
netfilter: payload_len is be16, add size of struct rather than size of pointer
ipv6: fix ip6_mr_init error path
[4/4] dca: fixup initialization dependency
[3/4] I/OAT: fix async_tx.callback checking
[2/4] I/OAT: fix dma_pin_iovec_pages() error handling
[1/4] I/OAT: fix channel resources free for not allocated channels
ssb: Fix DMA-API compilation for non-PCI systems
SSB: hide empty sub menu
vlan: Fix typos in proc output string
[netdrvr] usb/hso: Cleanup rfkill error handling
sfc: Correct address of gPXE boot configuration in EEPROM
el3_common_init() should be __devinit, not __init
hso: rfkill type should be WWAN
mlx4_en: Start port error flow bug fix
af_key: mark policy as dead before destroying
Before commit b6c40d68ff ("net: only
invoke dev->change_rx_flags when device is UP"), the dsa driver could
sort-of get away with only fiddling with the master interface's
allmulti/promisc counts in ->change_rx_flags() and not touching them
in ->open() or ->stop(). After this commit (note that it was merged
almost simultaneously with the dsa patches, which is why this wasn't
caught initially), the breakage that was already there became more
apparent.
Since it makes no sense to keep the master interface's allmulti or
promisc count pinned for a slave interface that is down, copy the vlan
driver's sync logic (which does exactly what we want) over to dsa to
fix this.
Bug report from Dirk Teurlings <dirk@upexia.nl> and Peter van Valderen
<linux@ddcrew.com>.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Tested-by: Dirk Teurlings <dirk@upexia.nl>
Tested-by: Peter van Valderen <linux@ddcrew.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a dsa slave interface has a mac address that differs from that
of the master interface, eth_type_trans() won't explicitly set
skb->pkt_type back to PACKET_HOST -- we need to do this ourselves
before calling eth_type_trans().
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since skb_reset_tail_pointer() reads skb->data, we need to set
skb->data before calling skb_reset_tail_pointer(). This was causing
spurious skb_over_panic()s from skb_put() being called on a recycled
skb that had its skb->tail set to beyond where it should have been.
Bug report from Peter van Valderen <linux@ddcrew.com>.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
icmpmsg_put() can happily corrupt kernel memory, using a static
table and forgetting to reset an array index in a loop.
Remove the static array since its not safe without proper locking.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/mac80211/debugfs_sta.c
The trailing zero was written to state[4], it's out of bounds.
Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
payload_len is a be16 value, not cpu_endian, also the size of a ponter
to a struct ipv6hdr was being added, not the size of the struct itself.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The order of cleanup operations in the error/exit section of ip6_mr_init()
is completely inversed. It should be the other way around.
Also a del_timer() is missing in the error path.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously I assumed that the receive queues of candidates don't
change during the GC. This is only half true, nothing can be received
from the queues (see comment in unix_gc()), but buffers could be added
through the other half of the socket pair, which may still have file
descriptors referring to it.
This can result in inc_inflight_move_tail() erronously increasing the
"inflight" counter for a unix socket for which dec_inflight() wasn't
previously called. This in turn can trigger the "BUG_ON(total_refs <
inflight_refs)" in a later garbage collection run.
Fix this by only manipulating the "inflight" counter for sockets which
are candidates themselves. Duplicating the file references in
unix_attach_fds() is also needed to prevent a socket becoming a
candidate for GC while the skb that contains it is not yet queued.
Reported-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
xfrm_policy_destroy() will oops if not dead policy is passed to it.
On error path in pfkey_compile_policy() exactly this happens.
Oopsable for CAP_NET_ADMIN owners.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
net: Fix recursive descent in __scm_destroy().
iwl3945: fix deadlock on suspend
iwl3945: do not send scan command if channel count zero
iwl3945: clear scanning bits upon failure
ath5k: correct handling of rx status fields
zd1211rw: Add 2 device IDs
Fix logic error in rfkill_check_duplicity
iwlagn: avoid sleep in softirq context
iwlwifi: clear scanning bits upon failure
Revert "ath5k: honor FIF_BCN_PRBRESP_PROMISC in STA mode"
tcp: Fix recvmsg MSG_PEEK influence of blocking behavior.
netfilter: netns ct: walk netns list under RTNL
ipv6: fix run pending DAD when interface becomes ready
net/9p: fix printk format warnings
net: fix packet socket delivery in rx irq handler
xfrm: Have af-specific init_tempsel() initialize family field of temporary selector
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
net/9p: fix printk format warnings
unsigned fid->fid cannot be negative
9p: rdma: remove duplicated #include
p9: Fix leak of waitqueue in request allocation path
9p: Remove unneeded free of fcall for Flush
9p: Make all client spin locks IRQ safe
9p: rdma: Set trans prior to requesting async connection ops
__scm_destroy() walks the list of file descriptors in the scm_fp_list
pointed to by the scm_cookie argument.
Those, in turn, can close sockets and invoke __scm_destroy() again.
There is nothing which limits how deeply this can occur.
The idea for how to fix this is from Linus. Basically, we do all of
the fput()s at the top level by collecting all of the scm_fp_list
objects hit by an fput(). Inside of the initial __scm_destroy() we
keep running the list until it is empty.
Signed-off-by: David S. Miller <davem@davemloft.net>
__scm_destroy() walks the list of file descriptors in the scm_fp_list
pointed to by the scm_cookie argument.
Those, in turn, can close sockets and invoke __scm_destroy() again.
There is nothing which limits how deeply this can occur.
The idea for how to fix this is from Linus. Basically, we do all of
the fput()s at the top level by collecting all of the scm_fp_list
objects hit by an fput(). Inside of the initial __scm_destroy() we
keep running the list until it is empty.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> I'll have a prod at why the [hso] rfkill stuff isn't working next
Ok, I believe this is due to the addition of rfkill_check_duplicity in
rfkill and the fact that test_bit actually returns a negative value
rather than the postive one expected (which is of course equally true).
So when the second WLAN device (the hso device, with the EEE PC WLAN
being the first) comes along rfkill_check_duplicity returns a negative
value and so rfkill_register returns an error. Patch below fixes this
for me.
Signed-Off-By: Jonathan McDowell <noodles@earth.li>
Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Fix printk format warnings in net/9p.
Built cleanly on 7 arches.
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Removed duplicated #include <rdma/ib_verbs.h> in
net/9p/trans_rdma.c.
Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
If a T or R fcall cannot be allocated, the function returns an error
but neglects to free the wait queue that was successfully allocated.
If it comes through again a second time this wq will be overwritten
with a new allocation and the old allocation will be leaked.
Also, if the client is subsequently closed, the close path will
attempt to clean up these allocations, so set the req fields to
NULL to avoid duplicate free.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
T and R fcall are reused until the client is destroyed. There does
not need to be a special case for Flush
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
The client lock must be IRQ safe. Some of the lock acquisition paths
took regular spin locks.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
The RDMA connection manager is fundamentally asynchronous.
Since the async callback context is the client pointer, the
transport in the client struct needs to be set prior to calling
the first async op.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Vito Caputo noticed that tcp_recvmsg() returns immediately from
partial reads when MSG_PEEK is used. In particular, this means that
SO_RCVLOWAT is not respected.
Simply remove the test. And this matches the behavior of several
other systems, including BSD.
Signed-off-by: David S. Miller <davem@davemloft.net>
With some net devices types, an IPv6 address configured while the
interface was down can stay 'tentative' forever, even after the interface
is set up. In some case, pending IPv6 DADs are not executed when the
device becomes ready.
I observed this while doing some tests with kvm. If I assign an IPv6
address to my interface eth0 (kvm driver rtl8139) when it is still down
then the address is flagged tentative (IFA_F_TENTATIVE). Then, I set
eth0 up, and to my surprise, the address stays 'tentative', no DAD is
executed and the address can't be pinged.
I also observed the same behaviour, without kvm, with virtual interfaces
types macvlan and veth.
Some easy steps to reproduce the issue with macvlan:
1. ip link add link eth0 type macvlan
2. ip -6 addr add 2003::ab32/64 dev macvlan0
3. ip addr show dev macvlan0
...
inet6 2003::ab32/64 scope global tentative
...
4. ip link set macvlan0 up
5. ip addr show dev macvlan0
...
inet6 2003::ab32/64 scope global tentative
...
Address is still tentative
I think there's a bug in net/ipv6/addrconf.c, addrconf_notify():
addrconf_dad_run() is not always run when the interface is flagged IF_READY.
Currently it is only run when receiving NETDEV_CHANGE event. Looks like
some (virtual) devices doesn't send this event when becoming up.
For both NETDEV_UP and NETDEV_CHANGE events, when the interface becomes
ready, run_pending should be set to 1. Patch below.
'run_pending = 1' could be moved below the if/else block but it makes
the code less readable.
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix printk format warnings in net/9p.
Built cleanly on 7 arches.
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:820: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:867: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:932: warning: format '%llx' expects type 'long long unsigned int', but argument 6 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:982: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/9p/client.c:1025: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1227: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 7 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 12 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 8 has type 'u64'
net/9p/client.c:1252: warning: format '%llx' expects type 'long long unsigned int', but argument 13 has type 'u64'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The changes to deliver hardware accelerated VLAN packets to packet
sockets (commit bc1d0411) caused a warning for non-NAPI drivers.
The __vlan_hwaccel_rx() function is called directly from the drivers
RX function, for non-NAPI drivers that means its still in RX IRQ
context:
[ 27.779463] ------------[ cut here ]------------
[ 27.779509] WARNING: at kernel/softirq.c:136 local_bh_enable+0x37/0x81()
...
[ 27.782520] [<c0264755>] netif_nit_deliver+0x5b/0x75
[ 27.782590] [<c02bba83>] __vlan_hwaccel_rx+0x79/0x162
[ 27.782664] [<f8851c1d>] atl1_intr+0x9a9/0xa7c [atl1]
[ 27.782738] [<c0155b17>] handle_IRQ_event+0x23/0x51
[ 27.782808] [<c015692e>] handle_edge_irq+0xc2/0x102
[ 27.782878] [<c0105fd5>] do_IRQ+0x4d/0x64
Split hardware accelerated VLAN reception into two parts to fix this:
- __vlan_hwaccel_rx just stores the VLAN TCI and performs the VLAN
device lookup, then calls netif_receive_skb()/netif_rx()
- vlan_hwaccel_do_receive(), which is invoked by netif_receive_skb()
in softirq context, performs the real reception and delivery to
packet sockets.
Reported-and-tested-by: Ramon Casellas <ramon.casellas@cttc.es>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
While adding MIGRATE support to strongSwan, Andreas Steffen noticed that
the selectors provided in XFRM_MSG_ACQUIRE have their family field
uninitialized (those in MIGRATE do have their family set).
Looking at the code, this is because the af-specific init_tempsel()
(called via afinfo->init_tempsel() in xfrm_init_tempsel()) do not set
the value.
Reported-by: Andreas Steffen <andreas.steffen@strongswan.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
xfrm: Fix xfrm_policy_gc_lock handling.
niu: Use pci_ioremap_bar().
bnx2x: Version Update
bnx2x: Calling netif_carrier_off at the end of the probe
bnx2x: PCI configuration bug on big-endian
bnx2x: Removing the PMF indication when unloading
mv643xx_eth: fix SMI bus access timeouts
net: kconfig cleanup
fs_enet: fix polling
XFRM: copy_to_user_kmaddress() reports local address twice
SMC91x: Fix compilation on some platforms.
udp: Fix the SNMP counter of UDP_MIB_INERRORS
udp: Fix the SNMP counter of UDP_MIB_INDATAGRAMS
drivers/net/smc911x.c: Fix lockdep warning on xmit.
From: Alexey Dobriyan <adobriyan@gmail.com>
Based upon a lockdep trace by Simon Arlott.
xfrm_policy_kill() can be called from both BH and
non-BH contexts, so we have to grab xfrm_policy_gc_lock
with BH disabling.
Signed-off-by: David S. Miller <davem@davemloft.net>
While adding support for MIGRATE/KMADDRESS in strongSwan (as specified
in draft-ebalard-mext-pfkey-enhanced-migrate-00), Andreas Steffen
noticed that XFRMA_KMADDRESS attribute passed to userland contains the
local address twice (remote provides local address instead of remote
one).
This bug in copy_to_user_kmaddress() affects only key managers that use
native XFRM interface (key managers that use PF_KEY are not affected).
For the record, the bug was in the initial changeset I posted which
added support for KMADDRESS (13c1d18931
'xfrm: MIGRATE enhancements (draft-ebalard-mext-pfkey-enhanced-migrate)').
Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
Reported-by: Andreas Steffen <andreas.steffen@strongswan.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
UDP packets received in udpv6_recvmsg() are not only IPv6 UDP packets, but
also have IPv4 UDP packets, so when do the counter of UDP_MIB_INERRORS in
udpv6_recvmsg(), we should check whether the packet is a IPv6 UDP packet
or a IPv4 UDP packet.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If UDP echo is sent to xinetd/echo-dgram, the UDP reply will be received
at the sender. But the SNMP counter of UDP_MIB_INDATAGRAMS will be not
increased, UDP6_MIB_INDATAGRAMS will be increased instead.
Endpoint A Endpoint B
UDP Echo request ----------->
(IPv4, Dst port=7)
<---------- UDP Echo Reply
(IPv4, Src port=7)
This bug is come from this patch cb75994ec3.
It do counter UDP[6]_MIB_INDATAGRAMS until udp[v6]_recvmsg. Because
xinetd used IPv6 socket to receive UDP messages, thus, when received
UDP packet, the UDP6_MIB_INDATAGRAMS will be increased in function
udpv6_recvmsg() even if the packet is a IPv4 UDP packet.
This patch fixed the problem.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits)
af_unix: netns: fix problem of return value
IRDA: remove double inclusion of module.h
udp: multicast packets need to check namespace
net: add documentation for skb recycling
key: fix setkey(8) policy set breakage
bpa10x: free sk_buff with kfree_skb
xfrm: do not leak ESRCH to user space
net: Really remove all of LOOPBACK_TSO code.
netfilter: nf_conntrack_proto_gre: switch to register_pernet_gen_subsys()
netns: add register_pernet_gen_subsys/unregister_pernet_gen_subsys
net: delete excess kernel-doc notation
pppoe: Fix socket leak.
gianfar: Don't reset TBI<->SerDes link if it's already up
gianfar: Fix race in TBI/SerDes configuration
at91_ether: request/free GPIO for PHY interrupt
amd8111e: fix dma_free_coherent context
atl1: fix vlan tag regression
SMC91x: delete unused local variable "lp"
myri10ge: fix stop/go mmio ordering
bonding: fix panic when taking bond interface down before removing module
...
fix problem of return value
net/unix/af_unix.c: unix_net_init()
when error appears, it should return 'error', not always return 0.
Signed-off-by: Jianjun Kong <jianjun@zeuux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current UDP multicast delivery is not namespace aware.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 04a4bb55bc ("net: add
skb_recycle_check() to enable netdriver skb recycling") added a
method for network drivers to recycle skbuffs, but while use of
this mechanism was documented in the commit message, it should
really have been added as a docbook comment as well -- this
patch does that.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As it is, all instances of ->release() for files that have ->fasync()
need to remember to evict file from fasync lists; forgetting that
creates a hole and we actually have a bunch that *does* forget.
So let's keep our lives simple - let __fput() check FASYNC in
file->f_flags and call ->fasync() there if it's been set. And lose that
crap in ->release() instances - leaving it there is still valid, but we
don't have to bother anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I noticed that, under certain conditions, ESRCH can be leaked from the
xfrm layer to user space through sys_connect. In particular, this seems
to happen reliably when the kernel fails to resolve a template either
because the AF_KEY receive buffer being used by racoon is full or
because the SA entry we are trying to use is in XFRM_STATE_EXPIRED
state.
However, since this could be a transient issue it could be argued that
EAGAIN would be more appropriate. Besides this error code is not even
documented in the man page for sys_connect (as of man-pages 3.07).
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
register_pernet_gen_device() can't be used is nf_conntrack_pptp module is
also used (compiled in or loaded).
Right now, proto_gre_net_exit() is called before nf_conntrack_pptp_net_exit().
The former shutdowns and frees GRE piece of netns, however the latter
absolutely needs it to flush keymap. Oops is inevitable.
Switch to shiny new register_pernet_gen_subsys() to get correct ordering in
netns ops list.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
netns ops which are registered with register_pernet_gen_device() are
shutdown strictly before those which are registered with
register_pernet_subsys(). Sometimes this leads to opposite (read: buggy)
shutdown ordering between two modules.
Add register_pernet_gen_subsys()/unregister_pernet_gen_subsys() for modules
which aren't elite enough for entry in struct net, and which can't use
register_pernet_gen_device(). PPTP conntracking module is such one.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
SUNRPC: Fix potential race in put_rpccred()
SUNRPC: Fix rpcauth_prune_expired
NFS: Convert nfs_attr_generation_counter into an atomic_long
SUNRPC: Respond promptly to server TCP resets
Enable netlabel auditing functions only when CONFIG_AUDIT is set
Signed-off-by: Manish Katiyar <mkatiyar@gmail.com>
Signed-off-by: Paul Moore <paul.moore@hp.com>
Fix the compiler warnings below, thanks to Andrew Morton for finding them.
net/netlabel/netlabel_mgmt.c: In function `netlbl_mgmt_listentry':
net/netlabel/netlabel_mgmt.c:268: warning: 'ret_val' might be used
uninitialized in this function
Signed-off-by: Paul Moore <paul.moore@hp.com>
when testing the new pktgen module with multiple queues and ixgbe with:
pgset "flag QUEUE_MAP_CPU"
I found that I was getting errors in dmesg like:
pktgen: WARNING: QUEUE_MAP_CPU disabled because CPU count (8) exceeds number
<4>pktgen: WARNING: of tx queues (8) on eth15
you'll note, 8 really doesn't exceed 8.
This patch seemed to fix the logic errors and also the attempts at
limiting line length in printk (which didn't work anyway)
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have to be careful when we try to unhash the credential in
put_rpccred(), because we're not holding the credcache lock, so the call to
rpcauth_unhash_cred() may fail if someone else has looked the cred up, and
obtained a reference to it.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We need to make sure that we don't remove creds from the cred_unused list
if they are still under the moratorium, or else they will never get
garbage collected.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
If the server sends us an RST error while we're in the TCP_ESTABLISHED
state, then that will not result in a state change, and so the RPC client
ends up hanging forever (see
http://bugzilla.kernel.org/show_bug.cgi?id=11154)
We can intercept the reset by setting up an sk->sk_error_report callback,
which will then allow us to initiate a proper shutdown and retry...
We also make sure that if the send request receives an ECONNRESET, then we
shutdown too...
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Initialise correctly last fields, so tasks can be actually executed.
On some architectures the initial jiffies value is not zero, so later
all rfkill incorrectly decides that rfkill_*.last is in future.
Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
David Miller noticed that commit
33ad798c92 '(tcp: options clean up')
did not move the req->cookie_ts check.
This essentially disabled commit 4dfc281702
'[Syncookies]: Add support for TCP options via timestamps.'.
This restores the original logic.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes a potential error packet loop.
Signed-off-by: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The default for the regulatory compatibility option is wrong;
if you picked the default you ended up with a non-functional wifi
system (at least I did on Fedora 9 with iwl4965).
I don't think even the October 2008 releases of the various distros
has the new userland so clearly the default is wrong, and also
we can't just go about deleting this in 2.6.29...
Change the default to "y" and also adjust the config text a little to
reflect this.
This patch fixes regression #11859
With thanks to Johannes Berg for the diagnostics
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits)
tcp: Restore ordering of TCP options for the sake of inter-operability
net: Fix disjunct computation of netdev features
sctp: Fix to handle SHUTDOWN in SHUTDOWN_RECEIVED state
sctp: Fix to handle SHUTDOWN in SHUTDOWN-PENDING state
sctp: Add check for the TSN field of the SHUTDOWN chunk
sctp: Drop ICMP packet too big message with MTU larger than current PMTU
p54: enable 2.4/5GHz spectrum by eeprom bits.
orinoco: reduce stack usage in firmware download path
ath5k: fix suspend-related oops on rmmod
[netdrvr] fec_mpc52xx: Implement polling, to make netconsole work.
qlge: Fix MSI/legacy single interrupt bug.
smc911x: Make the driver safer on SMP
smc911x: Add IRQ polarity configuration
smc911x: Allow Kconfig dependency on ARM
sis190: add identifier for Atheros AR8021 PHY
8139x: reduce message severity on driver overlap
igb: add IGB_DCA instead of selecting INTEL_IOATDMA
igb: fix tx data corruption with transition to L0s on 82575
ehea: Fix memory hotplug support
netdev: DM9000: remove BLACKFIN hacking in DM9000 netdev driver
...
This is not our bug! Sadly some devices cannot cope with the change
of TCP option ordering which was a result of the recent rewrite of
the option code (not that there was some particular reason steming
from the rewrite for the reordering) though any ordering of TCP
options is perfectly legal. Thus we restore the original ordering
to allow interoperability with/through such broken devices and add
some warning about this trap. Since the reordering just happened
without any particular reason, this change shouldn't cost us
anything.
There are already couple of known failure reports (within close
proximity of the last release), so the problem might be more
wide-spread than a single device. And other reports which may
be due to the same problem though the symptoms were less obvious.
Analysis of one of the case revealed (with very high probability)
that sack capability cannot be negotiated as the first option
(SYN never got a response).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Reported-by: Aldo Maggi <sentiniate@tiscali.it>
Tested-by: Aldo Maggi <sentiniate@tiscali.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'v28-range-hrtimers-for-linus-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits)
hrtimers: add missing docbook comments to struct hrtimer
hrtimers: simplify hrtimer_peek_ahead_timers()
hrtimers: fix docbook comments
DECLARE_PER_CPU needs linux/percpu.h
hrtimers: fix typo
rangetimers: fix the bug reported by Ingo for real
rangetimer: fix BUG_ON reported by Ingo
rangetimer: fix x86 build failure for the !HRTIMERS case
select: fix alpha OSF wrapper
select: fix alpha OSF wrapper
hrtimer: peek at the timer queue just before going idle
hrtimer: make the futex() system call use the per process slack value
hrtimer: make the nanosleep() syscall use the per process slack
hrtimer: fix signed/unsigned bug in slack estimator
hrtimer: show the timer ranges in /proc/timer_list
hrtimer: incorporate feedback from Peter Zijlstra
hrtimer: add a hrtimer_start_range() function
hrtimer: another build fix
hrtimer: fix build bug found by Ingo
hrtimer: make select() and poll() use the hrtimer range feature
...
My change
commit e2a6b85247
net: Enable TSO if supported by at least one device
didn't do what was intended because the netdev_compute_features
function was designed for conjunctions. So what happened was that
it would simply take the TSO status of the last constituent device.
This patch extends it to support both conjunctions and disjunctions
under the new name of netdev_increment_features.
It also adds a new function netdev_fix_features which does the
sanity checking that usually occurs upon registration. This ensures
that the computation doesn't result in an illegal combination
since this checking is absent when the change is initiated via
ethtool.
The two users of netdev_compute_features have been converted.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Once an endpoint has reached the SHUTDOWN-RECEIVED state,
it MUST NOT send a SHUTDOWN in response to a ULP request.
The Cumulative TSN Ack of the received SHUTDOWN chunk
MUST be processed.
This patch fix to process Cumulative TSN Ack of the received
SHUTDOWN chunk in SHUTDOWN_RECEIVED state.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If SHUTDOWN is received in SHUTDOWN-PENDING state, enpoint should enter
the SHUTDOWN-RECEIVED state and check the Cumulative TSN Ack field of
the SHUTDOWN chunk (RFC 4960 Section 9.2). If the SHUTDOWN chunk can
acknowledge all of the send DATA chunks, SHUTDOWN-ACK should be sent.
But now endpoint just silently discarded the SHUTDOWN chunk.
SHUTDOWN received in SHUTDOWN-PENDING state can happend when the last
SACK is lost by network, or the SHUTDOWN chunk can acknowledge all of
the received DATA chunks. The packet sequence(SACK lost) is like this:
Endpoint A Endpoint B ULP
(ESTABLISHED) (ESTABLISHED)
<----------- DATA
<--- shutdown
Enter SHUTDOWN-PENDING state
SACK ----lost---->
SHUTDOWN(*1) ------------>
<----------- SHUTDOWN-ACK
(*1) silently discarded now.
This patch fix to handle SHUTDOWN in SHUTDOWN-PENDING state as the same
as ESTABLISHED state.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If SHUTDOWN chunk is received Cumulative TSN Ack beyond the max tsn currently
send, SHUTDOWN chunk be accepted and the association will be broken. New data
is send, but after received SACK it will be drop because TSN in SACK is less
than the Cumulative TSN, data will be retrans again and again even if correct
SACK is received.
The packet sequence is like this:
Endpoint A Endpoint B ULP
(ESTABLISHED) (ESTABLISHED)
<----------- DATA (TSN=x-1)
<----------- DATA (TSN=x)
SHUTDOWN -----------> (Now Cumulative TSN=x+1000)
(TSN=x+1000)
<----------- DATA (TSN=x+1)
SACK -----------> drop the SACK
(TSN=x+1)
<----------- DATA (TSN=x+1)(retrans)
This patch fix this problem by terminating the association and respond to
the sender with an ABORT.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If ICMP packet too big message is received with MTU larger than current
PMTU, SCTP will still accept this ICMP message and sync the PMTU of assoc
with the wrong MTU.
Endpoing A Endpoint B
(ESTABLISHED) (ESTABLISHED)
ICMP --------->
(packet too big, MTU too larger)
sync PMTU
This patch fixed the problem by drop that ICMP message.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several sparse warnings were introduced by patches accepted during the merge
window which weren't caught. This patch fixes those warnings.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
This patch implements the RDMA transport provider for 9P. It allows
mounts to be performed over iWARP and IB capable network interfaces.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Signed-off-by: Latchesar Ionkov <lionkov@lanl.gov>
Fixes build problem with 9p when building with debug disabled.
Also contains some fixes for warnings which pop up when
CONFIG_NET_9P_DEBUG is disabled.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>