This replaces all instances of lock_kernel in the
IPX code with lock_sock. As far as I can tell, this
is safe to do, because there is no global state
that needs to be locked in IPX, and the code does
not recursively take the lock or sleep indefinitely
while holding it.
Compile-tested only.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: netdev@vger.kernel.org
This changes appletalk to use lock_sock instead of
lock_kernel for serialization. I tried to make sure
that we don't hold the socket lock during sleeping
functions, but I did not try to prove whether the
locks are necessary in the first place.
Compile-tested only.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
This replaces all instances of lock_kernel in x25
with lock_sock, taking care to release the socket
lock around sleeping functions (sock_alloc_send_skb
and skb_recv_datagram). It is not clear whether
this is a correct solution, but it seem to be what
other protocols do in the same situation.
Includes a fix suggested by Eric Dumazet.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David S. Miller <davem@davemloft.net>
Tested-by: Andrew Hendry <andrew.hendry@gmail.com>
Cc: linux-x25@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Eric Dumazet <eric.dumazet@gmail.com>
The only necessary parts are the src/dst addresses, the
interface indexes, the TOS, and the mark.
The rest is unnecessary bloat, which amounts to nearly
50 bytes on 64-bit.
Signed-off-by: David S. Miller <davem@davemloft.net>
rt->rt_iif is only ever inspected on input routes, for example DCCP
uses this to populate a route lookup flow key when generating replies
to another packet.
Therefore, setting it to anything other than zero on output routes
makes no sense.
Signed-off-by: David S. Miller <davem@davemloft.net>
We burn a lot of useless cycles, cpu store buffer traffic, and
memory operations memset()'ing the on-stack flow used to perform
output route lookups in __ip_route_output_key().
Only the first half of the flow object members even matter for
output route lookups in this context, specifically:
FIB rules matching cares about:
dst, src, tos, iif, oif, mark
FIB trie lookup cares about:
dst
FIB semantic match cares about:
tos, scope, oif
Therefore only initialize these specific members and elide the
memset entirely.
On Niagara2 this kills about ~300 cycles from the output route
lookup path.
Likely, we can take things further, since all callers of output
route lookups essentially throw away the on-stack flow they use.
So they don't care if we use it as a scratch-pad to compute the
final flow key.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
David noticed :
------------------
Eric, I was profiling the non-routing-cache case and something that
stuck out is the case of calling inet_getpeer() with create==0.
If an entry is not found, we have to redo the lookup under a spinlock
to make certain that a concurrent writer rebalancing the tree does
not "hide" an existing entry from us.
This makes the case of a create==0 lookup for a not-present entry
really expensive. It is on the order of 600 cpu cycles on my
Niagara2.
I added a hack to not do the relookup under the lock when create==0
and it now costs less than 300 cycles.
This is now a pretty common operation with the way we handle COW'd
metrics, so I think it's definitely worth optimizing.
-----------------
One solution is to use a seqlock instead of a spinlock to protect struct
inet_peer_base.
After a failed avl tree lookup, we can easily detect if a writer did
some changes during our lookup. Taking the lock and redo the lookup is
only necessary in this case.
Note: Add one private rcu_deref_locked() macro to place in one spot the
access to spinlock included in seqlock.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The standby logic used to be pretty dependent on the work requeueing
behavior that changed when we switched to WQ_NON_REENTRANT. It was also
very fragile.
Restructure things so that:
- We clear WRITE_PENDING when we set STANDBY. This ensures we will
requeue work when we wake up later.
- con_work backs off if STANDBY is set. There is nothing to do if we are
in standby.
- clear_standby() helper is called by both con_send() and con_keepalive(),
the two actions that can wake us up again. Move the connect_seq++
logic here.
Signed-off-by: Sage Weil <sage@newdream.net>
With commit f363e45f we replaced a bunch of hacky workqueue mutual
exclusion logic with the WQ_NON_REENTRANT flag. One pieces of fallout is
that the exponential backoff breaks in certain cases:
* con_work attempts to connect.
* we get an immediate failure, and the socket state change handler queues
immediate work.
* con_work calls con_fault, we decide to back off, but can't queue delayed
work.
In this case, we add a BACKOFF bit to make con_work reschedule delayed work
next time it runs (which should be immediately).
Signed-off-by: Sage Weil <sage@newdream.net>
mac80211 does the same afterwards anyway. Hence, just drop
this redundant code.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
DNS: Fix a NULL pointer deref when trying to read an error key [CVE-2011-1076]
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits)
MAINTAINERS: Add Andy Gospodarek as co-maintainer.
r8169: disable ASPM
RxRPC: Fix v1 keys
AF_RXRPC: Handle receiving ACKALL packets
cnic: Fix lost interrupt on bnx2x
cnic: Prevent status block race conditions with hardware
net: dcbnl: check correct ops in dcbnl_ieee_set()
e1000e: disable broken PHY wakeup for ICH10 LOMs, use MAC wakeup instead
igb: fix sparse warning
e1000: fix sparse warning
netfilter: nf_log: avoid oops in (un)bind with invalid nfproto values
dccp: fix oops on Reset after close
ipvs: fix dst_lock locking on dest update
davinci_emac: Add Carrier Link OK check in Davinci RX Handler
bnx2x: update driver version to 1.62.00-6
bnx2x: properly calculate lro_mss
bnx2x: perform statistics "action" before state transition.
bnx2x: properly configure coefficients for MinBW algorithm (NPAR mode).
bnx2x: Fix ethtool -t link test for MF (non-pmf) devices.
bnx2x: Fix nvram test for single port devices.
...
When a DNS resolver key is instantiated with an error indication, attempts to
read that key will result in an oops because user_read() is expecting there to
be a payload - and there isn't one [CVE-2011-1076].
Give the DNS resolver key its own read handler that returns the error cached in
key->type_data.x[0] as an error rather than crashing.
Also make the kenter() at the beginning of dns_resolver_instantiate() limit the
amount of data it prints, since the data is not necessarily NUL-terminated.
The buggy code was added in:
commit 4a2d789267
Author: Wang Lei <wang840925@gmail.com>
Date: Wed Aug 11 09:37:58 2010 +0100
Subject: DNS: If the DNS server returns an error, allow that to be cached [ver #2]
This can trivially be reproduced by any user with the following program
compiled with -lkeyutils:
#include <stdlib.h>
#include <keyutils.h>
#include <err.h>
static char payload[] = "#dnserror=6";
int main()
{
key_serial_t key;
key = add_key("dns_resolver", "a", payload, sizeof(payload),
KEY_SPEC_SESSION_KEYRING);
if (key == -1)
err(1, "add_key");
if (keyctl_read(key, NULL, 0) == -1)
err(1, "read_key");
return 0;
}
What should happen is that keyctl_read() reports error 6 (ENXIO) to the user:
dns-break: read_key: No such device or address
but instead the kernel oopses.
This cannot be reproduced with the 'keyutils add' or 'keyutils padd' commands
as both of those cut the data down below the NUL termination that must be
included in the data. Without this dns_resolver_instantiate() will return
-EINVAL and the key will not be instantiated such that it can be read.
The oops looks like:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffff811b99f7>] user_read+0x4f/0x8f
PGD 3bdf8067 PUD 385b9067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/irq
CPU 0
Modules linked in:
Pid: 2150, comm: dns-break Not tainted 2.6.38-rc7-cachefs+ #468 /DG965RY
RIP: 0010:[<ffffffff811b99f7>] [<ffffffff811b99f7>] user_read+0x4f/0x8f
RSP: 0018:ffff88003bf47f08 EFLAGS: 00010246
RAX: 0000000000000001 RBX: ffff88003b5ea378 RCX: ffffffff81972368
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003b5ea378
RBP: ffff88003bf47f28 R08: ffff88003be56620 R09: 0000000000000000
R10: 0000000000000395 R11: 0000000000000002 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: ffffffffffffffa1
FS: 00007feab5751700(0000) GS:ffff88003e000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 000000003de40000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process dns-break (pid: 2150, threadinfo ffff88003bf46000, task ffff88003be56090)
Stack:
ffff88003b5ea378 ffff88003b5ea3a0 0000000000000000 0000000000000000
ffff88003bf47f68 ffffffff811b708e ffff88003c442bc8 0000000000000000
00000000004005a0 00007fffba368060 0000000000000000 0000000000000000
Call Trace:
[<ffffffff811b708e>] keyctl_read_key+0xac/0xcf
[<ffffffff811b7c07>] sys_keyctl+0x75/0xb6
[<ffffffff81001f7b>] system_call_fastpath+0x16/0x1b
Code: 75 1f 48 83 7b 28 00 75 18 c6 05 58 2b fb 00 01 be bb 00 00 00 48 c7 c7 76 1c 75 81 e8 13 c2 e9 ff 4c 8b b3 e0 00 00 00 4d 85 ed <41> 0f b7 5e 10 74 2d 4d 85 e4 74 28 e8 98 79 ee ff 49 39 dd 48
RIP [<ffffffff811b99f7>] user_read+0x4f/0x8f
RSP <ffff88003bf47f08>
CR2: 0000000000000010
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
cc: Wang Lei <wang840925@gmail.com>
Signed-off-by: James Morris <jmorris@namei.org>
If we mark the connection CLOSED we will give up trying to reconnect to
this server instance. That is appropriate for things like a protocol
version mismatch that won't change until the server is restarted, at which
point we'll get a new addr and reconnect. An authorization failure like
this is probably due to the server not properly rotating it's secret keys,
however, and should be treated as transient so that the normal backoff and
retry behavior kicks in.
Signed-off-by: Sage Weil <sage@newdream.net>
get_user_pages() can return fewer pages than we ask for. We were returning
a bogus pointer/error code in that case. Instead, loop until we get all
the pages we want or get an error we can return to the caller.
Signed-off-by: Sage Weil <sage@newdream.net>
Netlink message processing in the kernel is synchronous these days,
capabilities can be checked directly in security_netlink_recv() from
the current process.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Reviewed-by: James Morris <jmorris@namei.org>
[chrisw: update to include pohmelfs and uvesafb]
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because of various alignements [SLUB / qdisc], we use 512 bytes of
memory for one {p|b}fifo qdisc, instead of 256 bytes on 64bit arches and
192 bytes on 32bit ones.
Move the "u32 limit" inside "struct Qdisc" (no impact on other qdiscs)
Change qdisc_alloc(), first trying a regular allocation before an
oversized one.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netlink message processing in the kernel is synchronous these days, the
session information can be collected when needed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
As reported by Eric:
[11483.697233] IP: [<c12b0638>] dst_release+0x18/0x60
...
[11483.697741] Call Trace:
[11483.697764] [<c12fc9d2>] udp_sendmsg+0x282/0x6e0
[11483.697790] [<c12a1c01>] ? memcpy_toiovec+0x51/0x70
[11483.697818] [<c12dbd90>] ? ip_generic_getfrag+0x0/0xb0
The pointer passed to dst_release() is -EINVAL, that's because
we leave an error pointer in the local variable "rt" by accident.
NULL it out to fix the bug.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The OpenAFS server is now sending ACKALL packets, so we need to handle them.
Otherwise we report a protocol error and abort.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the support for retrieving the remote or peer DCBX
configuration via dcbnl for embedded DCBX stacks supporting the CEE DCBX
standard.
Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These 2 patches add the support for retrieving the remote or peer DCBX
configuration via dcbnl for embedded DCBX stacks. The peer configuration
is part of the DCBX MIB and is useful for debugging and diagnostics of
the overall DCB configuration. The first patch add this support for IEEE
802.1Qaz standard the second patch add the same support for the older
CEE standard. Diff for v2 - the peer-app-info is CEE specific.
Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The incorrect ops routine was being tested for in
DCB_ATTR_IEEE_PFC attributes. This patch corrects
it.
Currently, every driver implementing ieee_setets also
implements ieee_setpfc so this bug is not actualized
yet.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Like many other places, we have to check that the array index is
within allowed limits, or otherwise, a kernel oops and other nastiness
can ensue when we access memory beyond the end of the array.
[ 5954.115381] BUG: unable to handle kernel paging request at 0000004000000000
[ 5954.120014] IP: __find_logger+0x6f/0xa0
[ 5954.123979] nf_log_bind_pf+0x2b/0x70
[ 5954.123979] nfulnl_recv_config+0xc0/0x4a0 [nfnetlink_log]
[ 5954.123979] nfnetlink_rcv_msg+0x12c/0x1b0 [nfnetlink]
...
The problem goes back to v2.6.30-rc1~1372~1342~31 where nf_log_bind
was decoupled from nf_log_register.
Reported-by: Miguel Di Ciurcio Filho <miguel.filho@gmail.com>,
via irc.freenode.net/#netfilter
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This fixes a bug in the order of dccp_rcv_state_process() that still permitted
reception even after closing the socket. A Reset after close thus causes a NULL
pointer dereference by not preventing operations on an already torn-down socket.
dccp_v4_do_rcv()
|
| state other than OPEN
v
dccp_rcv_state_process()
|
| DCCP_PKT_RESET
v
dccp_rcv_reset()
|
v
dccp_time_wait()
WARNING: at net/ipv4/inet_timewait_sock.c:141 __inet_twsk_hashdance+0x48/0x128()
Modules linked in: arc4 ecb carl9170 rt2870sta(C) mac80211 r8712u(C) crc_ccitt ah
[<c0038850>] (unwind_backtrace+0x0/0xec) from [<c0055364>] (warn_slowpath_common)
[<c0055364>] (warn_slowpath_common+0x4c/0x64) from [<c0055398>] (warn_slowpath_n)
[<c0055398>] (warn_slowpath_null+0x1c/0x24) from [<c02b72d0>] (__inet_twsk_hashd)
[<c02b72d0>] (__inet_twsk_hashdance+0x48/0x128) from [<c031caa0>] (dccp_time_wai)
[<c031caa0>] (dccp_time_wait+0x40/0xc8) from [<c031c15c>] (dccp_rcv_state_proces)
[<c031c15c>] (dccp_rcv_state_process+0x120/0x538) from [<c032609c>] (dccp_v4_do_)
[<c032609c>] (dccp_v4_do_rcv+0x11c/0x14c) from [<c0286594>] (release_sock+0xac/0)
[<c0286594>] (release_sock+0xac/0x110) from [<c031fd34>] (dccp_close+0x28c/0x380)
[<c031fd34>] (dccp_close+0x28c/0x380) from [<c02d9a78>] (inet_release+0x64/0x70)
The fix is by testing the socket state first. Receiving a packet in Closed state
now also produces the required "No connection" Reset reply of RFC 4340, 8.3.1.
Reported-and-tested-by: Johan Hovold <jhovold@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch to replace inet->cork with cork left out two spots in
__ip_append_data that can result in bogus packet construction.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
If CONFIG_NET_KEY_MIGRATE is not defined the arguments of
pfkey_migrate stub do not match causing warning.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The route lookup code in icmpv6_send() is slightly tricky as a result of
having to handle all of the requirements of RFC 4301 host relookups.
Pull the route resolution into a seperate function, so that the error
handling and route reference counting is hopefully easier to see and
contained wholly within this new routine.
Signed-off-by: David S. Miller <davem@davemloft.net>
Command pointer was a leftover after moving controller index to
mgmt_hdr.
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
It is now possible to create command complete event without specific
reply data by passing NULL as reply with len 0. Check pointer before
calling memcpy to avoid undefined behaviour.
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
The route lookup code in icmp_send() is slightly tricky as a result of
having to handle all of the requirements of RFC 4301 host relookups.
Pull the route resolution into a seperate function, so that the error
handling and route reference counting is hopefully easier to see and
contained wholly within this new routine.
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix dst_lock usage in __ip_vs_update_dest. We need
_bh locking because destination is updated in user context.
Can cause lockups on frequent destination updates.
Problem reported by Simon Kirby. Bug was introduced
in 2.6.37 from the "ipvs: changes for local real server"
change.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Return a dst pointer which is potentitally error encoded.
Don't pass original dst pointer by reference, pass a struct net
instead of a socket, and elide the flow argument since it is
unnecessary.
Signed-off-by: David S. Miller <davem@davemloft.net>
Route lookups follow a general pattern in the ipv6 code wherein
we first find the non-IPSEC route, potentially override the
flow destination address due to ipv6 options settings, and then
finally make an IPSEC search using either xfrm_lookup() or
__xfrm_lookup().
__xfrm_lookup() is used when we want to generate a blackhole route
if the key manager needs to resolve the IPSEC rules (in this case
-EREMOTE is returned and the original 'dst' is left unchanged).
Otherwise plain xfrm_lookup() is used and when asynchronous IPSEC
resolution is necessary, we simply fail the lookup completely.
All of these cases are encapsulated into two routines,
ip6_dst_lookup_flow and ip6_sk_dst_lookup_flow. The latter of which
handles unconnected UDP datagram sockets.
Signed-off-by: David S. Miller <davem@davemloft.net>
The UDP transmit path has been running under the socket lock
for a long time because of the corking feature. This means that
transmitting to the same socket in multiple threads does not
scale at all.
However, as most users don't actually use corking, the locking
can be removed in the common case.
This patch creates a lockless fast path where corking is not used.
Please note that this does create a slight inaccuracy in the
enforcement of socket send buffer limits. In particular, we
may exceed the socket limit by up to (number of CPUs) * (packet
size) because of the way the limit is computed.
As the primary purpose of socket buffers is to indicate congestion,
this should not be a great problem for now.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch converts UDP to use the new ip_finish_skb API. This
would then allows us to more easily use ip_make_skb which allows
UDP to run without a socket lock.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the helper ip_make_skb which is like ip_append_data
and ip_push_pending_frames all rolled into one, except that it does
not send the skb produced. The sending part is carried out by
ip_send_skb, which the transport protocol can call after it has
tweaked the skb.
It is meant to be called in cases where corking is not used should
have a one-to-one correspondence to sendmsg.
This patch also adds the helper ip_finish_skb which is meant to
be replace ip_push_pending_frames when corking is required.
Previously the protocol stack would peek at the socket write
queue and add its header to the first packet. With ip_finish_skb,
the protocol stack can directly operate on the final skb instead,
just like the non-corking case with ip_make_skb.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to allow simultaneous calls to ip_append_data on the same
socket, it must not modify any shared state in sk or inet (other
than those that are designed to allow that such as atomic counters).
This patch abstracts out write references to sk and inet_sk in
ip_append_data and its friends so that we may use the underlying
code in parallel.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
UFO doesn't really use the sk_sndmsg_* parameters so touching
them is pointless. It can't use them anyway since the whole
point of UFO is to use the original pages without copying.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
... Otherwise it is displayed when mac80211 isn't
even turned on, which is completely pointless.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Also fix a typo in the STATION_INFO_TX_BITRATE description
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Enabling TX timestamps (SO_TIMESTAMPING) for IPv6 UDP packets, in
the same fashion as for IPv4. Necessary in order for NICs such as
Intel 82580 to timestamp IPv6 packets.
Signed-off-by: Anders Berggren <anders@halon.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
netlink_dump() may failed, but nobody handle its error.
It generates output data, when a previous portion has been returned to
user space. This mechanism works when all data isn't go in skb. If we
enter in netlink_recvmsg() and skb is absent in the recv queue, the
netlink_dump() will not been executed. So if netlink_dump() is failed
one time, the new data never appear and the reader will sleep forever.
netlink_dump() is called from two places:
1. from netlink_sendmsg->...->netlink_dump_start().
In this place we can report error directly and it will be returned
by sendmsg().
2. from netlink_recvmsg
There we can't report error directly, because we have a portion of
valid output data and call netlink_dump() for prepare the next portion.
If netlink_dump() is failed, the socket will be mark as error and the
next recvmsg will be failed.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we want something "bool" built-in in something "tristate" it can't
"depend on" the tristate config option.
Report by DaveM:
I give it 'y' just to make it happen, for both, and afterways no
matter how many times I rerun "make oldconfig" I keep seeing things
like this in my build:
scripts/kconfig/conf --silentoldconfig Kconfig
include/config/auto.conf:986:warning: symbol value 'm' invalid for BT_SCO
include/config/auto.conf:3156:warning: symbol value 'm' invalid for BT_L2CAP
Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch fixes the out of sync scenarios while in SYN_RECV state.
Quoting Jozsef, what it happens if we are out of sync if the
following:
> > b. conntrack entry is outdated, new SYN received
> > - (b1) we ignore it but save the initialization data from it
> > - (b2) when the reply SYN/ACK receives and it matches the saved data,
> > we pick up the new connection
This is what it should happen if we are in SYN_RECV state. Initially,
the SYN packet hits b1, thus we save data from it. But the SYN/ACK
packet is considered a retransmission given that we're in SYN_RECV
state. Therefore, we never hit b2 and we don't get in sync. To fix
this, we ignore SYN/ACK if we are in SYN_RECV. If the previous packet
was a SYN, then we enter the ignore case that get us in sync.
This patch helps a lot to conntrackd in stress scenarios (assumming a
client that generates lots of small TCP connections). During the failover,
consider that the new primary has injected one outdated flow in SYN_RECV
state (this is likely to happen if the conntrack event rate is high
because the backup will be a bit delayed from the primary). With the
current code, if the client starts a new fresh connection that matches
the tuple, the SYN packet will be ignored without updating the state
tracking, and the SYN+ACK in reply will blocked as it will not pass
checkings III or IV (since all state tracking in the original direction
is not initialized because of the SYN packet was ignored and the ignore
case that get us in sync is not applied).
I posted a couple of patches before this one. Changli Gao spotted
a simpler way to fix this problem. This patch implements his idea.
Cc: Changli Gao <xiaosuo@gmail.com>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Neil pointed out that we can't send ARP reply on behalf of slaves,
we need to move the arp queue to their bond device.
Signed-off-by: WANG Cong <amwang@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
V4: rebase to net-next-2.6
This patch removes the flag IFF_IN_NETPOLL, we don't need it any more since
we have netpoll_tx_running() now.
Signed-off-by: WANG Cong <amwang@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use ERR_PTR mechanism to return error from hci_connect.
Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Signed-off-by: Anderson Briglia <anderson.briglia@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Crafted (too small) data buffer could result in reading data outside of buffer.
Validate buffer size and return EINVAL if size is wrong.
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Acked-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Most mgmt commands and event are related to hci adapter. Moving index to
common header allow to easily use it in command status while reporting errors.
For those not related to adapter use MGMT_INDEX_NONE (0xFFFF) as index.
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Acked-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
The structure used for command was wrong (probably copy-paste mistake).
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Acked-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This actually pointed out a (seemingly known) bug where we mangle the
pfkey header in a potentially shared SKB, which is fixed here.
Signed-off-by: David S. Miller <davem@davemloft.net>
hci_sock_cleanup is already called after the sock_err label.
It appears that we can drop this call.
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
rtnl_unicast() return value is not of interest, we can silently ignore
it, save some instructions and four byte on the stack.
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
hash is declared and assigned but not used anymore. ipv6_addr_hash()
exhibit no side-effects.
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Declaration and assignment of newdp is removed. Usage of dccp_sk()
exhibit no side effects.
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
addr_type of 0 means that the type should be adopted from from_dev and
not from __hw_addr_del_multiple(). Unfortunately it isn't so and
addr_type will always be considered. Fix this by implementing the
considered and documented behavior.
Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
For devices supported by iwlwifi sometimes
off-channel transmissions need to be handled
by the device completely. To support this
mac80211 needs to pass the frame directly
to the driver and not through the TX path
as the driver needs the frame and channel
information at the same time.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Is still possible to schedule conn_mon_timer after disassociate from
ieee80211_sta_tx_notify() and ieee80211_offchannel_ps_disable().
Move disassociate check to ieee80211_sta_reset_conn_monitor() to cover
all these cases, and add unlikely since in most the time we call
ieee80211_sta_reset_conn_monitor() when associated.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We need to copy this to allow drivers to look
at the information where needed.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This reverts 4a332a38
("mac80211: Give it some time to do the TSF sync").
There's no point in waiting with a new IBSS merge
just because the hardware hasn't merged up with
the old IBSS yet, and since 34e8f082 we no longer
attempt to merge with the IBSS we're already in.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The return value of the tx operation is commonly
misused by drivers, leading to errors. All drivers
will drop frames if they fail to TX the frame, and
they must also properly manage the queues (if they
didn't, mac80211 would already warn).
Removing the ability for drivers to return a BUSY
value also allows significant cleanups of the TX
TX handling code in mac80211.
Note that this also fixes a bug in ath9k_htc, the
old "return -1" there was wrong.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Tested-by: Sedat Dilek <sedat.dilek@googlemail.com> [ath5k]
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com> [rt2x00]
Acked-by: Larry Finger <Larry.Finger@lwfinger.net> [b43, rtl8187, rtlwifi]
Acked-by: Luciano Coelho <coelho@ti.com> [wl12xx]
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* Do not fail if the peer supports more or less than 3 algorithms.
* Ignore unknown congestion control algorithms instead of failing.
* Simplify congestion algorithm negotiation (largest is best).
* Do not use a static buffer.
* Fix off-by-two read overflow.
* Avoid extra memory copy (in addition to skb_copy_bits()).
The previous code really made no sense.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sk->sk_state already contains the pipe state.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With slab poisoning enabled, I see the following oops:
Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6b73
...
NIP [c0000000006bc61c] .rxrpc_destroy+0x44/0x104
LR [c0000000006bc618] .rxrpc_destroy+0x40/0x104
Call Trace:
[c0000000feb2bc00] [c0000000006bc618] .rxrpc_destroy+0x40/0x104 (unreliable)
[c0000000feb2bc90] [c000000000349b2c] .key_cleanup+0x1a8/0x20c
[c0000000feb2bd40] [c0000000000a2920] .process_one_work+0x2f4/0x4d0
[c0000000feb2be00] [c0000000000a2d50] .worker_thread+0x254/0x468
[c0000000feb2bec0] [c0000000000a868c] .kthread+0xbc/0xc8
[c0000000feb2bf90] [c000000000020e00] .kernel_thread+0x54/0x70
We aren't initialising token->next, but the code in destroy_context relies
on the list being NULL terminated. Use kzalloc to zero out all the fields.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Before this patch issuing these commands:
fd = open("/proc/sys/net/ipv6/route/flush")
unshare(CLONE_NEWNET)
write(fd, "stuff")
would flush the newly created net, not the original one.
The equivalent ipv4 code is correct (stores the net inside ->extra1).
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Better document choke skb->cb[] use, like we did in netem and sfb
This adds a compile time check to make sure we dont exhaust skb->cb[]
space.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of debug message that are not useful, and enable
the log messages in case of error.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a patch originated with Stefano Salsano and Fabio Ludovici.
It provides several alternative loss models for use with netem.
This patch adds two state machine based loss models.
See: http://netgroup.uniroma2.it/twiki/bin/view.cgi/Main/NetemCLG
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many users have wanted the old functionality that was lost
to be able to use pfifo as inner qdisc for netem. The reason that
netem could not be classful with the older API was because of the
limitations of the old dequeue/requeue interface; now that qdisc API has
a peek function, there is no longer a problem with using any
inner qdisc's.
This reverts commit 0220146411.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rather than magic constant in code, expose the maximum size of
packet distribution table in API. In iproute2, q_netem defines
MAX_DIST as 16K already.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netem probability table can be large (up to 64K bytes)
which may be too large to allocate in one contiguous chunk.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use nla_put_nested to update netlink attribute value.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
lc and wlc use the same formula, but lblc and lblcr use another one. There
is no reason for using two different formulas for the lc variants.
The formula used by lc is used by all the lc variants in this patch.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Wensong Zhang <wensong@linux-vs.org>
Signed-off-by: Simon Horman <horms@verge.net.au>
ip_route_newports() is the only place in the entire kernel that
cares about the port members in the routing cache entry's lookup
flow key.
Therefore the only reason we store an entire flow inside of the
struct rtentry is for this one special case.
Rewrite ip_route_newports() such that:
1) The caller passes in the original port values, so we don't need
to use the rth->fl.fl_ip_{s,d}port values to remember them.
2) The lookup flow is constructed by hand instead of being copied
from the routing cache entry's flow.
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits)
Added support for usb ethernet (0x0fe6, 0x9700)
r8169: fix RTL8168DP power off issue.
r8169: correct settings of rtl8102e.
r8169: fix incorrect args to oob notify.
DM9000B: Fix PHY power for network down/up
DM9000B: Fix reg_save after spin_lock in dm9000_timeout
net_sched: long word align struct qdisc_skb_cb data
sfc: lower stack usage in efx_ethtool_self_test
bridge: Use IPv6 link-local address for multicast listener queries
bridge: Fix MLD queries' ethernet source address
bridge: Allow mcast snooping for transient link local addresses too
ipv6: Add IPv6 multicast address flag defines
bridge: Add missing ntohs()s for MLDv2 report parsing
bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 report
bridge: Fix IPv6 multicast snooping by storing correct protocol type
p54pci: update receive dma buffers before and after processing
fix cfg80211_wext_siwfreq lock ordering...
rt2x00: Fix WPA TKIP Michael MIC failures.
ath5k: Fix fast channel switching
tcp: undo_retrans counter fixes
...
Enhance TIPC to skip unnecessary (and, in some cases, redundant)
preparation work when sending a broadcast link NACK message, since this
preparation is only required for broadcast messages that are sent in a
reliable manner. This change also fixes a bug that caused NACK messages
to be improperly counted as "TX packets" in TIPC's broadcast link
statistics.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Eliminates support for the "number of requested links" field in a neighbor
discovery message. This field was never used and has been removed from
the TIPC 2.0 protocol specification.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Eliminates TIPC's prototype support for message sequence numbering
on routable connections (i.e. connections requiring more than one hop).
This capability isn't currently used, and can be removed since TIPC
only supports systems in which all inter-node communication can be
achieved in a single hop.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Ensure that the routine that starts up processing on a newly created
link endpoint takes the spinlock of the node object that owns the link,
to prevent possible conflicts with processing involving other links
owned by that node object.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Modifies TIPC's congestion control between a connected port and its
peer so that it works as documented. The following changes have been
made:
1) The counter of the number of messages sent by a port now starts
at zero, rather than one. This prevents the port from reporting port
congestion one message earlier than it was supposed to.
2) The counter of the number of messages sent by a port is now
incremented only if a non-empty message is sent successfully.
This prevents the port from becoming permanently congested if
too many send attempts are unsuccessful because of congestion
(or other reasons). It also removes the risk that empty hand-
shaking messages used during connection setup might cause the
port to report congestion earlier than it was supposed to.
3) The counter of the number of unacknowledged messages received by
a port controlled by an internal TIPC service is now incremented
only if the message is non-empty, in order to be consistent with
the aforementioned changes.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Eliminates a local iovec structure containing no data, which was
previously used during the establishment of a topology service connection,
since the same effect can be achieved by passing in a NULL pointer and
an iovec length of zero.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Ensures that a link reset or activate message has a "probe" field
of zero. (This field is currently unused in these messages, but this
could potentially change in future versions of TIPC.)
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Enhances TIPC's unicast and broadcast link code to update the transmit
queue maximum size counter in a single place, namely the routine that
adds messages to the queue. This ensures that the maximum size statistic
reported for unicast links is completely accurate, rather than being
partially based on statistical sampling.
The changes to link.h are just documenting the roles of the variables.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Modifies the initialization code for TIPC's topology service to
avoid taking the spinlock protecting the subscriber list, since
there is no need to do this.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Allows the broadcast link to track the node that is requesting a retransmit
in a new field dedicated to that purpose. This replaces the existing
mechanism that (ab)uses an existing node structure linked list field to do
the tracking.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Corrects print statements that use %x to print pointer values to use
%p instead, so that 64-bit pointer values are displayed correctly.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Enhances TIPC link code to ignore an invalid link tolerance value
contained in an incoming LINK_PROTOCOL message, rather than
processing the value and potentially causing a divide-by-zero error.
Also add a compile-time check that catches attempts to redefine
TIPC's minimum link tolerance value in a manner that might result
in the same divide-by-zero error at run-time.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Reject TIPC configuration service messages without a full message
header. Previously, an application that sent a message to the
configuration service that was too short could cause the validation
code to access an uninitialized field in the msghdr structure,
resulting in a memory access exception.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Eliminates a global variable that was previously used by TIPC's user
registry to track the number of distinct applications using TIPC. Due to
the recent elimination of the user registry this variable no longer serves
any purpose and can be removed.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Combines two distinct structures containing information about a TIPC bearer
into a single structure. The structures were previously kept separate so
that public information about a bearer could be made available to plug-in
media types using TIPC's native API, while the remaining information was
kept private for use by TIPC itself. However, now that the native API has
been removed there is no longer any need for this arrangement.
Since one of the structures was already embedded within the other, the
change largely involves replacing instances of "publ.foo" with "foo".
The changes do not otherwise alter the operation of TIPC bearers.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Merge two distinct structures containing information about a TIPC port
into a single structure. The structures were previously kept separate
so that public information about a port could be made available to
applications using TIPC's native API, while the remaining information
was kept private for use by TIPC itself. However, now that the native
API has been removed there is no longer any need for this somewhat
confusing arrangement.
Since one of the structures was already embedded within the other, the
change largely involves replacing instances of "publ.foo" with "foo".
The changes do not otherwise alter the operation of TIPC ports.
Signed-off-by: Allan Stephens <Allan.Stephens@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Use discrete setting ops for not updated drivers. This will not make
them conform to full G/SFEATURES semantics, though.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement getting rx checksum state for not updated drivers.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid "Features changed" message and ndo_set_features call on device
registration caused by automatic enabling of GSO and GRO. Driver should
have enabled hardware offloads it set in features, so the ndo_set_features()
is not needed at registration time.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix netdev_update_features() messages on register time by moving
the call further in register_netdevice(). When
netdev->reg_state != NETREG_REGISTERED, netdev_name() returns
"(unregistered netdevice)" even if the dev's name is already filled.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
* make qdisc_ops local
* add sparse annotation about expected unlock/unlock in dump_class_stats
* fix indentation
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use __force to quiet sparse warnings for cases where the code
is simulating user space pointers.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the Stochastic Fair Blue scheduler, based on work from :
W. Feng, D. Kandlur, D. Saha, K. Shin. Blue: A New Class of Active Queue
Management Algorithms. U. Michigan CSE-TR-387-99, April 1999.
http://www.thefengs.com/wuchang/blue/CSE-TR-387-99.pdf
This implementation is based on work done by Juliusz Chroboczek
General SFB algorithm can be found in figure 14, page 15:
B[l][n] : L x N array of bins (L levels, N bins per level)
enqueue()
Calculate hash function values h{0}, h{1}, .. h{L-1}
Update bins at each level
for i = 0 to L - 1
if (B[i][h{i}].qlen > bin_size)
B[i][h{i}].p_mark += p_increment;
else if (B[i][h{i}].qlen == 0)
B[i][h{i}].p_mark -= p_decrement;
p_min = min(B[0][h{0}].p_mark ... B[L-1][h{L-1}].p_mark);
if (p_min == 1.0)
ratelimit();
else
mark/drop with probabilty p_min;
I did the adaptation of Juliusz code to meet current kernel standards,
and various changes to address previous comments :
http://thread.gmane.org/gmane.linux.network/90225http://thread.gmane.org/gmane.linux.network/90375
Default flow classifier is the rxhash introduced by RPS in 2.6.35, but
we can use an external flow classifier if wanted.
tc qdisc add dev $DEV parent 1:11 handle 11: \
est 0.5sec 2sec sfb limit 128
tc filter add dev $DEV protocol ip parent 11: handle 3 \
flow hash keys dst divisor 1024
Notes:
1) SFB default child qdisc is pfifo_fast. It can be changed by another
qdisc but a child qdisc MUST not drop a packet previously queued. This
is because SFB needs to handle a dequeued packet in order to maintain
its virtual queue states. pfifo_head_drop or CHOKe should not be used.
2) ECN is enabled by default, unlike RED/CHOKe/GRED
With help from Patrick McHardy & Andi Kleen
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Juliusz Chroboczek <Juliusz.Chroboczek@pps.jussieu.fr>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Patrick McHardy <kaber@trash.net>
CC: Andi Kleen <andi@firstfloor.org>
CC: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The flag isn't very descriptive -- the intention
is that the driver provides a TSF timestamp at
the beginning of the MPDU -- make that clearer
by renaming the flag to RX_FLAG_MACTIME_MPDU.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There is a race on sending a data frame before the tx completion
of nullfunc frame for enabling power save. As the data quickly
follows the nullfunc frame, the AP thinks that the station is out
of power save and continues to send the frames. Whereas in the
station, the nullfunc ack will be processed after the tx completion
of data frame and mac80211 goes to powersave. Thus the power
save state mismatch between the station and the AP causes some
data loss and some applications fail because of that. This patch
fixes this issue.
Signed-off-by: Vivek Natarajan <vnatarajan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The variable _data is used in asm-generic to define sections
which causes sparse warnings, so just rename the variable.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add proper RCU annotations/verbs to sk_wq and wq members
Fix __sctp_write_space() sk_sleep() abuse (and sock->wq access)
Fix sunrpc sk_sleep() abuse too
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the bridge multicast snooping feature periodically issues
IPv6 general multicast listener queries to sense the absence of a
listener.
For this, it uses :: as its source address - however RFC 2710 requires:
"To be valid, the Query message MUST come from a link-local IPv6 Source
Address". Current Linux kernel versions seem to follow this requirement
and ignore our bogus MLD queries.
With this commit a link local address from the bridge interface is being
used to issue the MLD query, resulting in other Linux devices which are
multicast listeners in the network to respond with a MLD response (which
was not the case before).
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Map the IPv6 header's destination multicast address to an ethernet
source address instead of the MLD queries multicast address.
For instance for a general MLD query (multicast address in the MLD query
set to ::), this would wrongly be mapped to 33:33:00:00:00:00, although
an MLD queries destination MAC should always be 33:33:00:00:00:01 which
matches the IPv6 header's multicast destination ff02::1.
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently the multicast bridge snooping support is not active for
link local multicast. I assume this has been done to leave
important multicast data untouched, like IPv6 Neighborhood Discovery.
In larger, bridged, local networks it could however be desirable to
optimize for instance local multicast audio/video streaming too.
With the transient flag in IPv6 multicast addresses we have an easy
way to optimize such multimedia traffic without tempering with the
high priority multicast data from well-known addresses.
This patch alters the multicast bridge snooping for IPv6, to take
effect for transient multicast addresses instead of non-link-local
addresses.
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The nsrcs number is 2 Byte wide, therefore we need to call ntohs()
before using it.
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
We actually want a pointer to the grec_nsrcr and not the following
field. Otherwise we can get very high values for *nsrcs as the first two
bytes of the IPv6 multicast address are being used instead, leading to
a failing pskb_may_pull() which results in MLDv2 reports not being
parsed.
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The protocol type for IPv6 entries in the hash table for multicast
bridge snooping is falsely set to ETH_P_IP, marking it as an IPv4
address, instead of setting it to ETH_P_IPV6, which results in negative
look-ups in the hash table later.
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linux-next as of 20110217 complains when building for OMAP1.
LD vmlinux
`hci_sock_cleanup' referenced in section `.init.text' of net/built-in.o: defined in discarded section `.exit.text' of net/built-in.o
`hci_sock_cleanup' referenced in section `.init.text' of net/built-in.o: defined in discarded section `.exit.text' of net/built-in.o
make: *** [vmlinux] Error 1
A recent patch by Gustavo (Bluetooth: Merge L2CAP and SCO modules
into bluetooth.ko) introduced this by calling the hci_sock_cleanup
function in the error path of bt_init.
Fix this by dropping the __exit marking for hci_sock_cleanup.
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
The current implementation of hidp_output_raw_report() relies only on
the Control channel even for Output reports, and the BT HID
specification [1] does not mention using the DATA message for Output
reports on the Control channel (see section 7.9.1 and also Figure 11:
SET_ Flow Chart), so let us just use SET_REPORT.
This also fixes sending Output reports to some devices (like Sony
Sixaxis) which are not able to handle DATA messages on the Control
channel.
Ideally hidp_output_raw_report() could be improved to use this scheme:
Feature Report -- SET_REPORT on the Control channel
Output Report -- DATA on the Interrupt channel
for more efficiency, but as said above, right now only the Control
channel is used.
[1] http://www.bluetooth.com/Specification%20Documents/HID_SPEC_V10.pdf
Signed-off-by: Antonio Ospite <ospite@studenti.unina.it>
Acked-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This patch prevents a crash when remote host tries to create a LE
link which already exists. i.e.: call l2test twice passing the
same parameters.
Signed-off-by: Anderson Briglia <anderson.briglia@openbossa.org>
Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
All of the places that need to call mgmt_pending_remove already have a
pointer to the pending command, so searching for the command in the list
doesn't make sense. The added benefit is that many places that
previously had to call list_del + mgmt_pending_free can just call
mgmt_pending_remove now.
Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
The remote authentication requirements for conections need to be
initialized to 0xff (unknown) since it is possible that we receive a IO
Capability Request before we have received information about the remote
requirements.
Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
To properly track bonding completion an event to indicate authentication
failure is needed. This event will be sent whenever an authentication
complete HCI event with a non-zero status comes. It will also be sent
when we're acting in acceptor role for SSP authentication in which case
the controller will send a Simple Pairing Complete event.
Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
The command complete event for mgmt_pin_code_reply &
mgmt_pin_code_neg_reply should have the adapter index, Bluetooth address
as well as the status.
Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
The opcode for the ENODEV case was wrong (probably copy-paste mistake).
Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This patch adds support for the user confirmation (numeric comparison)
Secure Simple Pairing authentication method.
Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This patch adds a new mgmt_pair_device which can be used to initiate a
dedicated bonding procedure. Some extra callbacks are added to the
hci_conn struct so that the pairing code can get notified of the
completion of the procedure.
Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This makes it more convenient to do manipulations on the entry (needed
by later commits).
Signed-off-by: Johan Hedberg <johan.hedberg@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Fix a bug that undo_retrans is incorrectly decremented when undo_marker is
not set or undo_retrans is already 0. This happens when sender receives
more DSACK ACKs than packets retransmitted during the current
undo phase. This may also happen when sender receives DSACK after
the undo operation is completed or cancelled.
Fix another bug that undo_retrans is incorrectly incremented when
sender retransmits an skb and tcp_skb_pcount(skb) > 1 (TSO). This case
is rare but not impossible.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Eric W. Biederman <ebiederm@xmission.com>
In the beginning with batching unreg_list was a list that was used only
once in the lifetime of a network device (I think). Now we have calls
using the unreg_list that can happen multiple times in the life of a
network device like dev_deactivate and dev_close that are also using the
unreg_list. In addition in unregister_netdevice_queue we also do a
list_move because for devices like veth pairs it is possible that
unregister_netdevice_queue will be called multiple times.
So I think the change below to fix dev_deactivate which Eric D. missed
will fix this problem. Now to go test that.
Signed-off-by: David S. Miller <davem@davemloft.net>
net/sctp/tsnmap.c: In function ‘sctp_tsnmap_num_gabs’:
net/sctp/tsnmap.c:347: warning: ‘start’ may be used uninitialized in this function
net/sctp/tsnmap.c:347: warning: ‘end’ may be used uninitialized in this function
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now, TCP_CHECK_TIMER is not used for debuging, it does nothing.
And, it has been there for several years, maybe 6 years.
Remove it to keep code clearer.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 5fa782c2f5 re-worked the
handling of unknown parameters. sctp_init_cause_fixed() can now
return -ENOSPC if there is not enough tailroom in the error
chunk skb. When this happens, the error header is not appended to
the error chunk. In that case, the payload of the unknown parameter
should not be appended either.
Signed-off-by: Jiri Bohac <jbohac@suse.cz>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biederman reported a lockdep splat in inet_twsk_deschedule()
This is caused by inet_twsk_purge(), run from process context,
and commit 575f4cd5a5 (net: Use rcu lookups in inet_twsk_purge.)
removed the BH disabling that was necessary.
Add the BH disabling but fine grained, right before calling
inet_twsk_deschedule(), instead of whole function.
With help from Linus Torvalds and Eric W. Biederman
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Daniel Lezcano <daniel.lezcano@free.fr>
CC: Pavel Emelyanov <xemul@openvz.org>
CC: Arnaldo Carvalho de Melo <acme@redhat.com>
CC: stable <stable@kernel.org> (# 2.6.33+)
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits)
net: deinit automatic LIST_HEAD
net: dont leave active on stack LIST_HEAD
net: provide default_advmss() methods to blackhole dst_ops
tg3: Restrict phy ioctl access
drivers/net: Call netif_carrier_off at the end of the probe
ixgbe: work around for DDP last buffer size
ixgbe: fix panic due to uninitialised pointer
e1000e: flush all writebacks before unload
e1000e: check down flag in tasks
isdn: hisax: Use l2headersize() instead of dup (and buggy) func.
arp_notify: unconditionally send gratuitous ARP for NETDEV_NOTIFY_PEERS.
cxgb4vf: Use defined Mailbox Timeout
cxgb4vf: Quiesce Virtual Interfaces on shutdown ...
cxgb4vf: Behave properly when CONFIG_DEBUG_FS isn't defined ...
cxgb4vf: Check driver parameters in the right place ...
pch_gbe: Fix the MAC Address load issue.
iwlwifi: Delete iwl3945_good_plcp_health.
net/can/softing: make CAN_SOFTING_CS depend on CAN_SOFTING
netfilter: nf_iterate: fix incorrect RCU usage
pch_gbe: Fix the issue that the receiving data is not normal.
...
Clear IEEE80211_STA_NULLFUNC_ACKED flag on disabling power
save. Without this fix, there is a chance of setting CONF_PS
before sending nullfunc frame.
Signed-off-by: Vivek Natarajan <vnatarajan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
"def_bool n" without prompt is pointless, this should be just "bool".
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The module parameter ieee80211_disable_40mhz_24ghz
was meant to allow disabling 40 MHz operation in
the 2.4 GHz band by default. However, it is buggy
as implemented because while it advertises to the
AP that the device doesn't support 40 MHz, it will
itself still use 40 MHz configurations.
To fix this, clear the 40 MHz bits from the sband
completely instead of overriding where used.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
commit 9b5e383c11 (net: Introduce
unregister_netdevice_many()) left an active LIST_HEAD() in
rollback_registered(), with possible memory corruption.
Even if device is freed without touching its unreg_list (and therefore
touching the previous memory location holding LISTE_HEAD(single), better
close the bug for good, since its really subtle.
(Same fix for default_device_exit_batch() for completeness)
Reported-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Eric W. Biderman <ebiderman@xmission.com>
Tested-by: Eric W. Biderman <ebiderman@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Octavian Purdila <opurdila@ixiacom.com>
CC: stable <stable@kernel.org> [.33+]
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric W. Biderman and Michal Hocko reported various memory corruptions
that we suspected to be related to a LIST head located on stack, that
was manipulated after thread left function frame (and eventually exited,
so its stack was freed and reused).
Eric Dumazet suggested the problem was probably coming from commit
443457242b (net: factorize
sync-rcu call in unregister_netdevice_many)
This patch fixes __dev_close() and dev_close() to properly deinit their
respective LIST_HEAD(single) before exiting.
References: https://lkml.org/lkml/2011/2/16/304
References: https://lkml.org/lkml/2011/2/14/223
Reported-by: Michal Hocko <mhocko@suse.cz>
Tested-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Eric W. Biderman <ebiderman@xmission.com>
Tested-by: Eric W. Biderman <ebiderman@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 0dbaee3b37 (net: Abstract default ADVMSS behind an
accessor.) introduced a possible crash in tcp_connect_init(), when
dst->default_advmss() is called from dst_metric_advmss()
Reported-by: George Spelvin <linux@horizon.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only troublesome bit here is __mkroute_output which wants
to override res->fi and res->type, compute those in local
variables instead.
Signed-off-by: David S. Miller <davem@davemloft.net>
GCC emits all kinds of crazy zero extensions when we go from signed
int, to unsigned short, etc. etc.
This transformation has to be legal because:
1) In tkey_extract_bits() in mask_pfx(), the values are used to
perform shifts, on which negative values are undefined by C.
2) In fib_table_lookup() we perform comparisons with unsigned
values, constants, and additions. None of which should
encounter negative values.
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows avoiding multiple writes to the initial __refcnt.
The most simplest cases of wanting an initial reference of "1"
in ipv4 and ipv6 have been converted, the rest have been left
along and kept at the existing "0".
Signed-off-by: David S. Miller <davem@davemloft.net>
This also allows us to combine all the dst->flags settings and avoid
read/modify/write sequences to this struct member.
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a lot of redundancy and unnecessary stack frames
in the output route creation path.
1) Make __mkroute_output() return error pointers.
2) Eliminate ip_mkroute_output() entirely, made possible by #1.
3) Call __mkroute_output() directly and handling the returning error
pointers in ip_route_output_slow().
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce NETIF_F_RXCSUM to replace device-private flags for RX checksum
offload. Integrate it with ndo_fix_features.
ethtool_op_get_rx_csum() is removed altogether as nothing in-tree uses it.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
This introduces a new framework to handle device features setting.
It consists of:
- new fields in struct net_device:
+ hw_features - features that hw/driver supports toggling
+ wanted_features - features that user wants enabled, when possible
- new netdev_ops:
+ feat = ndo_fix_features(dev, feat) - API checking constraints for
enabling features or their combinations
+ ndo_set_features(dev) - API updating hardware state to match
changed dev->features
- new ethtool commands:
+ ETHTOOL_GFEATURES/ETHTOOL_SFEATURES: get/set dev->wanted_features
and trigger device reconfiguration if resulting dev->features
changed
+ ETHTOOL_GSTRINGS(ETH_SS_FEATURES): get feature bits names (meaning)
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
This allows to enable GRO even if RX csum is disabled. GRO will not
be used for packets without hardware checksum anyway.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is needed for unified offloads patch.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only oddities here are a couple of drivers that bogusly called the ldisc
helpers instead of returning -ENOIOCTLCMD. Fix the bug and the rest goes
away.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Doing tiocmget was such fun we should do tiocmset as well for the same
reasons
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We don't actually need this and it causes problems for internal use of
this functionality. Currently there is a single use of the FILE * pointer.
That is the serial core which uses it to check tty_hung_up_p. However if
that is true then IO_ERROR is also already set so the check may be removed.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The flaw was in skipping the second byte in MAC header due to increasing
the pointer AND indexed access starting at '1'.
Signed-off-by: Joerg Marx <joerg.marx@secunet.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
As warned by checkpatch.pl, use #include <linux/uaccess.h> instead of
<asm/uaccess.h>.
Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Assigning a socket in timewait state to skb->sk can trigger
kernel oops, e.g. in nfnetlink_log, which does:
if (skb->sk) {
read_lock_bh(&skb->sk->sk_callback_lock);
if (skb->sk->sk_socket && skb->sk->sk_socket->file) ...
in the timewait case, accessing sk->sk_callback_lock and sk->sk_socket
is invalid.
Either all of these spots will need to add a test for sk->sk_state != TCP_TIME_WAIT,
or xt_TPROXY must not assign a timewait socket to skb->sk.
This does the latter.
If a TW socket is found, assign the tproxy nfmark, but skip the skb->sk assignment,
thus mimicking behaviour of a '-m socket .. -j MARK/ACCEPT' re-routing rule.
The 'SYN to TW socket' case is left unchanged -- we try to redirect to the
listener socket.
Cc: Balazs Scheidler <bazsi@balabit.hu>
Cc: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Florian Westphal <fwestphal@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
If the new connection update parameter are accepted, the LE master
host sends the LE Connection Update Command to its controller informing
the new requested parameters.
Signed-off-by: Claudio Takahasi <claudio.takahasi@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Use proper timer instead of hci command flow control to timeout
failed hci commands. Otherwise stack ends up sending commands
when flow control is used to block new commands.
2010-09-01 18:29:41.592132 < HCI Command: Remote Name Request (0x01|0x0019) plen 10
bdaddr 00:16:CF:E1:C7:D7 mode 2 clkoffset 0x0000
2010-09-01 18:29:41.592681 > HCI Event: Command Status (0x0f) plen 4
Remote Name Request (0x01|0x0019) status 0x00 ncmd 0
2010-09-01 18:29:51.022033 < HCI Command: Remote Name Request Cancel (0x01|0x001a) plen 6
bdaddr 00:16:CF:E1:C7:D7
Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
If the fail happens the HCI del_timer may timeout after the the hci dev
unregister. This lead to a kernel crash.
Reported-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Implements L2CAP Connection Parameter Update Response defined in
the Bluetooth Core Specification, Volume 3, Part A, section 4.21.
Address the LE Connection Parameter Procedure initiated by the slave.
Connection Interval Minimum and Maximum have the same range: 6 to
3200. Time = N * 1.25ms. Minimum shall be less or equal to Maximum.
The Slave Latency field shall have a value in the range of 0 to
((connSupervisionTimeout / connIntervalMax) - 1). Latency field shall
be less than 500. connSupervisionTimeout = Timeout Multiplier * 10 ms.
Multiplier field shall have a value in the range of 10 to 3200.
Signed-off-by: Claudio Takahasi <claudio.takahasi@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This patch splits the L2CAP command handling function in order to
have a clear separation between the commands related to BR/EDR and
LE. Commands and responses in the LE signaling channel are not being
handled yet, command reject is sent to all received requests. Bluetooth
Core Specification, Volume 3, Part A, section 4 defines the signaling
packets formats and allowed commands/responses over the LE signaling
channel.
Signed-off-by: Claudio Takahasi <claudio.takahasi@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Separate LE and ACL timeouts. Othervise ACL connections
on non LE hw will time out after 45 secs.
Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Fix LE connections not being marked as master.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
l2cap over LE links can be disconnected without sending
disconnect command first.
Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Add support for LE server sockets.
Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Add basic LE connection support to L2CAP. LE
connection can be created by specifying cid
in struct sockaddr_l2
Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Bluetooth chips may have separate buffers for LE traffic.
This patch add support to use LE buffers provided by the chip.
Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Bluetooth V4.0 adds support for Low Energy (LE) connections.
Specification introduces new set of hci commands to control LE
connection. This patch adds logic to create, cancel and disconnect
LE connections.
Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
When IP_VS schedulers do not find a destination, they output a terse
"WLC: no destination available" message through kernel syslog, which I
can not only make sense of because syslog puts them in a logfile
together with keepalived checker results.
This patch makes the output a bit more informative, by telling you which
virtual service failed to find a destination.
Example output:
kernel: [1539214.552233] IPVS: wlc: TCP 192.168.8.30:22 - no destination available
kernel: [1539299.674418] IPVS: wlc: FWM 22 0x00000016 - no destination available
I have tested the code for IPv4 and FWM services, as you can see from
the example; I do not have an IPv6 setup to test the third code path
with.
To avoid code duplication, I put a new function ip_vs_scheduler_err()
into ip_vs_sched.c, and use that from the schedulers instead of calling
IP_VS_ERR_RL directly.
Signed-off-by: Patrick Schaaf <netdev@bof.de>
Signed-off-by: Simon Horman <horms@verge.net.au>
Remove code that should not be called anymore.
Now when ip_vs_out handles replies for local clients at
LOCAL_IN hook we do not need to call conn_out_get and
handle_response_icmp from ip_vs_in_icmp* because such
lookups were already performed for the ICMP packet and no
connection was found.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Fix get_curr_sync_buff to keep buffer for 2 seconds
as intended, not just for the current jiffie. By this way
we will sync more connection structures with single packet.
Signed-off-by: Tinggong Wang <wangtinggong@gmail.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
For testing and debugging purposes it is useful to be able to disable
hardware acceleration of RFS without disabling RFS altogether. Since
this is a similar feature to 'n-tuple' flow steering through the
ethtool API, test the same feature flag that controls that.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
If the root qdisc for a net device is mqprio, and the driver's
ndo_setup_tc() operation dynamically adds and remvoes TX queues,
netif_set_real_num_tx_queues() will be called during device
unregistration to remove the extra TX queues when the qdisc is
destroyed. Currently this causes the corresponding kobjects
to be leaked, and the device's reference count never drops to 0.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
l2cap_load() was added to trigger l2cap.ko module loading from the RFCOMM
and BNEP modules. Now that L2CAP module is gone, we don't need it anymore.
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Note that we do not generate the redirect netevent any longer,
because we don't create a new cached route.
Instead, once the new neighbour is bound to the cached route,
we emit a neigh update event instead.
Signed-off-by: David S. Miller <davem@davemloft.net>
The general idea is that if we learn new PMTU information, we
bump the peer genid.
This triggers the dst_ops->check() code to validate and if
necessary propagate the new PMTU value into the metrics.
Learned PMTU information self-expires.
This means that it is not necessary to kill a cached route
entry just because the PMTU information is too old.
As a consequence:
1) When the path appears unreachable (dst_ops->link_failure
or dst_ops->negative_advice) we unwind the PMTU state if
it is out of date, instead of killing the cached route.
A redirected route will still be invalidated in these
situations.
2) rt_check_expire(), rt_worker_func(), et al. are no longer
necessary at all.
Signed-off-by: David S. Miller <davem@davemloft.net>
NETDEV_NOTIFY_PEER is an explicit request by the driver to send a link
notification while NETDEV_UP/NETDEV_CHANGEADDR generate link
notifications as a sort of side effect.
In the later cases the sysctl option is present because link
notification events can have undesired effects e.g. if the link is
flapping. I don't think this applies in the case of an explicit
request from a driver.
This patch makes NETDEV_NOTIFY_PEER unconditional, if preferred we
could add a new sysctl for this case which defaults to on.
This change causes Xen post-migration ARP notifications (which cause
switches to relearn their MAC tables etc) to be sent by default.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove duplicate inclusion of "send.h" and "routing.h" from
net/batman-adv/soft-interface.c
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
With previous patch, rose_get_neigh() routine
investigates the full list of neighbor nodes
until it finds or not an already connected node whether
it is called locally or through a level 3 transit frame.
If no routes are opened through an adjacent connected node
then a classical connect request is attempted.
Then there is no more reason for an extra loop such
as the one removed by this patch.
Signed-off-by: Bernard Pidoux <f6bvp@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
FPAC AX25 packet application is using Linux kernel ROSE
routing skills in order to connect or send packets to remote stations
knowing their ROSE address via a network of interconnected nodes.
Each FPAC node has a ROSE routing table that Linux ROSE module is
looking at each time a ROSE frame is relayed by the node or when
a connect request to a neighbor node is received.
A previous patch improved the system time response by looking at
already established routes each time the system was looking for a
route to relay a frame. If a neighbor node routing the destination
address was already connected, then the frame would be sent
through him. If not, a connection request would be issued.
The present patch extends the same routing capability to a connect
request asked by a user locally connected into an FPAC node.
Without this patch, a connect request was not well handled unless it
was directed to an immediate connected neighbor of the local node.
Implemented at a number of ROSE FPAC node stations, the present patch
improved dramatically FPAC ROSE routing time response and efficiency.
Signed-off-by: Bernard Pidoux <f6bvp@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
WFA certification and the WMM spec require that we
always reply to unicast probe requests, so do that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
ieee80211_rx_h_check returned RX_DROP_MONITOR in case the if statement
in question was true but the same return value is also used directly
after the if clause. Hence, we can just drop the whole if clause and as
such simplify the code.
Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Actually doesn't make sense have these modules built separately.
The L2CAP layer is needed by almost all Bluetooth protocols and profiles.
There isn't any real use case without having L2CAP loaded.
SCO is only essential for Audio transfers, but it is so small that we can
have it loaded always in bluetooth.ko without problems.
If you really doesn't want it you can disable SCO in the kernel config.
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Commit 0c838ff1ad (ipv4: Consolidate all default route selection
implementations.) forgot to remove one rcu_read_unlock() from
fib_select_default().
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All the cleanup code in mqprio_destroy() is currently conditional on
priv->qdiscs being non-null, but that condition should only apply to
the per-queue qdisc cleanup. We should always set the number of
traffic classes back to 0 here.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
As noticed by Eric, nf_iterate doesn't use RCU correctly by
accessing the prev pointer of a RCU protected list element when
a verdict of NF_REPEAT is issued.
Fix by jumping backwards to the hook invocation directly instead
of loading the previous list element before continuing the list
iteration.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
This reverts commit 44bd4de9c2.
I have to revert the early loop termination in connlimit since it generates
problems when an iptables statement does not use -m state --state NEW before
the connlimit match extension.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Struct tmp is copied from userspace. It is not checked whether the "name"
field is NULL terminated. This may lead to buffer overflow and passing
contents of kernel stack as a module name to try_then_request_module() and,
consequently, to modprobe commandline. It would be seen by all userspace
processes.
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
struct sco_conninfo has one padding byte in the end. Local variable
cinfo of type sco_conninfo is copied to userspace with this uninizialized
one byte, leading to old stack contents leak.
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>