Keep track of when an outgoing message is ACKed (i.e., the server fully
received it and, presumably, queued it for processing). Time out OSD
requests only if it's been too long since they've been received.
This prevents timeouts and connection thrashing when the OSDs are simply
busy and are throttling the requests they read off the network.
Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net>
Signed-off-by: Sage Weil <sage@newdream.net>
Workloads using pipes and sockets hit inode_sb_list_lock contention.
superblock s_inodes list is needed for quota, dirty, pagecache and
fsnotify management. pipe/anon/socket fs are clearly not candidates for
these.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'for-3.1' of git://linux-nfs.org/~bfields/linux:
nfsd: don't break lease on CLAIM_DELEGATE_CUR
locks: rename lock-manager ops
nfsd4: update nfsv4.1 implementation notes
nfsd: turn on reply cache for NFSv4
nfsd4: call nfsd4_release_compoundargs from pc_release
nfsd41: Deny new lock before RECLAIM_COMPLETE done
fs: locks: remove init_once
nfsd41: check the size of request
nfsd41: error out when client sets maxreq_sz or maxresp_sz too small
nfsd4: fix file leak on open_downgrade
nfsd4: remember to put RW access on stateid destruction
NFSD: Added TEST_STATEID operation
NFSD: added FREE_STATEID operation
svcrpc: fix list-corrupting race on nfsd shutdown
rpc: allow autoloading of gss mechanisms
svcauth_unix.c: quiet sparse noise
svcsock.c: include sunrpc.h to quiet sparse noise
nfsd: Remove deprecated nfsctl system call and related code.
NFSD: allow OP_DESTROY_CLIENTID to be only op in COMPOUND
Fix up trivial conflicts in Documentation/feature-removal-schedule.txt
* Merge akpm patch series: (122 commits)
drivers/connector/cn_proc.c: remove unused local
Documentation/SubmitChecklist: add RCU debug config options
reiserfs: use hweight_long()
reiserfs: use proper little-endian bitops
pnpacpi: register disabled resources
drivers/rtc/rtc-tegra.c: properly initialize spinlock
drivers/rtc/rtc-twl.c: check return value of twl_rtc_write_u8() in twl_rtc_set_time()
drivers/rtc: add support for Qualcomm PMIC8xxx RTC
drivers/rtc/rtc-s3c.c: support clock gating
drivers/rtc/rtc-mpc5121.c: add support for RTC on MPC5200
init: skip calibration delay if previously done
misc/eeprom: add eeprom access driver for digsy_mtc board
misc/eeprom: add driver for microwire 93xx46 EEPROMs
checkpatch.pl: update $logFunctions
checkpatch: make utf-8 test --strict
checkpatch.pl: add ability to ignore various messages
checkpatch: add a "prefer __aligned" check
checkpatch: validate signature styles and To: and Cc: lines
checkpatch: add __rcu as a sparse modifier
checkpatch: suggest using min_t or max_t
...
Did this as a merge because of (trivial) conflicts in
- Documentation/feature-removal-schedule.txt
- arch/xtensa/include/asm/uaccess.h
that were just easier to fix up in the merge than in the patch series.
We presently define all kinds of notifiers in notifier.h. This is not
necessary at all, since different subsystems use different notifiers, they
are almost non-related with each other.
This can also save much build time. Suppose I add a new netdevice event,
really I don't have to recompile all the source, just network related.
Without this patch, all the source will be recompiled.
I move the notify events near to their subsystem notifier registers, so
that they can be found more easily.
This patch:
It is not necessary to share the same notifier.h.
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: David Miller <davem@davemloft.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No need to use int, its uses are boolean.
May save a few bytes one day.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Randy Dunlap <rdunlap@xenotime.net>
Fix new kernel-doc warning in eth.c:
Warning(net/ethernet/eth.c:237): No description found for parameter 'type'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a device event generates gratuitous ARP messages, only primary
address is used for sending. This patch iterates through the whole
list. Tested with 2 IP addresses configuration on bonding interface.
Signed-off-by: Zoltan Kiss <schaman@sch.bme.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Original commit 2bda8a0c8af... "Disable router anycast
address for /127 prefixes" says:
| No need for matching code in addrconf_leave_anycast() as it
| will silently ignore any attempt to leave an unknown anycast
| address.
After analysis, because 1) we may add two or more prefixes on the
same interface, or 2)user may have manually joined that anycast,
we may hit chances to have anycast address which as if we had
generated one by /127 prefix and we should not leave from subnet-
router anycast address unconditionally.
CC: Bjørn Mork <bjorn@mork.no>
CC: Brian Haley <brian.haley@hp.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
fs: Merge split strings
treewide: fix potentially dangerous trailing ';' in #defined values/expressions
uwb: Fix misspelling of neighbourhood in comment
net, netfilter: Remove redundant goto in ebt_ulog_packet
trivial: don't touch files that are removed in the staging tree
lib/vsprintf: replace link to Draft by final RFC number
doc: Kconfig: `to be' -> `be'
doc: Kconfig: Typo: square -> squared
doc: Konfig: Documentation/power/{pm => apm-acpi}.txt
drivers/net: static should be at beginning of declaration
drivers/media: static should be at beginning of declaration
drivers/i2c: static should be at beginning of declaration
XTENSA: static should be at beginning of declaration
SH: static should be at beginning of declaration
MIPS: static should be at beginning of declaration
ARM: static should be at beginning of declaration
rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
Update my e-mail address
PCIe ASPM: forcedly -> forcibly
gma500: push through device driver tree
...
Fix up trivial conflicts:
- arch/arm/mach-ep93xx/dma-m2p.c (deleted)
- drivers/gpio/gpio-ep93xx.c (renamed and context nearby)
- drivers/net/r8169.c (just context changes)
Our performance team has noticed that increasing
RPCRDMA_MAX_DATA_SEGS from 8 to 64 significantly
increases throughput when using the RDMA transport.
Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (145 commits)
bnx2x: use pci_pcie_cap()
bnx2x: fix bnx2x_stop_on_error flow in bnx2x_sp_rtnl_task
bnx2x: enable internal target-read for 57712 and up only
bnx2x: count statistic ramrods on EQ to prevent MC assert
bnx2x: fix loopback for non 10G link
bnx2x: dcb - send all unmapped priorities to same COS as L2
iwlwifi: Fix build with CONFIG_PM disabled.
gre: fix improper error handling
ipv4: use RT_TOS after some rt_tos conversions
via-velocity: remove duplicated #include
qlge: remove duplicated #include
igb: remove duplicated #include
can: c_can: remove duplicated #include
bnad: remove duplicated #include
net: allow netif_carrier to be called safely from IRQ
bna: Header File Consolidation
bna: HW Error Counter Fix
bna: Add HW Semaphore Unlock Logic
bna: IOC Event Name Change
bna: Mboxq Flush When IOC Disabled
...
Do not set the cr0 enablement bit for iucv by default in head[31|64].S,
move the enablement to iucv_init in the iucv base layer.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix improper protocol err_handler, current implementation is fully
unapplicable and may cause kernel crash due to double kfree_skb.
Signed-off-by: Dmitry Kozlov <xeb@mail.ru>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rt_tos was changed to iph->tos but it must be filtered by RT_TOS
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
msize represents the maximum PDU size that includes P9_IOHDRSZ.
Signed-off-by: Venkateswararao Jujjuri "<jvrao@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
unlinkat - Remove a directory entry
size[4] Tunlinkat tag[2] dirfid[4] name[s] flag[4]
size[4] Runlinkat tag[2]
older Tremove have the below request format
size[4] Tremove tag[2] fid[4]
The remove message is used to remove a directory entry either file or directory
The remove opreation is actually a directory opertation and should ideally have
dirfid, if not we cannot represent the fid on server with anything other than
name. We will have to derive the directory name from fid in the Tremove request.
NOTE: The operation doesn't clunk the unlink fid.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
renameat - change name of file or directory
size[4] Trenameat tag[2] olddirfid[4] oldname[s] newdirfid[4] newname[s]
size[4] Rrenameat tag[2]
older Trename have the below request format
size[4] Trename tag[2] fid[4] newdirfid[4] name[s]
The rename message is used to change the name of a file, possibly moving it
to a new directory. The rename opreation is actually a directory opertation
and should ideally have olddirfid, if not we cannot represent the fid on server
with anything other than name. We will have to derive the old directory name
from fid in the Trename request.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
free the fid even in case of failed clunk.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Switch to generic kernel hexdump library and cleanup macros to
be more consistent with the way we do normal debug prints.
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
There was a BUG_ON to protect against a bad id which could be dealt with
more gracefully.
Reported-by: Natalie Orlin <norlin@us.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (107 commits)
vfs: use ERR_CAST for err-ptr tossing in lookup_instantiate_filp
isofs: Remove global fs lock
jffs2: fix IN_DELETE_SELF on overwriting rename() killing a directory
fix IN_DELETE_SELF on overwriting rename() on ramfs et.al.
mm/truncate.c: fix build for CONFIG_BLOCK not enabled
fs:update the NOTE of the file_operations structure
Remove dead code in dget_parent()
AFS: Fix silly characters in a comment
switch d_add_ci() to d_splice_alias() in "found negative" case as well
simplify gfs2_lookup()
jfs_lookup(): don't bother with . or ..
get rid of useless dget_parent() in btrfs rename() and link()
get rid of useless dget_parent() in fs/btrfs/ioctl.c
fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers
drivers: fix up various ->llseek() implementations
fs: handle SEEK_HOLE/SEEK_DATA properly in all fs's that define their own llseek
Ext4: handle SEEK_HOLE/SEEK_DATA generically
Btrfs: implement our own ->llseek
fs: add SEEK_HOLE and SEEK_DATA flags
reiserfs: make reiserfs default to barrier=flush
...
Fix up trivial conflicts in fs/xfs/linux-2.6/xfs_super.c due to the new
shrinker callout for the inode cache, that clashed with the xfs code to
start the periodic workers later.
As reported by Ben Greer and Froncois Romieu. The code path in
the netif_carrier code leads it to try and disable
a late workqueue to reenable it immediately
netif_carrier_on
-> linkwatch_fire_event
-> linkwatch_schedule_work
-> cancel_delayed_work
-> del_timer_sync
If __cancel_delayed_work is used instead then there is no
problem of waiting for running linkwatch_event.
There is a race between linkwatch_event running re-scheduling
but it is harmless to schedule an extra scan of the linkwatch queue.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some minor cleanups that won't impact code:
1. Remove inline from non-critical functions; compiler will most
likely inline them anyway.
2. Make function args const where possible.
3. Whitespace cleanup
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When STP changes state of interface need to send a new link
message to reflect that change.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a new device is added to a bridge, the ethernet address of the
bridge network device may change. When the address changes, the
appropriate callback is called, but with the wrong device argument.
The address of the bridge device (ie br0) changes not the address
of the device being passed to add_if (ie eth0).
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the message_age is already greater than the max_age, then the
BPDU is bogus. Linux won't generate BPDU, but conformance tester
or buggy implementation might.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A bridge topology with three systems:
+------+ +------+
| A(2) |--| B(1) |
+------+ +------+
\ /
+------+
| C(3) |
+------+
What is supposed to happen:
* bridge with the lowest ID is elected root (for example: B)
* C detects that A->C is higher cost path and puts in blocking state
What happens. Bridge with lowest id (B) is elected correctly as
root and things start out fine initially. But then config BPDU
doesn't get transmitted from A -> C. Because of that
the link from A-C is transistioned to the forwarding state.
The root cause of this is that the configuration messages
is generated with bogus message age, and dropped before
sending.
In the standardmessage_age is supposed to be:
the time since the generation of the Configuration BPDU by
the Root that instigated the generation of this Configuration BPDU.
Reimplement this by recording the timestamp (age + jiffies) when
recording config information. The old code incorrectly used the time
elapsed on the ageing timer which was incorrect.
See also:
https://bugzilla.vyatta.com/show_bug.cgi?id=7164
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: Fix wrong check in list_splice_init_rcu()
net,rcu: Convert call_rcu(xt_rateest_free_rcu) to kfree_rcu()
sysctl,rcu: Convert call_rcu(free_head) to kfree
vmalloc,rcu: Convert call_rcu(rcu_free_vb) to kfree_rcu()
vmalloc,rcu: Convert call_rcu(rcu_free_va) to kfree_rcu()
ipc,rcu: Convert call_rcu(ipc_immediate_free) to kfree_rcu()
ipc,rcu: Convert call_rcu(free_un) to kfree_rcu()
security,rcu: Convert call_rcu(sel_netport_free) to kfree_rcu()
security,rcu: Convert call_rcu(sel_netnode_free) to kfree_rcu()
ia64,rcu: Convert call_rcu(sn_irq_info_free) to kfree_rcu()
block,rcu: Convert call_rcu(disk_free_ptbl_rcu_cb) to kfree_rcu()
scsi,rcu: Convert call_rcu(fc_rport_free_rcu) to kfree_rcu()
audit_tree,rcu: Convert call_rcu(__put_tree) to kfree_rcu()
security,rcu: Convert call_rcu(whitelist_item_free) to kfree_rcu()
md,rcu: Convert call_rcu(free_conf) to kfree_rcu()
icmp_route_lookup() uses the wrong flow parameters if the reverse
session route lookup isn't used.
So do not commit to the re-decoded flow until we actually make a
final decision to use a real route saved in 'rt2'.
Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because the ip fragment offset field counts 8-byte chunks, ip
fragments other than the last must contain a multiple of 8 bytes of
payload. ip_ufo_append_data wasn't respecting this constraint and,
depending on the MTU and ip option sizes, could create malformed
non-final fragments.
Google-Bug-Id: 5009328
Signed-off-by: Bill Sommerfeld <wsommerfeld@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPv6 fragment identification generation is way beyond what we use for
IPv4 : It uses a single generator. Its not scalable and allows DOS
attacks.
Now inetpeer is IPv6 aware, we can use it to provide a more secure and
scalable frag ident generator (per destination, instead of system wide)
This patch :
1) defines a new secure_ipv6_id() helper
2) extends inet_getid() to provide 32bit results
3) extends ipv6_select_ident() with a new dest parameter
Reported-by: Fernando Gont <fernando@gont.com.ar>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently cow metrics a bit too soon in IPv6 case : All routes are
tied to a single inetpeer entry.
Change ip6_rt_copy() to get destination address as second argument, so
that we fill rt6i_dst before the dst_copy_metrics() call.
icmp6_dst_alloc() must set rt6i_dst before calling dst_metric_set(), or
else the cow is done while rt6i_dst is still NULL.
If orig route points to readonly metrics, we can share the pointer
instead of performing the memory allocation and copy.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This resolves a panic on module removal.
Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Some drivers (ab)use the ethtool_ops::get_regs operation to expose
only a hardware revision ID. Commit
a77f5db361 ('ethtool: Allocate register
dump buffer with vmalloc()') had the side-effect of breaking these, as
vmalloc() returns a null pointer for size=0 whereas kmalloc() did not.
For backward-compatibility, allow zero-length dumps again.
Reported-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Cc: stable@kernel.org [2.6.37+]
Signed-off-by: David S. Miller <davem@davemloft.net>
There are two problems:
1) "n" was allocated with alloc_skb() so we should free it with
kfree_skb() instead of regular kfree().
2) We return the freed pointer instead of NULL.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since vlan_group_get_device and vlan_group is not going to be accessible
from device drivers, introduce function which substitutes it.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
All these are instances of
#define NAME value;
or
#define NAME(params_opt) value;
These of course fail to build when used in contexts like
if(foo $OP NAME)
while(bar $OP NAME)
and may silently generate the wrong code in contexts such as
foo = NAME + 1; /* foo = value; + 1; */
bar = NAME - 1; /* bar = value; - 1; */
baz = NAME & quux; /* baz = value; & quux; */
Reported on comp.lang.c,
Message-ID: <ab0d55fe-25e5-482b-811e-c475aa6065c3@c29g2000yqd.googlegroups.com>
Initial analysis of the dangers provided by Keith Thompson in that thread.
There are many more instances of more complicated macros having unnecessary
trailing semicolons, but this pile seems to be all of the cases of simple
values suffering from the problem. (Thus things that are likely to be found
in one of the contexts above, more complicated ones aren't.)
Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
In net/bridge/netfilter/ebt_ulog.c:ebt_ulog_packet() the 'goto unlock'
before the 'alloc_failure' label is completely redundant. This patch
removes it.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
If overlapping networks with different interfaces was added to
the set, the type did not handle it properly. Example
ipset create test hash:net,iface
ipset add test 192.168.0.0/16,eth0
ipset add test 192.168.0.0/24,eth1
Now, if a packet was sent from 192.168.0.0/24,eth0, the type returned
a match.
In the patch the algorithm is fixed in order to correctly handle
overlapping networks.
Limitation: the same network cannot be stored with more than 64 different
interfaces in a single set.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The RCU callback xt_rateest_free_rcu() just calls kfree(), so we can
use kfree_rcu() instead of call_rcu(). This also allows us to dispense
with an rcu_barrier() call, speeding up unloading of this module.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Patrick McHardy <kaber@trash.net>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
commit 58389c69150e6032504dfcd3edca6b1975c8b5bc
Author: Johannes Berg <johannes.berg@intel.com>
Date: Mon Jul 18 18:08:35 2011 +0200
cfg80211: allow userspace to control supported rates in scan
made single-band cards crash since it would always
access all wiphy->bands[]. Fix this and reject any
attempts in the new helper ieee80211_get_ratemask()
to do the same, rejecting rates configuration for
unsupported bands.
Reported-by: Pavel Roskin <proski@gnu.org>
Tested-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
ieee80211_stop_rx_ba_session() was calling sta_info_get()
without rcu locking, and the return value was not
checked.
This resulted in the following panic:
[<bf05726c>] (ieee80211_stop_rx_ba_session+0x0/0x60 [mac80211])
[<bf0abd94>] (wl1271_event_handle+0x0/0xdc8 [wl12xx])
[<bf0a7308>] (wl1271_irq+0x0/0x4a0 [wl12xx])
[<c00c40a8>] (irq_thread+0x0/0x254)
[<c00a7398>] (kthread+0x0/0x8c)
Signed-off-by: Eliad Peller <eliad@wizery.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
cfg80211_netdev_notifier_call() is configuring psm in case
of NL80211_IFTYPE_STATION interface type (on NETDEV_UP).
do the same for NL80211_IFTYPE_P2P_CLIENT interface type.
Signed-off-by: Eliad Peller <eliad@wizery.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In P2P client mode, the GO (AP) to connect to might
have periods of time where it is not available due
to powersave. To allow the driver to sync with it
and send frames to the GO only when it is available
add a new callback tx_sync (and the corresponding
finish_tx_sync). These callbacks can sleep unlike
the actual TX.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
combination of kern_path_parent() and lookup_create(). Does *not*
expose struct nameidata to caller. Syscalls converted to that...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Scanning currently uses the TX rate mask to
restrict the rate set, which is bogus. Make
it use the new set of rates from userspace.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some P2P scans are not allowed to advertise
11b rates, but that is a rather special case
so instead of having that, allow userspace
to request the rate sets (per band) that are
advertised in scan probe request frames.
Since it's needed in two places now, factor
out some common code parsing a rate array.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
open(2) must always include one of O_RDONLY, O_WRONLY, or O_RDWR. No need
for any O_APPEND special case.
Passing O_WRONLY|O_RDWR is undefined according to the man page, but the
Linux VFS interprets this as O_RDWR, so we'll do the same.
This fixes open(2) with flags O_RDWR|O_APPEND, which was incorrectly being
translated to readonly.
Reported-by: Fyodor Ustinov <ufm@ufm.su>
Signed-off-by: Sage Weil <sage@newdream.net>
Introduces a new nfnetlink type that applies a given
verdict to all queued packets with an id <= the id in the verdict
message.
If a mark is provided it is applied to all matched packets.
This reduces the number of verdicts that have to be sent.
Applications that make use of this feature need to maintain
a timeout to send a batchverdict periodically to avoid starvation.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Packet identifier is currently setup in nfqnl_build_packet_message(),
using one atomic_inc_return().
Problem is that since several cpus might concurrently call
nfqnl_enqueue_packet() for the same queue, we can deliver packets to
consumer in non monotonic way (packet N+1 being delivered after packet
N)
This patch moves the packet id setup from nfqnl_build_packet_message()
to nfqnl_enqueue_packet() to guarantee correct delivery order.
This also removes one atomic operation.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Florian Westphal <fw@strlen.de>
CC: Pablo Neira Ayuso <pablo@netfilter.org>
CC: Eric Leblond <eric@regit.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Add tx_conf array to save the current tx queues
configuration, and reconfig it on resume (ieee80211_reconfig).
On resume, the driver is being reconfigured. Without
reconfiguring the tx queues as well, the driver might
configure the device to use wrong ac params (e.g. ps-poll
instead of uapsd).
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Compiler is not smart enough to avoid double BSWAP instructions in
ntohl(inet_make_mask(plen)).
Lets cache this value in struct leaf_info, (fill a hole on 64bit arches)
With route cache disabled, this saves ~2% of cpu in udpflood bench on
x86_64 machine.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
nenetlink_queue operations on SMP are not efficent if several queues are
used, because of nfnl_mutex contention when applications give packet
verdict.
Use new call_rcu field in struct nfnl_callback to advertize a callback
that is called under rcu_read_lock instead of nfnl_mutex.
On my 2x4x2 machine, I was able to reach 2.000.000 pps going through
user land returning NF_ACCEPT verdicts without losses, instead of less
than 500.000 pps before patch.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Florian Westphal <fw@strlen.de>
CC: Eric Leblond <eric@regit.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Goal of this patch is to permit nfnetlink providers not mandate
nfnl_mutex being held while nfnetlink_rcv_msg() calls them.
If struct nfnl_callback contains a non NULL call_rcu(), then
nfnetlink_rcv_msg() will use it instead of call() field, holding
rcu_read_lock instead of nfnl_mutex
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Florian Westphal <fw@strlen.de>
CC: Eric Leblond <eric@regit.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
In the future dst entries will be neigh-less. In that environment we
need to have an easy transition point for current users of
dst->neighbour outside of the packet output fast path.
Signed-off-by: David S. Miller <davem@davemloft.net>
It just makes it harder to see 1) what the code is doing
and 2) grep for all users of dst{->,.}neighbour
Signed-off-by: David S. Miller <davem@davemloft.net>
This will get us closer to being able to do "neigh stuff"
completely independent of the underlying dst_entry for
protocols (ipv4/ipv6) that wish to do so.
We will also be able to make dst entries neigh-less.
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the caller has to change the value of task->tk_priority if
it wants to select on which priority level the task will sleep.
This patch allows the caller to select a priority level at sleep time
rather than always using task->tk_priority.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This throttles the allocation of new slots when the socket is busy
reconnecting and/or is out of buffer space.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
there is only one user of vlan_find_dev outside of the actual vlan code:
qlcnic uses it to iterate over some VLANs it knows.
let's just make vlan_find_dev private to the VLAN code and have the
iteration in qlcnic be a bit more direct. (a few rcu dereferences less
too)
Signed-off-by: David Lamparter <equinox@diac24.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Amit Kumar Salecha <amit.salecha@qlogic.com>
Cc: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Cc: linux-driver@qlogic.com
Signed-off-by: David S. Miller <davem@davemloft.net>
It's just taking on one of two possible values, either
neigh_ops->output or dev_queue_xmit(). And this is purely depending
upon whether nud_state has NUD_CONNECTED set or not.
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that hh_cache entries are embedded inside of neighbour
entries, their lifetimes and accesses are now synchronous
to that of the encompassing neighbour object.
Therefore we don't need to hook up the blackhole op to
hh_output on destroy.
Signed-off-by: David S. Miller <davem@davemloft.net>
Another regression fix considering incomming l2cap connections with
defer_setup enabled. In situations when incomming connection is
extracted with l2cap_sock_accept, it's bt_sock info will have
'parent' member zerroed, but 'parent' may be used unconditionally
in l2cap_conn_start() and l2cap_security_cfm() when defer_setup
is enabled.
Backtrace:
[<bf02d5ac>] (l2cap_security_cfm+0x0/0x2ac [bluetooth]) from [<bf01f01c>] (hci_event_pac
ket+0xc2c/0x4aa4 [bluetooth])
[<bf01e3f0>] (hci_event_packet+0x0/0x4aa4 [bluetooth]) from [<bf01a844>] (hci_rx_task+0x
cc/0x27c [bluetooth])
[<bf01a778>] (hci_rx_task+0x0/0x27c [bluetooth]) from [<c008eee4>] (tasklet_action+0xa0/
0x15c)
[<c008ee44>] (tasklet_action+0x0/0x15c) from [<c008f38c>] (__do_softirq+0x98/0x130)
r7:00000101 r6:00000018 r5:00000001 r4:efc46000
[<c008f2f4>] (__do_softirq+0x0/0x130) from [<c008f524>] (do_softirq+0x4c/0x58)
[<c008f4d8>] (do_softirq+0x0/0x58) from [<c008f5e0>] (run_ksoftirqd+0xb0/0x1b4)
r4:efc46000 r3:00000001
[<c008f530>] (run_ksoftirqd+0x0/0x1b4) from [<c009f2a8>] (kthread+0x84/0x8c)
r7:00000000 r6:c008f530 r5:efc47fc4 r4:efc41f08
[<c009f224>] (kthread+0x0/0x8c) from [<c008cc84>] (do_exit+0x0/0x5f0)
Signed-off-by: Ilia Kolomisnky <iliak@ti.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Caused by the following commit, partially revert it.
commit 9fa7e4f76f
Author: Gustavo F. Padovan <padovan@profusion.mobi>
Date: Thu Jun 30 16:11:30 2011 -0300
Bluetooth: Fix regression with incoming L2CAP connections
PTS test A2DP/SRC/SRC_SET/TC_SRC_SET_BV_02_I revealed that
( probably after the df3c3931e commit ) the l2cap connection
could not be established in case when the "Auth Complete" HCI
event does not arive before the initiator send "Configuration
request", in which case l2cap replies with "Command rejected"
since the channel is still in BT_CONNECT2 state.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 3262c816a3 "[PATCH] knfsd:
split svc_serv into pools", svc_delete_xprt (then svc_delete_socket) no
longer removed its xpt_ready (then sk_ready) field from whatever list it
was on, noting that there was no point since the whole list was about to
be destroyed anyway.
That was mostly true, but forgot that a few svc_xprt_enqueue()'s might
still be hanging around playing with the about-to-be-destroyed list, and
could get themselves into trouble writing to freed memory if we left
this xprt on the list after freeing it.
(This is actually functionally identical to a patch made first by Ben
Greear, but with more comments.)
Cc: stable@kernel.org
Cc: gnb@fmeh.org
Reported-by: Ben Greear <greearb@candelatech.com>
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Like svcauth_unix, the symbol svcauth_null is used external from this
file. Declare it as extern to quiet the following sparse noise:
warning: symbol 'svcauth_null' was not declared. Should it be static?
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Include the private header sunrpc.h to pickup the declaration of the
function svc_send_common to quiet the following sparse noise:
warning: symbol 'svc_send_common' was not declared. Should it be static?
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
As promised in feature-removal-schedule.txt it is time to
remove the nfsctl system call.
Userspace has perferred to not use this call throughout 2.6 and it has been
excluded in the default configuration since 2.6.36 (9 months ago).
So this patch removes all the code that was being compiled out.
There are still references to sys_nfsctl in various arch systemcall tables
and related code. These should be cleaned out too, probably in the next
merge window.
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
When suspending with all netdevs down, the device
is stopped but we still call a number of driver
callbacks that the driver might not expect. The
same happens during resume, we might call a few
callbacks without starting the driver. Fix this
by checking open_count around more things and
exiting quickly if it is 0.
Also, while at this I noticed that the coverage
class isn't reprogrammed after resume, so add
that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
ieee80211_iter_keys() currently returns keys in
the backward order they were installed in, which
is a bit confusing. Add them to the tail of the
key list to make sure iterations go in the same
order that keys were originally installed in.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When the driver wants to pre-program the TKIP
RX phase 1 key, it needs to be able to obtain
it for the peer's TA. Add API to allow it to
generate it.
The generation uses a dummy on-stack context
since it doesn't know the RX queue.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some chips may support different lengths of user-supplied IEs with a
single scheduled scan command than with a single normal scan command.
To support this, this patch creates a separate hardware description
element that describes the maximum size of user-supplied information
element data supported in scheduled scans.
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some chips can scan more SSIDs with a single scheduled scan command
than with a single normal scan command (eg. wl12xx chips).
To support this, this patch creates a separate hardware description
element that describes the amount of SSIDs supported in scheduled
scans.
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Since we now have the necessary API in place to support
GTK rekeying, applications will need to know whether it
is supported by a device. Add a pseudo-trigger that is
used only to advertise that capability. Also, add some
new triggers that match what iwlagn devices can do.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Change explicit references to CONFIG_NFS_V4_1 to implicit ones
Get rid of the unnecessary defines in backchannel_rqst.c and
bc_svc.c: the Makefile takes care of those dependency.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
There is no software fallback implemented for SCTP or FCoE checksumming,
and so it should not be passed on by software devices like bridge or bonding.
For VLAN devices, this is different. First, the driver for underlying device
should be prepared to get offloaded packets even when the feature is disabled
(especially if it advertises it in vlan_features). Second, devices under
VLANs do not get replaced without tearing down the VLAN first.
This fixes a mess I accidentally introduced while converting bonding to
ndo_fix_features.
NETIF_F_SOFT_FEATURES are removed from BOND_VLAN_FEATURES because they
are unused as of commit 712ae51afd.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packets to devices without NETIF_F_SCTP_CSUM (including NETIF_F_NO_CSUM)
should be properly checksummed because the packets can be diverted or
rerouted after construction. This still leaves packets diverted from
NETIF_F_SCTP_CSUM-enabled devices with broken checksums. Fixing this
needs implementing software offload fallback in networking core.
For users of sctp_checksum_disable, skb->ip_summed should be left as
CHECKSUM_NONE and not CHECKSUM_UNNECESSARY as per include/linux/skbuff.h.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove it, as it indirectly exposes netdev features. It's not used in
iproute2 (2.6.38) - is anything else using its interface?
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
The same information and more can be obtained by using ethtool
with ETHTOOL_GFEATURES.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is not used anywhere except net/core/dev.c now.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
vlan_features contains features inherited from underlying device.
NETIF_SOFT_FEATURES are not inherited but belong to the vlan device
itself (ensured in vlan_dev_fix_features()).
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the fact that ORing with zero is a no-op.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently we flush tp_status and then flush the remainder of the header+payload.
tp_status should be flushed in the end to avoid stale data being read by user-space.
Incorrectly re-ordered barriers in v1.
Signed-off-by: Chetan Loke <loke.chetan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that there is a one-to-one correspondance between neighbour
and hh_cache entries, we no longer need:
1) dynamic allocation
2) attachment to dst->hh
3) refcounting
Initialization of the hh_cache entry is indicated by hh_len
being non-zero, and such initialization is always done with
the neighbour's lock held as a writer.
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
SUNRPC: Fix use of static variable in rpcb_getport_async
NFSv4.1: update nfs4_fattr_bitmap_maxsz
SUNRPC: Fix a race between work-queue and rpc_killall_tasks
pnfs: write: Set mds_offset in the generic layer - it is needed by all LDs
In WoWLAN, devices may use crypto keys for TX/RX
and could also implement GTK rekeying. If the
driver isn't able to retrieve replay counters and
similar information from the device upon resume,
or if the device isn't responsive due to platform
issues, it isn't safe to keep the connection up
as GTK rekey messages from during the sleep time
could be replayed against it.
The only protection against that is disconnecting
from the AP. Modifying mac80211 to do that while
it is resuming would be very complex and invasive
in the case that the driver requires a reconfig,
so do it after it has resumed completely. In that
case, however, packets might be replayed since it
can then only happen after TX/RX are up again, so
mark keys for interfaces that need to disconnect
as "tainted" and drop all packets that are sent
or received with those keys.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
is_valid_ether_addr itself checks for is_zero_ether_addr
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This never, ever, happens.
Neighbour entries are always tied to one address family, and therefore
one set of dst_ops, and therefore one dst_ops->protocol "hh_type"
value.
This capability was blindly imported by Alexey Kuznetsov when he wrote
the neighbour layer.
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of all of the useless and costly indirection
by doing the neigh hash table lookup directly inside
of the neighbour binding.
Rename from arp_bind_neighbour to rt_bind_neighbour.
Use new helpers {__,}ipv4_neigh_lookup()
In rt_bind_neighbour() get rid of useless tests which
are never true in the context this function is called,
namely dev is never NULL and the dst->neighbour is
always NULL.
Signed-off-by: David S. Miller <davem@davemloft.net>
Almost all of these have long outstayed their welcome.
And for every one of these macros, there are 10 features for which we
didn't add macros.
Let's just delete them all, and get out of habit of doing things this
way.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
common dprint_status() macro is used in all callbacks but not in call_decode()
Signed-off-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Because struct rpcbind_args *map was declared static, if two
threads entered this method at the same time, the values
assigned to map could be sent two two differen tasks.
This could cause all sorts of problems, include use-after-free
and double-free of memory.
Fix this by removing the static declaration so that the map
pointer is on the stack.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We currently can free inetpeer entries too early :
[ 782.636674] WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (f130f44c)
[ 782.636677] 1f7b13c100000000000000000000000002000000000000000000000000000000
[ 782.636686] i i i i u u u u i i i i u u u u i i i i u u u u u u u u u u u u
[ 782.636694] ^
[ 782.636696]
[ 782.636698] Pid: 4638, comm: ssh Not tainted 3.0.0-rc5+ #270 Hewlett-Packard HP Compaq 6005 Pro SFF PC/3047h
[ 782.636702] EIP: 0060:[<c13fefbb>] EFLAGS: 00010286 CPU: 0
[ 782.636707] EIP is at inet_getpeer+0x25b/0x5a0
[ 782.636709] EAX: 00000002 EBX: 00010080 ECX: f130f3c0 EDX: f0209d30
[ 782.636711] ESI: 0000bc87 EDI: 0000ea60 EBP: f0209ddc ESP: c173134c
[ 782.636712] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 782.636714] CR0: 8005003b CR2: f0beca80 CR3: 30246000 CR4: 000006d0
[ 782.636716] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 782.636717] DR6: ffff4ff0 DR7: 00000400
[ 782.636718] [<c13fbf76>] rt_set_nexthop.clone.45+0x56/0x220
[ 782.636722] [<c13fc449>] __ip_route_output_key+0x309/0x860
[ 782.636724] [<c141dc54>] tcp_v4_connect+0x124/0x450
[ 782.636728] [<c142ce43>] inet_stream_connect+0xa3/0x270
[ 782.636731] [<c13a8da1>] sys_connect+0xa1/0xb0
[ 782.636733] [<c13a99dd>] sys_socketcall+0x25d/0x2a0
[ 782.636736] [<c149deb8>] sysenter_do_call+0x12/0x28
[ 782.636738] [<ffffffff>] 0xffffffff
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't have multiple RX queues, so there's no use
in allocating multiple, use alloc_netdev_mqs() to
allocate multiple TX but only one RX queue.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
mac80211 maintains a running average of the RSSI when a STA
is associated to an AP. Report threshold events to any driver
that has registered callbacks for getting RSSI measurements.
Implement callbacks in mac80211 so that driver can set thresholds.
Add callbacks in mac80211 which is invoked when an RSSI threshold
event occurs.
mac80211: add tracing to rssi_reports api and remove extraneous fn argument
mac80211: scale up rssi thresholds from driver by 16 before storing
Signed-off-by: Meenakshi Venkataraman <meenakshi.venkataraman@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
And mask the hash function result by simply shifting
down the "->hash_shift" most significant bits.
Currently which bits we use is arbitrary since jhash
produces entropy evenly across the whole hash function
result.
But soon we'll be using universal hashing functions,
and in those cases more entropy exists in the higher
bits than the lower bits, because they use multiplies.
Signed-off-by: David S. Miller <davem@davemloft.net>
There can 3 reasons for the "command reject" reply produced
by the stack. Each such reply should be accompanied by the
relevand data ( as defined in spec. ). Currently there is one
instance of "command reject" reply with reason "invalid cid"
wich is fixed. Also, added clean-up definitions related to the
"command reject" replies.
Signed-off-by: Ilia Kolomisnky <iliak@ti.com>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This will be useful when userspace wants to restrict some kinds of
operations based on the length of the key size used to encrypt the
link.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
In some cases it will be useful having the key size used for
encrypting the link. For example, some profiles may restrict
some operations depending on the key length.
The key size is stored in the key that is passed to userspace
using the pin_length field in the key structure.
For now this field is only valid for LE controllers. 3.0+HS
controllers define the Read Encryption Key Size command, this
field is intended for storing the value returned by that
command.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
As the key format has changed to something that has a dynamic size,
the way that keys are received and sent must be changed.
The structure fields order is changed to make the parsing of the
information received from the Management Interface easier.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Now that it's possible that the exchanged key is present in
the link key list, we may be able to estabilish security with
an already existing key, without need to perform any SMP
procedure.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
With this we can use only one place to store all keys, without
need to use a field in the connection structure for this
purpose.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Now when the LTK is received from the remote or generated it is stored,
so it can later be used.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Before implementing SM key distribution, the pairing features
exchange must be better negotiated, taking into account some
features of the host and connection requirements.
If we are in the "not pairable" state, it makes no sense to
exchange any key. This allows for simplification of the key
negociation method.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Now that we have methods to finding keys by its parameters we can
reject an encryption request if the key isn't found.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
As the LTK (the new type of key being handled now) has more data
associated with it, we need to store this extra data and retrieve
the keys based on that data.
Methods for searching for a key and for adding a new LTK are
introduced here.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Since ca5ecddf (rcu: define __rcu address space modifier for sparse)
rcu_dereference_check use rcu_read_lock_held as a part of condition
automatically so callers do not have to do that as well.
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This adds support for generating and distributing all the keys
specified in the third phase of SMP.
This will make possible to re-establish secure connections, resolve
private addresses and sign commands.
For now, the values generated are random.
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Trigger user ABORT if application closes a socket which has data
queued on the socket receive queue or chunks waiting on the
reassembly or ordering queue as this would imply data being lost
which defeats the point of a graceful shutdown.
This behavior is already practiced in TCP.
We do not check the input queue because that would mean to parse
all chunks on it to look for unacknowledged data which seems too
much of an effort. Control chunks or duplicated chunks may also
be in the input queue and should not be stopping a graceful
shutdown.
Signed-off-by: Thomas Graf <tgraf@infradead.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to release "dcb_lock" which we took on the previous line.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upon "ip xfrm state update ..", xfrm_add_sa() takes an extra reference on
the user-supplied SA and forgets to drop the reference when
xfrm_state_update() returns 0. This leads to a memory leak as the
parameter SA is never freed. This change attempts to fix the leak by
calling __xfrm_state_put() when xfrm_state_update() updates a valid SA
(err = 0). The parameter SA is added to the gc list when the final
reference is dropped by xfrm_add_sa() upon completion.
Signed-off-by: Tushar Gohad <tgohad@mvista.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we try to stop a scheduled scan while it is not running, we should
return -ENOENT instead of simply ignoring the command and returning
success. This is more consistent with other parts of the code.
Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
A panic was observed when the device is failed to resume properly,
and there are no running interfaces. ieee80211_reconfig tries
to restart STA timers on unassociated state.
Cc: stable@kernel.org
Signed-off-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In order to support pre-populating the P1K cache in
iwlwifi hardware for WoWLAN, we need to calculate
the P1K for the current IV32. Allow drivers to get
the P1K for any given IV32 instead of for a given
packet, but keep the packet-based version around as
an inline.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In order to implement GTK rekeying, the device needs
to be able to encrypt frames with the right PN/IV and
check the PN/IV in RX frames. To be able to tell it
about all those counters, we need to be able to get
them from mac80211, this adds the required API.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The current rx->queue value is slightly confusing.
It is set to 16 on non-QoS frames, including data,
and then used for sequence number and PN/IV checks.
Until recently, we had a TKIP IV checking bug that
had been introduced in 2008 to fix a seqno issue.
Before that, we always used TID 0 for checking the
PN or IV on non-QoS packets.
Go back to the old status for PN/IV checks using
the TID 0 counter for non-QoS by splitting up the
rx->queue value into "seqno_idx" and "security_idx"
in order to avoid confusion in the future. They
each have special rules on the value used for non-
QoS data frames.
Since the handling is now unified, also revert the
special TKIP handling from my patch
"mac80211: fix TKIP replay vulnerability".
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
mac80211 has a defnition of AES_BLOCK_SIZE and
multiple definitions of AES_BLOCK_LEN. Remove
them all and use crypto/aes.h.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Just like TKIP and CCMP, CMAC has the PN race.
It might not actually be possible to hit it now
since there aren't multiple ACs for management
frames, but fix it anyway.
Also move scratch buffers onto the stack.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Since we can process multiple packets at the
same time for different ACs, but the PN is
allocated from a single counter, we need to
use an atomic value there. Use atomic64_t to
make this cheaper on 64-bit platforms, other
platforms will support this through software
emulation, see lib/atomic64.c.
We also need to use an on-stack scratch buf
so that multiple packets won't corrupt each
others scratch buffers.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Our current TKIP code races against itself on TX
since we can process multiple packets at the same
time on different ACs, but they all share the TX
context for TKIP. This can lead to bad IVs etc.
Also, the crypto offload helper code just obtains
the P1K/P2K from the cache, and can update it as
well, but there's no guarantee that packets are
really processed in order.
To fix these issues, first introduce a spinlock
that will protect the IV16/IV32 values in the TX
context. This first step makes sure that we don't
assign the same IV multiple times or get confused
in other ways.
Secondly, change the way the P1K cache works. I
add a field "p1k_iv32" that stores the value of
the IV32 when the P1K was last recomputed, and
if different from the last time, then a new P1K
is recomputed. This can cause the P1K computation
to flip back and forth if packets are processed
out of order. All this also happens under the new
spinlock.
Finally, because there are argument differences,
split up the ieee80211_get_tkip_key() API into
ieee80211_get_tkip_p1k() and ieee80211_get_tkip_p2k()
and give them the correct arguments.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Since rpc_killall_tasks may modify the rpc_task's tk_action field
without any locking, we need to be careful when dereferencing it.
Reported-by: Ben Greear <greearb@candelatech.com>
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
When initiating a graceful shutdown while having data chunks
on the retransmission queue with a peer which is in zero
window mode the shutdown is never completed because the
retransmission error count is reset periodically by the
following two rules:
- Do not timeout association while doing zero window probe.
- Reset overall error count when a heartbeat request has
been acknowledged.
The graceful shutdown will wait for all outstanding TSN to
be acknowledged before sending the SHUTDOWN request. This
never happens due to the peer's zero window not acknowledging
the continuously retransmitted data chunks. Although the
error counter is incremented for each failed retransmission,
the receiving of the SACK announcing the zero window clears
the error count again immediately. Also heartbeat requests
continue to be sent periodically. The peer acknowledges these
requests causing the error counter to be reset as well.
This patch changes behaviour to only reset the overall error
counter for the above rules while not in shutdown. After
reaching the maximum number of retransmission attempts, the
T5 shutdown guard timer is scheduled to give the receiver
some additional time to recover. The timer is stopped as soon
as the receiver acknowledges any data.
The issue can be easily reproduced by establishing a sctp
association over the loopback device, constantly queueing
data at the sender while not reading any at the receiver.
Wait for the window to reach zero, then initiate a shutdown
by killing both processes simultaneously. The association
will never be freed and the chunks on the retransmission
queue will be retransmitted indefinitely.
Signed-off-by: Thomas Graf <tgraf@infradead.org>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (31 commits)
sctp: fix missing send up SCTP_SENDER_DRY_EVENT when subscribe it
net: refine {udp|tcp|sctp}_mem limits
vmxnet3: round down # of queues to power of two
net: sh_eth: fix the parameter for the ETHER of SH7757
net: sh_eth: fix cannot work half-duplex mode
net: vlan: enable soft features regardless of underlying device
vmxnet3: fix starving rx ring whenoc_skb kb fails
bridge: Always flood broadcast packets
greth: greth_set_mac_add would corrupt the MAC address.
net: bind() fix error return on wrong address family
natsemi: silence dma-debug warnings
net: 8139too: Initial necessary vlan_features to support vlan
Fix call trace when interrupts are disabled while sleeping function kzalloc is called
qlge:Version change to v1.00.00.29
qlge: Fix printk priority so chip fatal errors are always reported.
qlge:Fix crash caused by mailbox execution on wedged chip.
xfrm4: Don't call icmp_send on local error
ipv4: Don't use ufo handling on later transformed packets
xfrm: Remove family arg from xfrm_bundle_ok
ipv6: Don't put artificial limit on routing table size.
...
The ERTM receive buffer is now handled in a way that does not require
the busy queue and the associated polling code.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This change moves most L2CAP ERTM receive buffer handling out of the
L2CAP core and in to the socket code. It's up to the higher layer
(the socket code, in this case) to tell the core when its buffer is
full or has space available. The recv op should always accept
incoming ERTM data or else the connection will go down.
Within the socket layer, an skb that does not fit in the socket
receive buffer will be temporarily stored. When the socket is read
from, that skb will be placed in the receive buffer if possible. Once
adequate buffer space becomes available, the L2CAP core is informed
and the ERTM local busy state is cleared.
Receive buffer management for non-ERTM modes is unchanged.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
The local busy state is entered and exited based on buffer status in
the socket layer (or other upper layer). This change is in
preparation for general buffer status reports from the socket layer,
which will then be used to change the local busy status.
Signed-off-by: Mat Martineau <mathewm@codeaurora.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Unlike CCMP, the presence or absence of the QoS
field doesn't change the encryption, only the
TID is used. When no QoS field is present, zero
is used as the TID value. This means that it is
possible for an attacker to take a QoS packet
with TID 0 and replay it as a non-QoS packet.
Unfortunately, mac80211 uses different IVs for
checking the validity of the packet's TKIP IV
when it checks TID 0 and when it checks non-QoS
packets. This means it is vulnerable to this
replay attack.
To fix this, use the same replay counter for
TID 0 and non-QoS packets by overriding the
rx->queue value to 0 if it is 16 (non-QoS).
This is a minimal fix for now. I caused this
issue in
commit 1411f9b531
Author: Johannes Berg <johannes@sipsolutions.net>
Date: Thu Jul 10 10:11:02 2008 +0200
mac80211: fix RX sequence number check
while fixing a sequence number issue (there,
a separate counter needs to be used).
Cc: stable@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We were not allocating memory for the IEs passed in the scheduled_scan
request and this was causing memory corruption (buffer overflow).
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In case of tt_crc mismatching for a certain orig_node after applying the
changes, the node must request the full table immediately.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
To keep consistency of other originator tables, new clients detected as
roamed, are kept in the global table but are marked as TT_CLIENT_PENDING
They are purged only when the new ttvn is received by the corresponding
originator. Moreover they need to be considered as removed in case of global
transtable lookup.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
To keep transtable consistency among all the nodes, an originator must
not send not yet announced clients within a full table TT_RESPONSE.
Instead, deleted client have to be kept in the table in order to be sent
within an immediate TT_RESPONSE. In this way all the nodes in the
network will always provide the same response for the same request.
All the modification are committed at the next ttvn increment event.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
The last_ttvn and tt_crc fields of the orig_node structure were not
initialised causing an immediate TT_REQ/RES dialogue even if not needed.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
af_packet.c:(.text+0x3d130): undefined reference to `ip_defrag'
or
ERROR: "ip_defrag" [net/packet/af_packet.ko] undefined!
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fanout_add() might return with fanout_mutex held.
Reduce indentation level while we are at it
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds userspace buffers support in skb shared info. A new
struct skb_ubuf_info is needed to maintain the userspace buffers
argument and index, a callback is used to notify userspace to release
the buffers once lower device has done DMA (Last reference to that skb
has gone).
If there is any userspace apps to reference these userspace buffers,
then these userspaces buffers will be copied into kernel. This way we
can prevent userspace apps from holding these userspace buffers too long.
Use destructor_arg to point to the userspace buffer info; a new tx flags
SKBTX_DEV_ZEROCOPY is added for zero-copy buffer check.
Signed-off-by: Shirley Ma <xma@...ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RFC 6164 requires that routers MUST disable Subnet-Router anycast
for the prefix when /127 prefixes are used.
No need for matching code in addrconf_leave_anycast() as it
will silently ignore any attempt to leave an unknown anycast
address.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
We forgot to send up SCTP_SENDER_DRY_EVENT notification when
user app subscribes to this event, and there is no data to be
sent or retransmit.
This is required by the Socket API and used by the DTLS/SCTP
implementation.
Reported-by: Michael Tüxen <Michael.Tuexen@lurchi.franken.de>
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Tested-by: Robin Seggelmann <seggelmann@fh-muenster.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current tcp/udp/sctp global memory limits are not taking into account
hugepages allocations, and allow 50% of ram to be used by buffers of a
single protocol [ not counting space used by sockets / inodes ...]
Lets use nr_free_buffer_pages() and allow a default of 1/8 of kernel ram
per protocol, and a minimum of 128 pages.
Heavy duty machines sysadmins probably need to tweak limits anyway.
References: https://bugzilla.stlinux.com/show_bug.cgi?id=38032
Reported-by: starlight <starlight@binnacle.cx>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The enable_smp parameter is no longer needed. It can be replaced by
checking lmp_host_le_capable.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
Since we have the extended LMP features properly implemented, we
should check the LMP_HOST_LE bit to know if the host supports LE.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This patch adds a new module parameter to enable/disable host LE
support. By default host LE support is disabled.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This patch adds a handler to Write LE Host Supported command complete
events. Once this commands has completed successfully, we should
read the extended LMP features and update the extfeatures field in
hci_dev.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This new field holds the extended LMP features value. Some LE
mechanism such as discovery procedure needs to read the extended
LMP features to work properly.
Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo F. Padovan <padovan@profusion.mobi>
This adds the necessary mac80211 APIs to support
GTK rekey offload, mirroring the functionality
from cfg80211.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In certain circumstances, like WoWLAN scenarios,
devices may implement (partial) GTK rekeying on
the device to avoid waking up the host for it.
In order to successfully go through GTK rekeying,
the KEK, KCK and the replay counter are required.
Add API to let the supplicant hand the parameters
to the driver which may store it for future GTK
rekey operations.
Note that, of course, if GTK rekeying is done by
the device, the EAP frame must not be passed up
to userspace, instead a rekey event needs to be
sent to let userspace update its replay counter.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When in suspend/wowlan, devices might implement crypto
offload differently (more features), and might require
reprogramming keys for the WoWLAN (as it is the case
for Intel devices that use another uCode image). Thus
allow the driver to iterate all keys in this context.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When we clone the SKB, we forget about the original
one. Avoid this problem by using skb_share_check().
Reported-by: Penttilä Mika <mika.penttila@ixonos.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unfortunately we have to use a real modulus here as
the multiply trick won't work as effectively with cpu
numbers as it does with rxhash values.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch add an unsolicited notification of the DCBX negotiated
parameters for the CEE flavor of the DCBX protocol. The notification
message is identical to the aggregated CEE get operation and holds all
the pertinent local and peer information. The notification routine is
exported so it can be invoked by drivers supporting an embedded DCBX
stack.
Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following couple of patches add dcbnl an unsolicited notification of
the the DCB configuration for the CEE flavor of the DCBX protocol. This
is useful when the user-mode DCB client is not responsible for
conducting and resolving the DCBX negotiation (either because the DCBX
stack is embedded in the HW or the negotiation is handled by another
agent in the host), but still needs to get the negotiated parameters.
This functionality already exists for the IEEE flavor of the DCBX
protocol and these patches add it to the older CEE flavor.
The first patch extends the CEE attribute GET operation to include not
only the peer information, but also all the pertinent local
configuration (negotiated parameters). The second patch adds and export
a CEE specific notification routine.
Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>