One of the problem with sock memory accounting is it uses
a pair of sock_hold()/sock_put() for each transmitted packet.
This slows down bidirectional flows because the receive path
also needs to take a refcount on socket and might use a different
cpu than transmit path or transmit completion path. So these
two atomic operations also trigger cache line bounces.
We can see this in tx or tx/rx workloads (media gateways for example),
where sock_wfree() can be in top five functions in profiles.
We use this sock_hold()/sock_put() so that sock freeing
is delayed until all tx packets are completed.
As we also update sk_wmem_alloc, we could offset sk_wmem_alloc
by one unit at init time, until sk_free() is called.
Once sk_free() is called, we atomic_dec_and_test(sk_wmem_alloc)
to decrement initial offset and atomicaly check if any packets
are in flight.
skb_set_owner_w() doesnt call sock_hold() anymore
sock_wfree() doesnt call sock_put() anymore, but check if sk_wmem_alloc
reached 0 to perform the final freeing.
Drawback is that a skb->truesize error could lead to unfreeable sockets, or
even worse, prematurely calling __sk_free() on a live socket.
Nice speedups on SMP. tbench for example, going from 2691 MB/s to 2711 MB/s
on my 8 cpu dev machine, even if tbench was not really hitting sk_refcnt
contention point. 5 % speedup on a UDP transmit workload (depends
on number of flows), lowering TX completion cpu usage.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to handle powersave frames properly we had needed
to pass these out to the device queues again, and introduce
the skb->requeue bit. This, however, also has unnecessary
overhead by needing to 'clean up' already tried frames, and
this clean-up code is also buggy when software encryption
is used.
Instead of sending the frames via the master netdev queue
again, simply put them into the pending queue. This also
fixes a problem where frames for that particular station
could be reordered when some were still on the software
queues and older ones are re-injected into the software
queue after them.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Add a netlink interface for configuration of IEEE 802.15.4 device. Also this
interface specifies events notification sent by devices towards higher layers.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Sergey Lapin <slapin@ossfans.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for communication over IEEE 802.15.4 networks. This implementation
is neither certified nor complete, but aims to that goal. This commit contains
only the socket interface for communication over IEEE 802.15.4 networks.
One can either send RAW datagrams or use SOCK_DGRAM to encapsulate data
inside normal IEEE 802.15.4 packets.
Configuration interface, drivers and software MAC 802.15.4 implementation will
follow.
Initial implementation was done by Maxim Gorbachyov, Maxim Osipov and Pavel
Smolensky as a research project at Siemens AG. Later the stack was heavily
reworked to better suit the linux networking model, and is now maitained
as an open project partially sponsored by Siemens.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Signed-off-by: Sergey Lapin <slapin@ossfans.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change PSCHED_SHIFT from 10 to 6 to increase schedulers time
resolution. This will increase 16x a number of (internal) ticks per
nanosecond, and is needed to improve accuracy of schedulers based on
rate tables, like HTB, TBF or CBQ, with rates above 100Mbit. It is
assumed this change is safe for 32bit accounting of time diffs up
to 2 minutes, which should be enough for common use (extremely low
rate values may overflow, so get inaccurate instead). To make full
use of this change an updated iproute2 will be needed. (But using
older iproute2 should be safe too.)
This change breaks ticks - microseconds similarity, so some minor code
fixes might be needed. It is also planned to change naming adequately
eg. to PSCHED_TICKS2NS() etc. in the near future.
Reported-by: Antonio Almeida <vexwek@gmail.com>
Tested-by: Antonio Almeida <vexwek@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use PSCHED_SHIFT constant instead of '10' in PSCHED_US2NS() and
PSCHED_NS2US() macros to enable changing this value later.
Additionally use PSCHED_SHIFT in sch_hfsc SM_SHIFT and ISM_SHIFT
definitions. This part of the patch is based on feedback from
Patrick McHardy <kaber@trash.net>.
Reported-by: Antonio Almeida <vexwek@gmail.com>
Tested-by: Antonio Almeida <vexwek@gmail.com>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Furthermore, it twiddles with the details of SKB list handling
directly, which we're trying to eliminate.
Signed-off-by: David S. Miller <davem@davemloft.net>
To be easier on drivers and users, have cfg80211 register an
rfkill structure that drivers can access. When soft-killed,
simply take down all interfaces; when hard-killed the driver
needs to notify us and we will take down the interfaces
after the fact. While rfkilled, interfaces cannot be set UP.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch introduces new cfg80211 API to set the TX power
via cfg80211, puts the wext code into cfg80211 and updates
mac80211 to use all that. The -ENETDOWN bits are a hack but
will go away soon.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch completely rewrites the rfkill core to address
the following deficiencies:
* all rfkill drivers need to implement polling where necessary
rather than having one central implementation
* updating the rfkill state cannot be done from arbitrary
contexts, forcing drivers to use schedule_work and requiring
lots of code
* rfkill drivers need to keep track of soft/hard blocked
internally -- the core should do this
* the rfkill API has many unexpected quirks, for example being
asymmetric wrt. alloc/free and register/unregister
* rfkill can call back into a driver from within a function the
driver called -- this is prone to deadlocks and generally
should be avoided
* rfkill-input pointlessly is a separate module
* drivers need to #ifdef rfkill functions (unless they want to
depend on or select RFKILL) -- rfkill should provide inlines
that do nothing if it isn't compiled in
* the rfkill structure is not opaque -- drivers need to initialise
it correctly (lots of sanity checking code required) -- instead
force drivers to pass the right variables to rfkill_alloc()
* the documentation is hard to read because it always assumes the
reader is completely clueless and contains way TOO MANY CAPS
* the rfkill code needlessly uses a lot of locks and atomic
operations in locked sections
* fix LED trigger to actually change the LED when the radio state
changes -- this wasn't done before
Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> [thinkpad]
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo has updated the driver to no longer use the change flag,
so we can remove that, but rt2x00 and ath5k still use the
actual value so let's mark it as deprecated too.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Prior implementation of the new sctp_connectx() call that returns
an association ID did not work correctly on non-blocking socket.
This is because we could not return both a EINPROGRESS error and
an association id. This is a new implementation that supports this.
Originally from Ivan Skytte Jørgensen <isj-sctp@i1.dk
Signed-off-by: Ivan Skytte Jørgensen <isj-sctp@i1.dk
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
RFC 5061 Section 5.1 ASCONF Chunk Procedures said:
B4) Re-transmit the ASCONF Chunk last sent and if possible choose an
alternate destination address (please refer to [RFC4960],
Section 6.4.1). An endpoint MUST NOT add new parameters to this
chunk; it MUST be the same (including its Sequence Number) as
the last ASCONF sent. An endpoint MAY, however, bundle an
additional ASCONF with new ASCONF parameters with the next
Sequence Number. For details, see Section 5.5.
This patch fix to choose an alternate destination address when
re-transmit the ASCONF chunk, with some dup codes cleanup.
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Define three accessors to get/set dst attached to a skb
struct dst_entry *skb_dst(const struct sk_buff *skb)
void skb_dst_set(struct sk_buff *skb, struct dst_entry *dst)
void skb_dst_drop(struct sk_buff *skb)
This one should replace occurrences of :
dst_release(skb->dst)
skb->dst = NULL;
Delete skb->dst field
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb
Delete skb->rtable field
Setting rtable is not allowed, just set dst instead as rtable is an alias.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After some discussion offline with Christoph Lameter and David Stevens
regarding multicast behaviour in Linux, I'm submitting a slightly
modified patch from the one Christoph submitted earlier.
This patch provides a new socket option IP_MULTICAST_ALL.
In this case, default behaviour is _unchanged_ from the current
Linux standard. The socket option is set by default to provide
original behaviour. Sockets wishing to receive data only from
multicast groups they join explicitly will need to clear this
socket option.
Signed-off-by: Nivedita Singhvi <niv@us.ibm.com>
Signed-off-by: Christoph Lameter<cl@linux.com>
Acked-by: David Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The patch moves some utility functions from mac80211 to cfg80211.
Because these functions are doing generic 802.11 operations so they
are not mac80211 specific. The moving allows some fullmac drivers
to be also benefit from these utility functions.
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: Samuel Ortiz <samuel.ortiz@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This introduces genl_register_family_with_ops() that registers a genetlink
family along with operations from a table. This is used to kill copy'n'paste
occurrences in following patches.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
The netlink message header (struct nlmsghdr) is an unused parameter in
fill method of fib_rules_ops struct. This patch removes this
parameter from this method and fixes the places where this method is
called.
(include/net/fib_rules.h)
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Moving information from config_interface to bss_info_changed
removed struct ieee80211_if_conf which the documentation still
refers to, additionally there's one kernel-doc description too
much and one other missing, fix all this.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Some applications using wireless extensions expect to be able to
remove a key that doesn't exist. One example is wpa_supplicant
which doesn't actually change behaviour when running into an
error while trying to do that, but it prints an error message
which users interpret as wpa_supplicant having problems.
The safe thing to do is not change the behaviour of wireless
extensions any more, so when the driver reports -ENOENT let
the wext bridge code return success to userspace. To guarantee
this, also document that drivers should return -ENOENT when the
key doesn't exist.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This allows drivers to use a const pointer as the privid without a cast.
Signed-off-by: David Kilroy <kilroyd@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This allows drivers to mark their cfg80211_ops tables const.
Signed-off-by: David Kilroy <kilroyd@googlemail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This is more consistent with our nl80211 naming convention
for HT40-/+.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We are not correctly listening to the regulatory max bandwidth
settings. To actually make use of it we need to redesign things
a bit. This patch does the work for that. We do this to so we
can obey to regulatory rules accordingly for use of HT40.
We end up dealing with HT40 by having two passes for each channel.
The first check will see if a 20 MHz channel fits into the channel's
center freq on a given frequency range. We check for a 20 MHz
banwidth channel as that is the maximum an individual channel
will use, at least for now. The first pass will go ahead and
check if the regulatory rule for that given center of frequency
allows 40 MHz bandwidths and we use this to determine whether
or not the channel supports HT40 or not. So to support HT40 you'll
need at a regulatory rule that allows you to use 40 MHz channels
but you're channel must also be enabled and support 20 MHz by itself.
The second pass is done after we do the regulatory checks over
an device's supported channel list. On each channel we'll check
if the control channel and the extension both:
o exist
o are enabled
o regulatory allows 40 MHz bandwidth on its frequency range
This work allows allows us to idependently check for HT40- and
HT40+.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
be sent periodically. The rs_delay can be speficied when adding the
PRL entry and defaults to 15 minutes.
The RS is sent from every link local adress that's assigned to the
tunnel interface. It's directed to the (guessed) linklocal address
of the router and is sent through the tunnel.
Better: send to ff02::2 encapsuled in unicast directed to router-v4.
Signed-off-by: Sascha Hlusiak <contact@saschahlusiak.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new NL80211_ATTR_CONTROL_PORT flag for NL80211_CMD_ASSOCIATE to
allow user space to indicate that it will control the IEEE 802.1X port
in station mode. Previously, mac80211 was always marking the port
authorized in station mode. This was enough when drop_unencrypted flag
was set. However, drop_unencrypted can currently be controlled only
with WEXT and the current nl80211 design does not allow fully secure
configuration. Fix this by providing a mechanism for user space to
control the IEEE 802.1X port in station mode (i.e., do the same that
we are already doing in AP mode).
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
It is currently not possible to modify station flags, but that
capability would be very useful. This patch introduces a new
nl80211 attribute that contains a set/mask for station flags,
and updates the internal API (and mac80211) to mirror that.
The new attribute is parsed before falling back to the old so
that userspace can specify both (if it can) to work on all
kernels.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Move key handling wireless extension ioctls from mac80211 to cfg80211
so that all drivers that implement the cfg80211 operations get wext
compatibility.
Note that this drops the SIOCGIWENCODE ioctl support for getting
IW_ENCODE_RESTRICTED/IW_ENCODE_OPEN. This means that iwconfig will
no longer report "Security mode:open" or "Security mode:restricted"
for mac80211. However, what we displayed there (the authentication
algo used) was actually wrong -- linux/wireless.h states that this
setting is meant to differentiate between "Refuse non-encoded packets"
and "Accept non-encoded packets".
(Combined with "cfg80211: fix a couple of bugs with key ioctls". -- JWL)
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
To make it more apparent in the code what is for wext
only (and needs to be #ifdef'ed) put all the info for
wext into a substruct in each wireless_dev.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The address pointed to by mac_addr can be marked as const.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This struct is no longer used (and hasn't been for a while).
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There really is no need to have a separate struct for a
single variable. The fact that it exists is due to the
code legacy, but we can remove that now. Very simple.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
NL80211_CMD_ASSOCIATE request must be able to indicate whether
management frame protection (IEEE 802.11w) is being used. mac80211 was
able to use MFP in client mode only with WEXT, but the new
NL80211_ATTR_USE_MFP attribute will allow this to be done with
nl80211, too.
Since we are currently using nl80211 for MFP only with drivers that
use user space SME, only MFP disabled and required values are
used. However, the NL80211_ATTR_USE_MFP attribute is an enum that can
be extended with MFP optional in the future, if that is needed with
some drivers (e.g., if the RSN IE is generated by the driver).
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We can avoid waking up tasks not interested in receive notifications,
using wake_up_interruptible_poll() instead of wake_up_interruptible()
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Small cleanup patch to reduce line lengths, before a change in
tcp_prequeue().
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we aren't doing anything in mac80211, we can turn off
much of the hardware, depending on the driver/hw. Not doing
anything, aka being idle, means:
* no monitor interfaces
* no AP/mesh/wds interfaces
* any station interfaces are in DISABLED state
* any IBSS interfaces aren't trying to be in a network
* we aren't trying to scan
By creating a new function that verifies these conditions and calling
it at strategic points where the states of those conditions change,
we can easily make mac80211 tell the driver when we are idle to save
power.
Additionally, this fixes a small quirk where a recalculated powersave
state is passed to the driver even if the hardware is about to stopped
completely.
This patch intentionally doesn't touch radio_enabled because that is
currently implemented to be a soft rfkill which is inappropriate here
when we need to be able to wake up with low latency.
One thing I'm not entirely sure about is this:
phy0: device no longer idle - in use
wlan0: direct probe to AP 00:11:24:91:07:4d try 1
wlan0 direct probe responded
wlan0: authenticate with AP 00:11:24:91:07:4d
wlan0: authenticated
> phy0: device now idle
> phy0: device no longer idle - in use
wlan0: associate with AP 00:11:24:91:07:4d
wlan0: RX AssocResp from 00:11:24:91:07:4d (capab=0x401 status=0 aid=1)
wlan0: associated
Is it appropriate to go into idle state for a short time when we have
just authenticated, but not associated yet? This happens only with the
userspace SME, because we cannot really know how long it will wait
before asking us to associate. Would going idle after a short timeout
be more appropriate? We may need to revisit this, depending on what
happens.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The config_interface method is a little strange, it contains the
BSSID and beacon updates, while bss_info_changed contains most
other BSS information for each interface. This patch removes
config_interface and rolls all the information it previously
passed to drivers into bss_info_changed.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We currently have two beacon interval configuration knobs:
hw.conf.beacon_int and vif.bss_info.beacon_int. This is
rather confusing, even though the former is used when we
beacon ourselves and the latter when we are associated to
an AP.
This just deprecates the hw.conf.beacon_int setting in favour
of always using vif.bss_info.beacon_int. Since it touches all
the beaconing IBSS code anyway, we can also add support for
the cfg80211 IBSS beacon interval configuration easily.
NOTE: The hw.conf.beacon_int setting is retained for now due
to drivers still using it -- I couldn't untangle all
drivers, some are updated in this patch.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Kalle points out that max_sleep_interval is somewhat confusing
because the value is measured in beacon intervals, and not in
TU. Rename it to max_sleep_period to be consistent with things
like DTIM period that are also measured in beacon intervals.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits)
e1000: fix virtualization bug
bonding: fix alb mode locking regression
Bluetooth: Fix issue with sysfs handling for connections
usbnet: CDC EEM support (v5)
tcp: Fix tcp_prequeue() to get correct rto_min value
ehea: fix invalid pointer access
ne2k-pci: Do not register device until initialized.
Subject: [PATCH] br2684: restore net_dev initialization
net: Only store high 16 bits of kernel generated filter priorities
virtio_net: Fix function name typo
virtio_net: Cleanup command queue scatterlist usage
bonding: correct the cleanup in bond_create()
virtio: add missing include to virtio_net.h
smsc95xx: add support for LAN9512 and LAN9514
smsc95xx: configure LED outputs
netconsole: take care of NETDEV_UNREGISTER event
xt_socket: checks for the state of nf_conntrack
bonding: bond_slave_info_query() fix
cxgb3: fixing gcc 4.4 compiler warning: suggest parentheses around operand of ‘!’
netfilter: use likely() in xt_info_rdlock_bh()
...
Due to a semantic changes in flush_workqueue() the current approach of
synchronizing the sysfs handling for connections doesn't work anymore. The
whole approach is actually fully broken and based on assumptions that are
no longer valid.
With the introduction of Simple Pairing support, the creation of low-level
ACL links got changed. This change invalidates the reason why in the past
two independent work queues have been used for adding/removing sysfs
devices. The adding of the actual sysfs device is now postponed until the
host controller successfully assigns an unique handle to that link. So
the real synchronization happens inside the controller and not the host.
The only left-over problem is that some internals of the sysfs device
handling are not initialized ahead of time. This leaves potential access
to invalid data and can cause various NULL pointer dereferences. To fix
this a new function makes sure that all sysfs details are initialized
when an connection attempt is made. The actual sysfs device is only
registered when the connection has been successfully established. To
avoid a race condition with the registration, the check if a device is
registered has been moved into the removal work.
As an extra protection two flush_work() calls are left in place to
make sure a previous add/del work has been completed first.
Based on a report by Marc Pignat <marc.pignat@hevs.ch>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Tested-by: Justin P. Mattock <justinmattock@gmail.com>
Tested-by: Roger Quadros <ext-roger.quadros@nokia.com>
Tested-by: Marc Pignat <marc.pignat@hevs.ch>
tcp_prequeue() refers to the constant value (TCP_RTO_MIN) regardless of
the actual value might be tuned. The following patches fix this and make
tcp_prequeue get the actual value returns from tcp_rto_min().
Signed-off-by: Satoru SATOH <satoru.satoh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>