Rate success probability usually fluctuates a lot under normal conditions.
With a simple EWMA, noise and fluctuation can be reduced by increasing the
window length, but that comes at the cost of introducing lag on sudden
changes.
This change replaces the EWMA implementation with a moving average that's
designed to significantly reduce lag while keeping a bigger window size
by being better at filtering out noise.
It is only slightly more expensive than the simple EWMA and still avoids
divisions in its calculation.
The algorithm is adapted from an implementation intended for a completely
different field (stock market trading), where the tradeoff of lag vs
noise filtering is equally important. It is based on the "smoothing filter"
from http://www.stockspotter.com/files/PredictiveIndicators.pdf.
I have adapted it to fixed-point math with some constants so that it uses
only addition, bit shifts and multiplication
To better make use of the filtering and bigger window size, the update
interval time is cut in half.
For testing, the algorithm can be reverted to the older one via debugfs
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20191008171139.96476-2-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Use a slightly different threshold for downgrading spatial streams to
make it easier to calculate without divisions.
Slightly reduces CPU overhead.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20191008171139.96476-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Currently the for-loop will spin forever if variable supported is
non-zero because supported is never changed. Fix this by adding in
the missing right shift of supported.
Addresses-Coverity: ("Infinite loop")
Fixes: 48cb39522a ("mac80211: minstrel_ht: improve rate probing for devices with static fallback")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20190822122034.28664-1-colin.king@canonical.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
On some devices that only support static rate fallback tables sending rate
control probing packets can be really expensive.
Probing lower rates can already hurt throughput quite a bit. What hurts even
more is the fact that on mt76x0/mt76x2, single probing packets can only be
forced by directing packets at a different internal hardware queue, which
causes some heavy reordering and extra latency.
The reordering issue is mainly problematic while pushing lots of packets to
a particular station. If there is little activity, the overhead of probing is
neglegible.
The static fallback behavior is designed to pretty much only handle rate
control algorithms that use only a very limited set of rates on which the
algorithm switches up/down based on packet error rate.
In order to better support that kind of hardware, this patch implements a
different approach to rate probing where it switches to a slightly higher rate,
waits for tx status feedback, then updates the stats and switches back to
the new max throughput rate. This only triggers above a packet rate of 100
per stats interval (~50ms).
For that kind of probing, the code has to reduce the set of probing rates
a lot more compared to single packet probing, so it uses only one packet
per MCS group which is either slightly faster, or as close as possible to
the max throughput rate.
This allows switching between similar rates with different numbers of
streams. The algorithm assumes that the hardware will work its way lower
within an MCS group in case of retransmissions, so that lower rates don't
have to be probed by the high packets per second rate probing code.
To further reduce the search space, it also does not probe rates with lower
channel bandwidth than the max throughput rate.
At the moment, these changes will only affect mt76x0/mt76x2.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20190820095449.45255-4-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
On hardware with static fallback tables (e.g. mt76x2), rate probing attempts
can be very expensive.
On such devices, avoid sampling rates slower than the per-group max throughput
rate, based on the assumption that the fallback table will take care of probing
lower rates within that group if the higher rates fail.
To further reduce unnecessary probing attempts, skip duplicate attempts on
rates slower than the max throughput rate.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20190820095449.45255-2-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The group number needs to be multiplied by the number of rates per group
to get the full rate index
Fixes: 5935839ad7 ("mac80211: improve minstrel_ht rate sorting by throughput & probability")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20190820095449.45255-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There's no rate control algorithm that *doesn't* want to call
it internally, and calling it internally will let us modify
its behaviour in the future.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
A per-group shift was added to reduce the size of the per-rate transmit
duration field to u16 without sacrificing a lot of precision
This patch changes the macros to automatically calculate the best value for
this shift based on the lowest rate within the group.
This simplifies adding more groups and slightly improves accuracy for some of
the existing groups.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This is needed for the upcoming driver for MT7615 4x4 802.11ac chipsets
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Some hardware (e.g. MediaTek MT7603) cannot report A-MPDU length in tx status
information. Add support for a flag to indicate that, to allow minstrel_ht
to use a fixed value in its internal calculation (which gives better results
than just defaulting to 1).
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
These rates are highly unlikely to be used quickly, even if the link
deteriorates rapidly. This improves throughput in cases where CCK rates
are not reliable enough to be skipped entirely during sampling.
Sampling these rates regularly can cost a lot of airtime.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Long/short preamble selection cannot be sampled separately, since it
depends on the BSS state. Because of that, sampling attempts to
currently not used preamble modes are not counted in the statistics,
which leads to CCK rates being sampled too often.
Fix statistics accounting for long/short preamble by increasing the
index where necessary.
Fix excessive CCK rate sampling by dropping unsupported sample attempts.
This improves throughput on 2.4 GHz channels
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fixes a harmless underflow issue when CCK rates are actively being used
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
mi->supported[MINSTREL_CCK_GROUP] needs to be updated
short preamble rates need to be marked as supported regardless of
whether it's currently enabled. Its state can change at any time without
a rate_update call.
Fixes: 782dda00ab ("mac80211: minstrel_ht: move short preamble check out of get_rate")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
By storing a shift value for all duration values of a group, we can
reduce precision by a neglegible amount to make it fit into a u16 value.
This improves cache footprint and reduces size:
Before:
text data bss dec hex filename
10024 116 0 10140 279c rc80211_minstrel_ht.o
After:
text data bss dec hex filename
9368 116 0 9484 250c rc80211_minstrel_ht.o
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Legacy-only devices are not very common and the overhead of the extra
code for HT and VHT rates is not big enough to justify all those extra
lines of code to make it optional.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
debugfs entries are cleaned up by debugfs_remove_recursive already.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
If peer support reception of STBC and LDPC, enable them for better
performance.
Signed-off-by: Chaitanya TK <chaitanya.mgit@gmail.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Extracting the TID from the QOS header is common enough
to justify helper.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Rename .tx_status_noskb to .tx_status_ext and pass a new on-stack
struct ieee80211_tx_status instead of struct ieee80211_tx_info.
This struct can be used to pass extra information, e.g. for dynamic tx
power control
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This was added during early development when 3x3 hardware was not very
common yet. This is completely unnecessary now.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Test short preamble support in minstrel_ht_update_caps instead of
looking at the per-packet flag. Makes the code more efficient.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Improves dcache footprint by ensuring that fewer cache lines need to be
touched.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This enum is already perfectly aliased to enum nl80211_band, and
the only reason for it is that we get IEEE80211_NUM_BANDS out of
it. There's no really good reason to not declare the number of
bands in nl80211 though, so do that and remove the cfg80211 one.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Prevents excessive A-MSDU aggregation at low data rates or bad
conditions.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
There were a few issues that were slowing down the process of finding
the optimal rate, especially on devices with multi-rate retry
limitations:
When max_tp_rate[0] was slower than max_tp_rate[1], the code did not
sample max_tp_rate[1], which would often allow it to switch places with
max_tp_rate[0] (e.g. if only the first sampling attempts were bad, but the
rate is otherwise good).
Also, sample attempts of rates between max_tp_rate[0] and [1] were being
ignored in this case, because the code only checked if the rate was
slower than [1].
Fix this by checking against the fastest / second fastest max_tp_rate
instead of assuming a specific order between the two.
In my tests this patch significantly reduces the time until minstrel_ht
finds the optimal rate right after assoc
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
RTS/CTS needs to be enabled if the rate is a fallback rate *or* if it's
a dual-stream rate and the sta is in dynamic SMPS mode.
Cc: stable@vger.kernel.org
Fixes: a3ebb4e1b7 ("mac80211: minstrel_ht: handle peers in dynamic SMPS")
Reported-by: Matías Richart <mrichart@fing.edu.uy>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The value 5000 was put here with the addition of the timeout field to
ieee80211_start_tx_ba_session. It was originally added in mac80211 to
save resources for drivers like iwlwifi, which only supports a limited
number of concurrent aggregation sessions.
Since iwlwifi does not use minstrel_ht and other drivers don't need
this, 0 is a better default - especially since there have been
recent reports of aggregation setup related issues reproduced with
ath9k. This should improve stability without causing any adverse
effects.
Cc: stable@vger.kernel.org
Acked-by: Avery Pennarun <apenwarr@gmail.com>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The change from cur_tp to the function
minstrel_get_tp_avg/minstrel_ht_get_tp_avg changed the unit used for the
current throughput. For example in minstrel_ht the correct
conversion between them would be:
mrs->cur_tp / 10 == minstrel_ht_get_tp_avg(..).
This factor 10 must also be included in the calculation of
minstrel_get_expected_throughput and minstrel_ht_get_expected_throughput to
return values with the unit [Kbps] instead of [10Kbps]. Otherwise routing
algorithms like B.A.T.M.A.N. V will make incorrect decision based on these
values. Its kernel based implementation expects expected_throughput always
to have the unit [Kbps] and not sometimes [10Kbps] and sometimes [Kbps].
The same requirement has iw or olsrdv2's nl80211 based statistics module
which retrieve the same data via NL80211_STA_INFO_TX_BITRATE.
Cc: stable@vger.kernel.org
Fixes: 6a27b2c40b ("mac80211: restructure per-rate throughput calculation into function")
Signed-off-by: Sven Eckelmann <sven@open-mesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Patch fixes this splat
BUG: KASAN: slab-out-of-bounds in minstrel_ht_update_stats.isra.7+0x6e1/0x9e0
[mac80211] at addr ffff8800cee640f4 Read of size 4 by task swapper/3/0
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Link: http://lkml.kernel.org/r/CALYGNiNyJhSaVnE35qS6UCGaSb2Dx1_i5HcRavuOX14oTz2P+w@mail.gmail.com
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In case of Dynamic SMPS enable RTS/CTS for all rates.
Signed-off-by: Chaitanya T K <chaitanya.mgit@gmail.com>
[change comment]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
As we're running out of hardware capability flags pretty quickly,
convert them to use the regular test_bit() style unsigned long
bitmaps.
This introduces a number of helper functions/macros to set and to
test the bits, along with new debugfs code.
The occurrences of an explicit __clear_bit() are intentional, the
drivers were never supposed to change their supported bits on the
fly. We should investigate changing this to be a per-frame flag.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch adds the new statistic "maximum possible lossless
throughput" to Minstrels and Minstrel-HTs rc_stats (in debugfs). This
enables comprehensive comparison between current per-rate throughput
and max. achievable per-rate throughput.
Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch moves Minstrels and Minstrel-HTs per-rate throughput
calculation (EWMA(thr)) into a dedicated function to be called.
Therefore the variable "unsigned int cur_tp" within struct
"minstrel_rate_stats" becomes obsolete. and is removed to free
up its space.
Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch ensures a consistent usage of variable names for type
"minstrel_rate_stats" to be used as "mrs" and from type minstrel_rate
as "mr" across both Minstrel & Minstrel-HT. In addition some
variable and function names got changed to more meaningful ones.
Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This patch unifies the calculation of Minstrels and Minstrel-HTs
per-rate statistic. The new common function minstrel_calc_rate_stats()
is called when a statistic update is performed.
Signed-off-by: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Acked-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
On very high MCS bitrates, the calculated duration of rates that are
next to each other can be very imprecise, due to the small packet size
used as reference (1200 bytes).
This is most visible in VHT80 nss=2 MCS8/9, for which minstrel shows the
same throughput when the probability is also the same. This leads to a
bad rate selection for such rates.
Fix this issue by introducing an average A-MPDU size factor into the
calculation.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Check the queue mapping earlier, skb->queue_mapping is more likely than
skb->data to still be in d-cache.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The commit 5935839ad7
"mac80211: improve minstrel_ht rate sorting by throughput & probability"
introduced a crash on rate sorting that occurs when the rate added to
the sorting array is faster than all the previous rates. Due to an
off-by-one error, it reads the rate index from tp_list[-1], which
contains uninitialized stack garbage, and then uses the resulting index
for accessing the group rate stats, leading to a crash if the garbage
value is big enough.
Cc: Thomas Huehn <thomas@net.t-labs.tu-berlin.de>
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When CONFIG_MAC80211_RC_MINSTREL_VHT is set, the module param
minstrel_vht_only tells minstrel_ht whether to allow the mix of ht rates
with vht rates.
ATM, minstrel_ht skips ht rates when minstrel_vht_only is true, but it does
that even if vht is not supported, which makes the sta rates fallback to
legacy as no ht rate gets enabled.
Fixes: 9208247d74 ("mac80211: minstrel_ht: add basic support for VHT rates <= 3SS@80MHz")
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When the new CONFIG_MAC80211_RC_MINSTREL_VHT is not set (default 'N'),
there is no behavioral change including in sampling and MCS_GROUP_RATES
remains 8.
Otherwise MCS_GROUP_RATES is 10, and a module parameter *vht_only*
(default 'true'), restricts the rates selection to VHT when VHT is
supported.
Regarding the debugfs stats buffer:
It is explicitly increased from 8k to 32k to fit every rates incl. when
both HT and VHT rates are enabled, as for the format, before:
type rate tpt eprob *prob ret *ok(*cum) ok( cum)
HT20/LGI ABCDP MCS0 0.0 0.0 0.0 1 0( 0) 0( 0)
after:
type rate tpt eprob *prob ret *ok(*cum) ok( cum)
HT20/LGI ABCDP MCS0 0.0 0.0 0.0 1 0( 0) 0( 0)
VHT40/LGI MCS5/2 0.0 0.0 0.0 0 0( 0) 0( 0)
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
ATM, we grep cck rates idx with idx / MCS_GROUP_RATES ==
MINSTREL_CCK_GROUP.
Matching neither-cck-non-ht rates could be done by replacing '==' with
'>', however it would be less versatile or explicit.
This will allow to match VHT rates with IEEE80211_TX_RC_VHT_MCS.
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>