The carry from the 64->32 bit folding was dropped, e.g. with:
saddr=0xFFFFFFFF daddr=0xFF0000FF len=0xFFFF proto=0 sum=1,
csum_tcpudp_nofold() returned 0 instead of 1.
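With these inputs the 64-bit sum is 0x1FFFFFFFF, so a single-step fold
computes 0x1FFFFFFFF + 1 = 0x200000000 and truncates to 0: the carry
out of bit 31 is lost. A carry-propagating fold, sketched here (the
idea, not necessarily the exact patch), gets it right:

    static inline u32 from64to32(u64 x)
    {
    	/* add up the two 32-bit halves; the result fits in 33 bits */
    	x = (x & 0xffffffff) + (x >> 32);
    	/* fold the possible carry back in */
    	x = (x & 0xffffffff) + (x >> 32);
    	return (u32)x;
    }

For the example: 0x1FFFFFFFF folds to 0x100000000, which folds to 1.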
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reported in: https://bugzilla.kernel.org/show_bug.cgi?id=92081
This patch avoids calling rtnl_notify if the device ndo_bridge_getlink
handler does not return any bytes in the skb.
Alternatively, the skb->len check could be moved inside rtnl_notify.
For the bridge vlan case described in 92081, there is also a fix needed
in the bridge driver to generate a proper notification; that will be
fixed in a subsequent patch.
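A sketch of the guard in rtnl_bridge_notify() (hedged; the surrounding
details may differ from the actual patch):

    err = dev->netdev_ops->ndo_bridge_getlink(skb, 0, 0, dev, 0);
    if (err < 0)
    	goto errout;

    /* skip the notification if the handler filled in no bytes */
    if (!skb->len)
    	goto errout;

    rtnl_notify(skb, net, 0, RTNLGRP_LINK, NULL, GFP_ATOMIC);
    return;
    errout:
    	kfree_skb(skb);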
v2: rebase patch on net tree
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Neal Cardwell says:
====================
fix stretch ACK bugs in TCP CUBIC and Reno
This patch series fixes the TCP CUBIC and Reno congestion control
modules to properly handle stretch ACKs in their respective additive
increase modes, and in the transitions from slow start to additive
increase.
This finishes the project started by commit 9f9843a751 ("tcp:
properly handle stretch acks in slow start"), which fixed behavior for
TCP congestion control when handling stretch ACKs in slow start mode.
Motivation: In the Jan 2015 netdev thread 'BW regression after "tcp:
refine TSO autosizing"', Eyal Perry documented a regression that Eric
Dumazet determined was caused by improper handling of TCP stretch
ACKs.
Background: LRO, GRO, delayed ACKs, and middleboxes can cause "stretch
ACKs" that cover more than the RFC-specified maximum of 2
packets. These stretch ACKs can cause serious performance shortfalls
in common congestion control algorithms, like Reno and CUBIC, which
were designed and tuned years ago with receiver hosts that were not
using LRO or GRO, and were instead ACKing every other packet.
Testing: at Google we have been using this approach for handling
stretch ACKs for CUBIC datacenter and Internet traffic for several
years, with good results.
v2:
* fixed return type of tcp_slow_start() to be u32 instead of int
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a bug in CUBIC that causes cwnd to increase slightly
too slowly when multiple ACKs arrive in the same jiffy.
If cwnd is supposed to increase at a rate of more than once per jiffy,
then CUBIC was sometimes too slow. Because bic_target is calculated for
a future point in time, with time measured in jiffies, cwnd can
increase over the course of the jiffy while bic_target (the proper
CUBIC cwnd at time t=tcp_time_stamp+rtt) does not increase, because
tcp_time_stamp only increases on jiffy tick boundaries.
So since the cnt is set to:
ca->cnt = cwnd / (bic_target - cwnd);
as cwnd increases but bic_target does not increase due to jiffy
granularity, the cnt becomes too large, causing cwnd to increase
too slowly.
For example:
- suppose at the beginning of a jiffy, cwnd=40, bic_target=44
- so CUBIC sets:
ca->cnt = cwnd / (bic_target - cwnd) = 40 / (44 - 40) = 40/4 = 10
- suppose we get 10 acks, each for 1 segment, so tcp_cong_avoid_ai()
increases cwnd to 41
- so CUBIC sets:
ca->cnt = cwnd / (bic_target - cwnd) = 41 / (44 - 41) = 41 / 3 = 13
So now CUBIC will wait for 13 packets to be ACKed before increasing
cwnd to 42, instead of 10 as it should.
The fix is to avoid adjusting the slope (determined by ca->cnt)
multiple times within a jiffy, and instead skip straight to computing
the Reno cwnd in the "TCP friendliness" code path, roughly as sketched
below.
Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change CUBIC to properly handle stretch ACKs in additive increase mode
by passing in the count of ACKed packets to tcp_cong_avoid_ai().
In addition, because we are now precisely accounting for stretch ACKs,
including delayed ACKs, we can now remove the delayed ACK tracking and
estimation code that tracked recent delayed ACK behavior in
ca->delayed_ack.
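The gist of the call-path change (a sketch, hedged):

    /* pass the true ACKed count through instead of assuming 1 */
    acked = tcp_slow_start(tp, acked);
    if (!acked)
    	return;
    bictcp_update(ca, tp->snd_cwnd);
    tcp_cong_avoid_ai(tp, ca->cnt, acked);

With exact per-ACK accounting there is nothing left for the
ca->delayed_ack ratio estimate to correct, so that code can go.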
Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change Reno to properly handle stretch ACKs in additive increase mode
by passing in the count of ACKed packets to tcp_cong_avoid_ai().
In addition, if snd_cwnd crosses snd_ssthresh during slow start
processing, and we then exit slow start mode, we need to carry over
any remaining "credit" for packets ACKed and apply that to additive
increase by passing this remaining "acked" count to
tcp_cong_avoid_ai().
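A sketch of the resulting Reno code path (modeled on
net/ipv4/tcp_cong.c; hedged, not a verbatim quote of the patch):

    void tcp_reno_cong_avoid(struct sock *sk, u32 ack, u32 acked)
    {
    	struct tcp_sock *tp = tcp_sk(sk);

    	if (!tcp_is_cwnd_limited(sk))
    		return;

    	/* In "safe" area, increase. */
    	if (tp->snd_cwnd <= tp->snd_ssthresh) {
    		acked = tcp_slow_start(tp, acked);
    		if (!acked)
    			return;	/* all credit consumed by slow start */
    	}
    	/* In dangerous area, increase slowly, spending leftover credit. */
    	tcp_cong_avoid_ai(tp, tp->snd_cwnd, acked);
    }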
Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_cong_avoid_ai() was too timid (snd_cwnd increased too slowly) on
"stretch ACKs" -- cases where the receiver ACKed more than 1 packet in
a single ACK. For example, suppose w is 10 and we get a stretch ACK
for 20 packets, so acked is 20. We ought to increase snd_cwnd by 2
(since acked/w = 20/10 = 2), but instead we were only increasing cwnd
by 1. This patch fixes that behavior.
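A hedged sketch of the repaired helper (after the ideas in this series;
details may differ from the committed code):

    void tcp_cong_avoid_ai(struct tcp_sock *tp, u32 w, u32 acked)
    {
    	/* If credits accumulated at a higher w, apply them gently now. */
    	if (tp->snd_cwnd_cnt >= w) {
    		tp->snd_cwnd_cnt = 0;
    		tp->snd_cwnd++;
    	}

    	tp->snd_cwnd_cnt += acked;
    	if (tp->snd_cwnd_cnt >= w) {
    		u32 delta = tp->snd_cwnd_cnt / w;

    		tp->snd_cwnd_cnt -= delta * w;
    		tp->snd_cwnd += delta;	/* w=10, acked=20 => delta=2 */
    	}
    	tp->snd_cwnd = min(tp->snd_cwnd, tp->snd_cwnd_clamp);
    }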
Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
LRO, GRO, delayed ACKs, and middleboxes can cause "stretch ACKs" that
cover more than the RFC-specified maximum of 2 packets. These stretch
ACKs can cause serious performance shortfalls in common congestion
control algorithms that were designed and tuned years ago with
receiver hosts that were not using LRO or GRO, and were instead
politely ACKing every other packet.
This patch series fixes Reno and CUBIC to handle stretch ACKs.
This patch prepares for the upcoming stretch ACK bug fix patches. It
adds an "acked" parameter to tcp_cong_avoid_ai() to allow for future
fixes to tcp_cong_avoid_ai() to correctly handle stretch ACKs, and
changes all congestion control algorithms to pass in 1 for the ACKed
count. It also changes tcp_slow_start() to return the number of packet
ACK "credits" that were not processed in slow start mode, and can be
processed by the congestion control module in additive increase mode.
In future patches we will fix tcp_cong_avoid_ai() to handle stretch
ACKs, and fix Reno and CUBIC handling of stretch ACKs in slow start
and additive increase mode.
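The changed slow-start helper could look roughly like this (a sketch;
the clamp details are hedged):

    u32 tcp_slow_start(struct tcp_sock *tp, u32 acked)
    {
    	u32 cwnd = min(tp->snd_cwnd + acked, tp->snd_ssthresh);

    	acked -= cwnd - tp->snd_cwnd;	/* credits consumed here */
    	tp->snd_cwnd = min(cwnd, tp->snd_cwnd_clamp);

    	return acked;	/* leftover credits for additive increase */
    }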
Reported-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As of commit 9a1091ef00 ("irqchip: gic: Support hierarchy irq
domain."), the APE6EVM legacy board support is known to be broken.
The IRQ numbers of the GIC are now virtual, and no longer match the
hardcoded hardware IRQ numbers in the legacy platform board code.
To fix this issue, specific to non-multiplatform r8a73a4 and APE6EVM:
1) Instantiate the GIC from platform board code,
2) Skip over the DT arch timer, and
3) Force delay setup based on the DT CPU frequency.
With these 3 fixes in place, interrupts on APE6EVM work again.
Partially based on legacy GIC fix by Geert Uytterhoeven, thanks to
him for the initial work.
Signed-off-by: Magnus Damm <damm+renesas@opensource.se>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Merge tag 'mvebu-fixes-3.19-6' of git://git.infradead.org/linux-mvebu into fixes
Merge "mvebu-fixes-6" from Andrew Lunn:
The previous fix for Armada XP, disabling I/O coherency, broke Armada
375/38x. Only switch the PL310 to I/O coherent mode if I/O coherency
is enabled.
* tag 'mvebu-fixes-3.19-6' of git://git.infradead.org/linux-mvebu:
ARM: mvebu: don't set the PL310 in I/O coherency mode when I/O coherency is disabled
Signed-off-by: Olof Johansson <olof@lixom.net>
When the ak4114 work calls its callback and the callback invokes
ak4114_reinit(), it stalls due to flush_delayed_work(). To avoid this,
control the reentrance by introducing a refcount. Also,
flush_delayed_work() is replaced with cancel_delayed_work_sync().
The exact same bug is present in ak4113.c and is fixed as well.
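A sketch of the reentrancy guard (the field name is an assumption):

    static void ak4114_stats(struct work_struct *work)
    {
    	struct ak4114 *chip = container_of(work, struct ak4114, work.work);

    	/* run the check only if we are the sole active entry */
    	if (atomic_inc_return(&chip->wq_processing) == 1)
    		snd_ak4114_check_rate_and_errors(chip, chip->check_flags);
    	atomic_dec(&chip->wq_processing);
    }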
Reported-by: Pavel Hofman <pavel.hofman@ivitera.com>
Acked-by: Jaroslav Kysela <perex@perex.cz>
Tested-by: Pavel Hofman <pavel.hofman@ivitera.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The core only supports up to 32 slaves, and the chipselect function
expects the same.
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Refactor drms_uA_update() slightly to allow regulator_set_optimum_mode()
to utilize the same logic instead of duplicating it.
Signed-off-by: Bjorn Andersson <bjorn.andersson@sonymobile.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
"force_mode" is a u32 so it is never "< 0", but because of type
promotion then comparing "== -1" will do what we want.
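Illustratively (the surrounding code here is hypothetical; only the
variable's type matters):

    u32 force_mode;

    if (force_mode < 0)	/* always false: a u32 is never negative */
    	...;

    if (force_mode == -1)	/* works: -1 converts to 0xFFFFFFFF */
    	...;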
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Small transfers can generally be completed faster in polling mode.
This patch selects polling mode for transfers whose size is below the
buffer size.
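A minimal sketch of the decision (field and flag names are
hypothetical, for illustration only):

    /* short transfers: the irq round-trip costs more than polling */
    if (xfer->len < xspi->buffer_size)
    	use_polling = true;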
Suggested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The variable never leaves the scope of txrx_bufs.
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Simplify the code by using the unit already used by most of the code logic.
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Simplify the code by using the unit already used by most of the code logic.
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
spi_rx handles the case where the buffer is null, but spi_tx did not;
the caller function had to handle it instead.
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Simplify the code by removing the tx and rx function pointers and
substituting them with a single function.
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The core controls the chip select lines individually.
By default, all the lines are considered active_low. After
spi_setup_transfer(), each line has its real value.
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
When no irq is used, there is no need to inhibit the transmission for
every transaction. This inhibition was implemented to avoid a race
condition with the irq handler.
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The core can run in polling mode. In fact, the performance of the core
is similar (or even better), because most of the spi transactions are
just a couple of bytes and there is one irq per transaction.
When an mtd device is connected via spi, reading 8MB of data produces
more than 80K interrupts (with irq disabling, context switches, ...).
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The control register has not changed since the previous access.
Therefore we can use the cached value and save one bus access.
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
In the transmission loop, check for remaining bytes in the loop
condition.
This way we can handle transmissions of 0 bytes and clean up the code.
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Do not enable and then disable the IRQ for every transaction.
Small transactions (1-2 words) especially benefit, as this removes 3
bus accesses.
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Instead of checking the TX_FULL flag for every transaction, find out
the size of the buffer at probe time and use it.
To avoid situations where the core had some data in the buffer before
initialization, the core is reset before the buffer size is detected.
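A sketch of the detection (register names here are hypothetical; the
idea is to reset, then fill the TX FIFO until the full flag rises):

    int n_words = 0;

    /* reset so the TX FIFO is known to be empty */
    xspi->write_fn(RESET_MASK, xspi->regs + RESET_REG);

    do {
    	xspi->write_fn(0, xspi->regs + TXD_REG);
    	n_words++;
    } while (!(xspi->read_fn(xspi->regs + SR_REG) & SR_TX_FULL));

    /* n_words is now the FIFO depth */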
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The DSPI module needs chip-select (cs) change information within a spi
transfer. Based on the cs change, DSPI gives the last data word the
right flag. Bitbang provides the cs change only after the last data
word of a transfer, so DSPI cannot handle the last word of each
transfer properly; remove the bitbang from the driver.
Signed-off-by: Bhuvanchandra DV <bhuvanchandra.dv@toradex.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Add gpio control for enabling/disabling the buck.
Signed-off-by: James Ban <james.ban.opensource@diasemi.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
The pl08x driver originally selected S3C64XX_PL080 to avoid having
the legacy Samsung DMA interfaces. Those are now gone, so the
select is no longer needed, but it now causes problems when
CONFIG_DMA_ENGINE is disabled:
arch/arm/plat-samsung/built-in.o: In function `s3c64xx_spi0_set_platdata':
:(.init.text+0x518): undefined reference to `pl08x_filter_id'
This simply removes the 'select' to avoid this problem.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
We currently get a warning about potentially uninitialized variables
in the rockchip spi driver, at least in certain toolchain versions:
spi/spi-rockchip.c: In function 'rockchip_spi_prepare_dma':
include/linux/dmaengine.h:796:2: warning: 'txdesc' may be used uninitialized in this function
include/linux/dmaengine.h:796:2: warning: 'rxdesc' may be used uninitialized in this function
The reason seems to be that gcc cannot know whether the values of the
rs->rx and rs->tx variables change between the two points where they
are accessed.
The code is actually correct, but to make this clearer to the
compiler, this changes the conditionals to test for the local
rxdesc/txdesc variables instead, which it knows won't change.
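A hedged before/after sketch (abridged; the actual driver code around
it is more involved):

    /* before: gcc cannot prove rs->tx is unchanged since txdesc
     * was (conditionally) assigned */
    if (rs->tx)
    	dmaengine_submit(txdesc);

    /* after: test the local descriptor itself, which gcc can track */
    if (txdesc)
    	dmaengine_submit(txdesc);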
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Since commit f2c3c67f00 (merge commit that adds commit "ARM: mvebu:
completely disable hardware I/O coherency"), we disable I/O coherency
on Armada EBU platforms.
However, we continue to initialize the coherency fabric, because this
coherency fabric is needed on Armada XP for inter-CPU
coherency. Unfortunately, due to this, we also continued to execute
the coherency fabric initialization code for Armada 375/38x, which
switched the PL310 into I/O coherent mode. This has the effect of
disabling the outer cache sync operation: this is needed when I/O
coherency is enabled to work around a PCIe/L2 deadlock. But obviously,
when I/O coherency is disabled, having the outer cache sync operation
is crucial.
Therefore, this commit fixes the armada_375_380_coherency_init() so
that the PL310 is switched to I/O coherent mode only if I/O coherency
is enabled.
Without this fix, all devices using DMA are broken on Armada 375/38x.
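The fix amounts to an early bail-out, roughly (a sketch of
armada_375_380_coherency_init(); the exact code is hedged):

    /* switch the PL310 to I/O coherent mode only when I/O coherency
     * is actually enabled */
    if (!coherency_available())
    	return;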
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Cc: <stable@vger.kernel.org> # v3.8+
uClibc Linuxthreads.old doesn't support the pthread_attr_setaffinity_np()
function:
----------------->8-----------------------
CC bench/futex-hash.o
CC bench/futex-wake.o
bench/futex-hash.c: In function 'bench_futex_hash':
bench/futex-hash.c:161:3: error: implicit declaration of function
'pthread_attr_setaffinity_np' [-Werror=implicit-function-declaration]
ret = pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t),
&cpu);
^
bench/futex-hash.c:161:3: error: nested extern declaration of
'pthread_attr_setaffinity_np' [-Werror=nested-externs]
----------------->8-----------------------
So introduce a feature test to check for it and, if it is not
available, provide a stub, as sketched below.
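A sketch of such a stub (the guard macro name is an assumption; the
real feature test lives in the perf build system):

    #ifndef HAVE_PTHREAD_ATTR_SETAFFINITY_NP
    #include <pthread.h>
    #include <sched.h>

    /* no-op stub: the benchmark builds, affinity is simply not set */
    static inline int pthread_attr_setaffinity_np(pthread_attr_t *attr,
    					      size_t cpusetsize,
    					      cpu_set_t *cpuset)
    {
    	return 0;
    }
    #endif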
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1421156604-30603-6-git-send-email-vgupta@synopsys.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When running perf on ARC (uClibc-based userspace), we ran into this issue:
------------->8----------------
[ARCLinux]$ ./perf record ls
bin etc perf sys
debug init perf.data tmp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (~24 samples) ]
[ARCLinux]$ ./perf report
incompatible file format (rerun with -v to learn more)
------------->8----------------
The problem happens in the following call stack when zalloc is called
with size zero. glibc's default malloc, and uClibc with
MALLOC_GLIBC_COMPAT, handle a zero-size allocation OK, but uClibc
without that config option does not.
cmd_report
perf_session__new
perf_session__open
perf_session__read_header
read_attr(fd, header, &f_attr)
nr_ids = f_attr.ids.size / sizeof(u64); <-- 0
perf_evsel__alloc_id(vsel, 1, nr_ids)
zalloc(ncpus * nthreads * sizeof(u64)) <-- 0
header.c: read_attr()
(gdb) p *f_attr
$17 = {
attr = {
type = 0,
size = 96,
config = 0,
{
sample_period = 4000,
sample_freq = 4000
},
...
ids = {
offset = 104,
size = 0 <------
}
}
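The fix guards the zero-size case, along these lines (a sketch; the
exact spot, perf_evsel__alloc_id() per the call stack above, is
hedged):

    if (ncpus == 0 || nthreads == 0)
    	return 0;	/* nothing to allocate, don't call zalloc(0) */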
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1421156604-30603-5-git-send-email-vgupta@synopsys.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
You can't modify the metadata in these modes (read-only or fail). It's
better to fail these
messages immediately than let the block-manager deny write locks on
metadata blocks. Otherwise these failed metadata changes will trigger
'needs_check' to get set in the metadata superblock -- requiring repair
using the thin_check utility.
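A sketch of the early failure (modeled on drivers/md/dm-thin.c; the
message text is illustrative):

    if (get_pool_mode(pool) >= PM_READ_ONLY) {
    	DMERR("%s: cannot service messages in READ_ONLY or FAIL mode",
    	      dm_device_name(pool->pool_md));
    	return -EOPNOTSUPP;
    }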
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Commit 9b1cc9f251 ("dm cache: share cache-metadata object across
inactive and active DM tables") mistakenly ignored the use of ERR_PTR
returns. Restore missing IS_ERR checks and ERR_PTR returns where
appropriate.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
The new hw_breakpoint bits are now ready for v3.20, merge them
into the main branch, to avoid conflicts.
Conflicts:
tools/perf/Documentation/perf-record.txt
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
"
User visible changes:
- Enable sampling loads and stores simultaneously in 'perf mem' (Stephane Eranian)
- 'perf diff' output improvements (Namhyung Kim)
- Fix error reporting for evsel pgfault constructor (Arnaldo Carvalho de Melo)
Infrastructure changes:
- Move the debugfs strerror-like method to tools/lib/ so that it may be used by
other tools, as 'perf probe' will be soon (Arnaldo Carvalho de Melo)
- Introduce a function for deleting/removing hist_entry to avoid code duplication
(Arnaldo Carvalho de Melo)
- Support parsing parameterized events (Cody P Schafer)
- Add support for IP address formats in libtraceevent (David Ahern)
- Fix typo in sample-parsing.c 'perf test' entry (Rasmus Villemoes)
- Remove some unused functions from color.c (Rickard Strandqvist)
"
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
" User visible fixes:
- Fix probing at function return (Namhyung Kim)
Developer visible fixes:
- Symbol processing changes necessary for fixing support for
kretprobes in 'perf probe' (Namhyung Kim, Arnaldo Carvalho de Melo)
- Annotation memory leaks and instruction parsing fixes (Rabin Vincent)
- Fix perl build on ARM64 (Wang Nan)
"
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Merge tag 'pr-20150114-x86-entry' of git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux into x86/asm
Pull x86/entry enhancements from Andy Lutomirski:
" This is my accumulated x86 entry work, part 1, for 3.20. The meat
of this is an IST rework. When an IST exception interrupts user
space, we will handle it on the per-thread kernel stack instead of
on the IST stack. This sounds messy, but it actually simplifies the
IST entry/exit code, because it eliminates some ugly games we used
to play in order to handle rescheduling, signal delivery, etc on the
way out of an IST exception.
The IST rework introduces proper context tracking to IST exception
handlers. I haven't seen any bug reports, but the old code could
have incorrectly treated an IST exception handler as an RCU extended
quiescent state.
The memory failure change (included in this pull request with
Borislav and Tony's permission) eliminates a bunch of code that
is no longer needed now that user memory failure handlers are
called in process context.
Finally, this includes a few on Denys' uncontroversial and Obviously
Correct (tm) cleanups.
The IST and memory failure changes have been in -next for a while.
LKML references:
IST rework:
http://lkml.kernel.org/r/cover.1416604491.git.luto@amacapital.net
Memory failure change:
http://lkml.kernel.org/r/54ab2ffa301102cd6e@agluck-desk.sc.intel.com
Denys' cleanups:
http://lkml.kernel.org/r/1420927210-19738-1-git-send-email-dvlasenk@redhat.com
"
This tree semantically depends on and is based on the following RCU commit:
734d168013 ("rcu: Make rcu_nmi_enter() handle nesting")
... and for that reason won't be pushed upstream before the RCU bits hit Linus's tree.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This effectively reverts the last hunk of 392a9dad7e ("rbd: detect
when clone image is flattened").
The problem with parent_overlap != 0 condition is that it's possible
and completely valid to have an image with parent_overlap == 0 whose
parent state needs to be cleaned up on unmap. The next commit, which
drops the "clone image now standalone" logic, opens up another window
of opportunity to hit this, but even without it
# cat parent-ref.sh
#!/bin/bash
rbd create --image-format 2 --size 1 foo
rbd snap create foo@snap
rbd snap protect foo@snap
rbd clone foo@snap bar
rbd resize --allow-shrink --size 0 bar
rbd resize --size 1 bar
DEV=$(rbd map bar)
rbd unmap $DEV
leaves rbd_device/rbd_spec/etc and rbd_client along with ceph_client
hanging around.
My thinking behind calling rbd_dev_parent_put() unconditionally is that
there shouldn't be any requests in flight at that point in time as we
are deep into unmap sequence. Hence, even if rbd_dev_unparent() caused
by flatten is delayed by in-flight requests, it will have finished by
the time we reach rbd_dev_unprobe() caused by unmap, thus turning
unconditional rbd_dev_parent_put() into a no-op.
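In diff form, the described change is simply (a hedged reconstruction):

    -	if (rbd_dev->parent_overlap)
    -		rbd_dev_parent_put(rbd_dev);
    +	rbd_dev_parent_put(rbd_dev);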
Fixes: http://tracker.ceph.com/issues/10352
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Alex Elder <elder@linaro.org>
The comment for rbd_dev_parent_get() said
* We must get the reference before checking for the overlap to
* coordinate properly with zeroing the parent overlap in
* rbd_dev_v2_parent_info() when an image gets flattened. We
* drop it again if there is no overlap.
but the "drop it again if there is no overlap" part was missing from
the implementation. This led to absurd parent_ref values for images
with parent_overlap == 0, as parent_ref was incremented for each
img_request and virtually never decremented.
Fix this by leveraging the fact that refresh path calls
rbd_dev_v2_parent_info() under header_rwsem and use it for read in
rbd_dev_parent_get(), instead of messing around with atomics. Get rid
of barriers in rbd_dev_v2_parent_info() while at it - I don't see what
they'd pair with now and I suspect we are in a pretty miserable
situation as far as proper locking goes regardless.
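Sketch of the locking change in rbd_dev_parent_get() (hedged; the
ref-taking detail is elided):

    down_read(&rbd_dev->header_rwsem);
    if (rbd_dev->parent_overlap)
    	/* ... take a parent_ref ... */
    up_read(&rbd_dev->header_rwsem);

Serializing against rbd_dev_v2_parent_info(), which updates
parent_overlap under the same rwsem held for write, makes the
check-and-get atomic without the atomics games.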
Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Alex Elder <elder@linaro.org>
The fix from 9fc81d8742 ("perf: Fix events installation during
moving group") was incomplete in that it failed to recognise that
creating a group with events for different CPUs is semantically
broken -- they cannot be co-scheduled.
Furthermore, it leads to real breakage where, when we create an event
for CPU Y and then migrate it to form a group on CPU X, the code gets
confused where the counter is programmed -- triggered in practice
as well by me via the perf fuzzer.
Fix this by tightening the rules for creating groups. Only allow
grouping of counters that can be co-scheduled in the same context.
This means counters for the same task and/or the same cpu.
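A sketch of the tightened checks in the perf_event_open() path (hedged;
the error label is illustrative):

    /*
     * Events for different CPUs can never be co-scheduled,
     * so do not allow them in one group.
     */
    if (group_leader->cpu != event->cpu)
    	goto err_context;

    /*
     * Likewise, require the same task context (or both per-cpu).
     */
    if (group_leader->ctx->task != ctx->task)
    	goto err_context;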
Fixes: 9fc81d8742 ("perf: Fix events installation during moving group")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20150123125834.090683288@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>