Profiling shows that the key validation is susceptible
to cache line trading when accessing the lkey table.
Fix by separating out the read mostly fields from the write
fields. In addition the shift amount, which is function
of the lkey table size, is precomputed and stored with the
table pointer. Since both the shift and table pointer
are in the same read mostly cacheline, this saves a cache
line in this hot path.
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Short circuit sdma_txclean() by adding an __sdma_txclean()
that is only called when the tx has sdma mappings.
Convert internal calls to __sdma_txclean().
This removes a call from the critical path.
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Profiling suggests that the read_seqbegin() in
the txreq put logic is colliding with other uses
of the iowait lock.
The packet at a time use of this lock dictates a unique
lock to avoid reader/writer collisions when the number
of vTxWait events is low.
In order to support a unique lock the iowait struct embedded
in the QP is extended to remember the lock that protects the queue
head.
The QP destroy removes that QP from any wait list. It doesn't
need to know the head because of the linked list API, but it does
need to know the lock required to protect the head.
This also opens up the wait logic to have unique per resources locks
which needs to be in future refinement.
Reviewed-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Remove IS_ERR check from caching code as the function being called does
not actually return error pointers.
Fixes: f19bd643db: "IB/hfi1: Prevent NULL pointer deferences in caching code"
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Increase the size of the buffer that is used to construct per-VL
and per-SDMA counter names.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
When processing ECN via the prescan_rxq path, some fields in the packet
structure are passed uninitialized. This can potentially
cause NULL pointer exceptions during ECN handling.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Set the status code BAD_L2 when unsupported type of packet
is received and dropped.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Validate the rcvhdrcnt module parameter in a single function at module
load time. This allows proper error reporting.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Krzysztof Blaszkowski <krzysztof.blaszkowski@intel.com>
Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The new s_rnr_timeout was not properly being set and the code was
incorrectly setting a different timer.
Found by code inspection.
Cc: <stable@vger.kernel.org> # 4.7.x
Fixes: 08279d5c94 ("staging/rdma/hfi1: use new RNR timer")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The lock is an unused vestige from qib. Remove it.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
hfi1_pcie_ddinit takes the PCI device id as an argument but never
uses it. Clean it up.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
A few snoop related variables were missed in the snoop/capture removal
to get out of staging. Go back and clean those up too.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
In the function hfi1_create_ctxts the array "dd->rcd" is allocated and
then populated with allocated resources in a loop. Previously, if
error happened during the loop, only resource allocated in the current
iteration would be freed. The array itself would then be freed, leaving
the resources that were allocated in previous iterations and referenced
by the array elements in limbo.
This patch makes sure all allocated resources are freed before freeing
the array "dd->rcd". Also the resource allocation now takes account of
the numa node the device is attached to.
Reviewed-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Clean up device type checking.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Krzysztof Blaszkowski <krzysztof.blaszkowski@intel.com>
Signed-off-by: Tymoteusz Kielan <tymoteusz.kielan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
This patch fixes an Oops on device unbind, when the device is used
by a PSM user process. PSM processes access device resources which
are freed on device removal. Similar protection exists in uverbs
in ib_core for Verbs clients, but PSM doesn't use ib_uverbs hence
a separate protection is required for PSM clients.
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dean Luick <dean.luick@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Prevent setting up integrity check flags when module is loaded
with NO_INTEGRITY capability.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The IRQ affinity entry is not needed after the irq notifier patch has been
added to the hfi1 driver.
The irq affinity settings for SDMA engine should be set using the standard
/proc/irq/<N>/ interface.
Reviewed-by: Jianxin Xiong <jianxin.xiong@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The initial code for rdmavt carried with it a restriction that was a
vestige from the qib driver, that to dma map a page it had to be less
than a page size. This is not the case on modern hardware, both qib and
hfi1 will be just fine with unaligned map requests.
This fixes a 4.8 regression where by an IPoIB transfer of > PAGE_SIZE
will hang because the dma map page call always fails. This was
introduced after commit 5faba54695 ("IB/ipoib: Report SG feature
regardless of HW UD CSUM capability") added the capability to use SG by
default. Rather than override this, the HW supports it, so allow SG.
Cc: Stable <stable@vger.kernel.org> # 4.8
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
A new file is created:
/sys/kernel/debug/tracing/trace_marker_raw
This allows for appications to create data structures and write the binary
data directly into it, and then read the trace data out from trace_pipe_raw
into the same type of data structure. This saves on converting numbers into
ASCII that would be required by trace_marker.
Suggested-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
All boards from the Tronsmart Vega S95 series are sharing similar MMC
based hardware.
sd_emmc_a is used to connect a Broadcom based SDIO wifi card (supported
by the brcmfmac driver). The 32.768KHz LPO clock for the wifi chip is
generated by PWM_E.
sd_emmc_b is routed to the SD-card. Unlike p20x there is no GPIO
regulator, meaning it only supports 3.3V (which seems to be hard-wired).
The eMMC chip is connected to sd_emmc_c and is implemented similar to
the meson-gxbb-p20x boards (meaning that hard-wired fixed regulators
are used).
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Enable Ethernet on the p23x board, pinctrl attribute is only added for
the p230 board since the p231 only uses the Internal PHY.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Add Ethernet node with Internal PHY selection for the Amlogic GXL SoCs
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
meson-gx.dtsi was directly derived from meson-gxbb.dtsi, so keep the
copyrights in chronological order to not give a wrong impression.
Fixes: c328666d58 ("ARM64: dts: amlogic: Add Meson GX dtsi from GXBB")
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Enable the Infraread Receiver on the p23x board.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Add SD/SDIO/MMC nodes and PWM 32768Hz clock configuration to provide
storage and WiFi functionality on the p23x boards.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Add clock node for Amlogic Meson GXL.
The GXBB compatible is retained since the GXBB clock tree is used for now.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Add pinctrl nodes and pin definitions for Amlogic Meson GXL.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
[khilman: use GXBB include until GXL pinctrl support merged]
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Move common nodes between GXBB and GXL in to the common GX dtsi.
Leave the clock attributes in the GXBB dtsi for now.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Let's not depend on any of the BLK_MQ_RQ_QUEUE_* constants having
specific values. No functional change.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Let's not depend on any of the BLK_MQ_RQ_QUEUE_* constants having
specific values. No functional change.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
1,cleanup description/comments
2,for FIJI & passthrough, force post when smc fw version below 22.15
3,for other cases, follow regular rules
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Left over from an earlier rev of the patch.
Acked-by: Colin Ian King <colin.king@canonical.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Colin King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This shrinks down omap2plus_defconfig a bit and makes it easier for
people to generate minimal patches against it.
Signed-off-by: Tony Lindgren <tony@atomide.com>
A call to clk_get_rate appears to be called in the context of an interrupt,
cache the bus clock for the frequency calculations in transmission.
This fixes a 'BUG: scheduling while atomic' and
'WARNING: CPU: 0 PID: 777 at kernel/sched/core.c:2960 atmel_spi_unlock'
Signed-off-by: Ben Whitten <ben.whitten@lairdtech.com>
Signed-off-by: Steve deRosier <steve.derosier@lairdtech.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
completion variable should be reinitialized before reusing.
Signed-off-by: Prahlad V <prahlad.eee@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Let's add minimal support for droid 4 with MMC and WLAN working.
It can be booted with appended dtb using kexec to a state where
MMC and WLAN work with currently no support for it's PMIC or
display.
Note that we are currently using fixed regulators as we don't
have support for it's cpcap PMIC. I'll be posting regmap_spi
based minimal cpcap patches later on for USB and the debug
UART on droid 4 multiplexed with the USB connector.
Cc: Marcel Partap <mpartap@gmx.net>
Cc: Michael Scott <michael.scott@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
At least on powerpc with GCC 6, the compiler is smart enough to optimise
lkdtm_CORRUPT_STACK() into an empty function that just returns.
If we print the buffer after we've written to it that prevents the
compiler from optimising away data and the memset().
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds ARM Performance Monitor Unit dt node for exynos7.
PMU provides various statistics on the operation of the CPU and
memory system at runtime, which are very useful when debugging or
profiling code. This enables the same.
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
[krzk: Squashed with "Add level for cpu dt node for exynos7"]
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
If CPUs are moved to or removed from a rdtgroup, the percpu closid storage
is updated. If tasks running on an affected CPU use the percpu closid then
the PQR_ASSOC MSR is only updated when the task runs through a context
switch. Up to the context switch the CPUs operate on the wrong closid. This
state is potentially unbound.
Make the change immediately effective by invoking a smp function call on
the affected CPUs which stores the new closid in the perpu storage and
calls the rdt_sched_in() function which updates the MSR, if the current
task uses the percpu closid.
[ tglx: Made it work and massaged changelog once more ]
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1478912558-55514-3-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
All CPUs in a rdtgroup are given back to the default rdtgroup before the
rdtgroup is removed during umount. After umount, the default rdtgroup
contains all online CPUs, but the per cpu closids are not cleared. As a
result the stale closid value will be used immediately after the next
mount.
Move all cpus to the default group and update the percpu closid storage.
[ tglx: Massaged changelong ]
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1478912558-55514-2-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c: In function 'rdtgroup_kn_lock_live':
arch/x86/kernel/cpu/intel_rdt_rdtgroup.c:658:2: error: implicit declaration of
function 'kernfs_break_active_protection' [-Werror=implicit-function-declaration]
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Shaohua Li <shli@fb.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
The cpu online/offline callbacks of intel_rdt lock rdtgroup_mutex nested
inside of cpu hotplug lock. rdtgroup_cpus_write() does it in reverse order.
Remove the get/put_online_cpus() calls from rdtgroup_cpus_write(). This is
safe against cpu hotplug as the resource group cpumasks are protected by
rdtgroup_mutex.
Found by review, but should have been found if authors would have bothered
to test cpu hotplug with lockdep enabled.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Shaohua Li <shli@fb.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Tony Luck <tony.luck@intel.com>