With device initialization now done in the add_device call-back,
there is no reason for this function anymore.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
A device that might be used for HSA needs to be in a direct
mapped domain so that all DMA-API mappings stay alive when
the IOMMUv2 stack is used.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This enables allocation of DMA-API default domains from the
IOMMU core and switches allocation of DMA-API domains
to the IOMMU core too.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Implement these two iommu-ops call-backs to make use of the
initialization and notifier features of the iommu core.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When the DEBUG preprocessor macro is defined the ps3_gelic_net driver build
fails due to an undeclared routine gelic_descr_get_status(). This problem
was introduced during the code cleanup of commit
6b0c21cede (net: Fix p3_gelic_net sparse warnings),
which re-arranged the ordering of some of the gelic routines.
This change just moves the gelic_descr_get_status() routine up in the
ps3_gelic_net.c source file. There is no functional change.
Fixes build errors like these:
drivers/net/ethernet/toshiba/ps3_gelic_net.c: error: implicit declaration of function gelic_descr_get_status
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add strings array of the current supported tunable options.
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have code to choose between several options, eg. -mabi=elfv2 vs
-mcall-aixdesc, and -mcmodel=medium vs -mminimal-toc. But these are all
GCC specific, so use cc-option on all of them.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We added -mno-strict-align in commit f036b36819 (powerpc: Work around little
endian gcc bug) to fix gcc bug http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57134
Clang doesn't understand it. We need to use a conditional because we can't use the
simpler call cc-option here.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
These options are not recognised on LLVM, so use call cc-option to check
for support.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The -mabi=altivec option is not recognised on LLVM, so use call cc-option
to check for support.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We see a large number of duplicate const errors in the user access
code when building with llvm/clang:
include/linux/pagemap.h:576:8: warning: duplicate 'const' declaration specifier
[-Wduplicate-decl-specifier]
ret = __get_user(c, uaddr);
The problem is we are doing const __typeof__(*(ptr)), which will hit the
warning if ptr is marked const.
Removing const does not seem to have any effect on GCC code generation.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Florian Fainelli says:
====================
net: broadcom MDIO support for broken turn-around
These two patches update the GENET and UniMAC MDIO controllers to deal with
PHYs that are known to have a broken turn-around bug (e.g: BCM53125 and others).
This utilizes the infrastructure recently added to 'net-next' for that purpose.
Note that the changes look nearly identical and I will try to address the MDIO
code duplication between GENET and UniMAC in a future patch series.
Changes in v2:
- remove brcmphy.h include in mdio-bcm-unimac.c
- use the same comment as with GENET's MDIO read function
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Some Ethernet PHYs/switches such as Broadcom's BCM53125 have a hardware bug
which makes them not release the MDIO line during turn-around time. This gets
flagged by the UniMAC MDIO controller as a read failure, and we fail the read
transaction.
Check the MDIO bus phy_ignore_ta_mask bitmask for the PHY we are reading
from and if it is listed in this bitmask, ignore the read failure and
proceed with returning the data we read out of the controller.
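The check described above can be modelled roughly as follows. This is a
user-space sketch in Python, not the driver code: the function name
mdio_read_result() and the position of the MDIO_READ_FAIL bit are
assumptions made for illustration; only phy_ignore_ta_mask comes from
the description above.

```python
import errno

# Assumption for illustration: the controller reports a failed read by
# setting this bit in the command/status word.
MDIO_READ_FAIL = 1 << 28


def mdio_read_result(cmd, phy_addr, phy_ignore_ta_mask):
    """Return the 16-bit data read from the PHY, or -EIO on a read failure.

    A failure flag is ignored when the PHY address is listed in
    phy_ignore_ta_mask, because such PHYs are known not to release the
    MDIO line during turn-around.
    """
    ta_ignored = bool(phy_ignore_ta_mask & (1 << phy_addr))
    if (cmd & MDIO_READ_FAIL) and not ta_ignored:
        return -errno.EIO
    return cmd & 0xFFFF
```

With the PHY at address 30 listed in the mask, a flagged read still
returns its data; with an empty mask the same read fails with -EIO.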
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some Ethernet PHYs/switches such as Broadcom's BCM53125 have a hardware
bug which makes them not release the MDIO line during turn-around time.
This gets flagged by the GENET MDIO controller as a read failure, and we
fail the read transaction.
Check the MDIO bus phy_ignore_ta_mask bitmask for the PHY we are reading
from and if it is listed in this bitmask, ignore the read failure and
proceed with returning the data we read out of the controller.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add PHY IDs for Davicom DM9161B and DM9161C variants.
Tested with a DM9161C on a custom Atmel-based SAM9X25 board in RMII
mode.
The DM9161B uses the same model id with just the LSB bit of the version
id changing (which is masked out).
For all intents and purposes they're the same as the DM9161A with an
added GPSI mode and better fabrication process.
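Why a change in the LSB of the version id does not need its own entry
can be sketched as below. This is a Python model of masked PHY ID
matching; the ID and mask values are made up for illustration and are
not the Davicom IDs.

```python
def phy_id_matches(phy_id, drv_id, drv_mask):
    """True if the PHY's ID equals the driver's ID on all unmasked bits."""
    return (phy_id & drv_mask) == (drv_id & drv_mask)


# Hypothetical values: the low nibble (revision bits) is masked out.
EXAMPLE_DRV_ID = 0x12345670
EXAMPLE_MASK = 0xFFFFFFF0
```

Two parts that differ only in the masked-out bits, such as 0x12345670
and 0x12345671 here, bind to the same driver entry.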
Signed-off-by: Gustavo Zacarias <gustavo@zacarias.com.ar>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ethernet AVB device includes the gPTP timer, so we can implement a PTP clock
driver. We're doing that in a separate file, with the main Ethernet driver
calling the PTP driver's [de]initialization and interrupt handler functions.
Unfortunately, the clock seems tightly coupled with the AVB-DMAC, so when that
one leaves the operation mode, we have to unregister the PTP clock... :-(
Based on the original patches by Masaru Nagai.
Signed-off-by: Masaru Nagai <masaru.nagai.vx@renesas.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ethernet AVB includes a Gigabit Ethernet controller (E-MAC) that is basically
compatible with SuperH Gigabit Ethernet E-MAC. Ethernet AVB has a dedicated
direct memory access controller (AVB-DMAC) that is a new design compared to the
SuperH E-DMAC. The AVB-DMAC is compliant with 3 standards formulated for IEEE
802.1BA: IEEE 802.1AS timing and synchronization protocol, IEEE 802.1Qav
real-time transfer, and the IEEE 802.1Qat stream reservation protocol.
The driver only supports device tree probing, so the binding document is
included in this patch.
Based on the original patches by Mitsuhiro Kimura.
Signed-off-by: Mitsuhiro Kimura <mitsuhiro.kimura.kc@renesas.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CAIA Delay-Gradient (CDG) is a TCP congestion control that modifies
the TCP sender in order to [1]:
o Use the delay gradient as a congestion signal.
o Back off with an average probability that is independent of the RTT.
o Coexist with flows that use loss-based congestion control, i.e.,
flows that are unresponsive to the delay signal.
o Tolerate packet loss unrelated to congestion. (Disabled by default.)
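As an illustration of the first two points, the probabilistic backoff
can be sketched in Python. The exponential form 1 - exp(-g/G) is how we
read [1]; the function names and the scaling parameter "scale" (G) are
assumptions of this sketch and do not mirror the kernel code.

```python
import math
import random


def backoff_probability(gradient, scale):
    """Probability of backing off for a given delay gradient.

    Non-positive gradients (delay not increasing) never trigger a backoff.
    """
    if gradient <= 0:
        return 0.0
    return 1.0 - math.exp(-gradient / scale)


def should_back_off(gradient, scale, rng):
    """Draw one random decision; rng is a random.Random instance."""
    return rng.random() < backoff_probability(gradient, scale)
```

A larger gradient gives a higher backoff probability, but the value
always stays below 1, so a single delay spike never forces a backoff
with certainty.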
Its FreeBSD implementation was presented for the ICCRG in July 2012;
slides are available at http://www.ietf.org/proceedings/84/iccrg.html
Running the experiment scenarios in [1] suggests that our implementation
achieves more goodput compared with FreeBSD 10.0 senders, although it also
causes more queueing delay for a given backoff factor.
The loss tolerance heuristic is disabled by default due to safety concerns
for its use in the Internet [2, p. 45-46].
We use a variant of the Hybrid Slow start algorithm in tcp_cubic to reduce
the probability of slow start overshoot.
[1] D.A. Hayes and G. Armitage. "Revisiting TCP congestion control using
delay gradients." In Networking 2011, pages 328-341. Springer, 2011.
[2] K.K. Jonassen. "Implementing CAIA Delay-Gradient in Linux."
MSc thesis. Department of Informatics, University of Oslo, 2015.
Cc: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: David Hayes <davihay@ifi.uio.no>
Cc: Andreas Petlund <apetlund@simula.no>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Nicolas Kuhn <nicolas.kuhn@telecom-bretagne.eu>
Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upcoming tcp_cdg uses tcp_enter_cwr() to initiate PRR. Export this
function so that CDG can be compiled as a module.
Cc: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: David Hayes <davihay@ifi.uio.no>
Cc: Andreas Petlund <apetlund@simula.no>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Nicolas Kuhn <nicolas.kuhn@telecom-bretagne.eu>
Signed-off-by: Kenneth Klette Jonassen <kennetkl@ifi.uio.no>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function can be called by an IOMMU driver to request
that a device's default domain is direct mapped.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
If CONFIG_NET_SWITCHDEV is enabled, but port driver does not implement
support for IPv4 FIB add/del ops, don't fail route add/del offload
operations. Route adds will not be marked as OFFLOAD. Routes will be
installed in the kernel FIB, as usual.
This was reported/fixed by Florian when testing DSA driver with net-next on
devices with L2 offload support but no L3 offload support. What he reported
was an initial route installed from DHCP client would fail (route not
installed to kernel FIB). This was triggering the setting of
ipv4.fib_offload_disabled, which would disable route offloading after the
first failure. So subsequent attempts to install the route would succeed.
There is follow-on work/discussion to address the handling of route install
failures, but for now, let's differentiate between no support and failed
support.
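The distinction between "no support" and "failed support" can be
sketched like this. It is a Python model, not the switchdev code:
try_offload() and port_add_op are hypothetical names, and what the
caller does with a genuine error is left outside the sketch.

```python
import errno


def try_offload(route, port_add_op):
    """Return (err, offloaded) for one route add.

    port_add_op is the port driver's FIB add op, or None if the driver
    does not implement one. Missing support is not an error: the route
    is simply not marked as offloaded.
    """
    if port_add_op is None:
        return 0, False
    err = port_add_op(route)
    if err == -errno.EOPNOTSUPP:
        return 0, False
    if err:
        return err, False  # genuine failure, reported to the caller
    return 0, True
```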
Reported-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We do not check the return value of enic_dev_stats_dump(). If allocation
fails, we will hit NULL pointer dereference.
Return only if memory allocation fails. For other failures, we return the
previously recorded values.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli says:
====================
net: phy: broadcom: define pseudo-PHY address
This patch series converts existing in-tree users of the Broadcom pseudo-PHY
address (30) used to configure MDIO-connected switches to share a constant in a
shared header file.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Utilize the newly introduced BRCM_PSEUDO_PHY_ADDR constant from
brcmphy.h instead of open-coding the Broadcom Ethernet switches
pseudo-PHY address (30).
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
What BGMAC defines as BGMAC_PHY_NOREGS is in fact the Broadcom Ethernet
switches' pseudo-PHY address (30). Utilize the newly introduced constant
from brcmphy.h.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
What B44 has been locally using as B44_PHY_ADDR_NO_LOCAL_PHY is in fact
the Broadcom Ethernet switches pseudo-PHY address (30). Update the
header to use the newly introduced constant and update comments so they
are within 80 columns and consistent.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Define the pseudo-PHY address (30) which is used by all Broadcom
Ethernet switches in a shared header file.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We utilize inline functions from the PHY library, so make sure that we
include phy.h in brcmphy.h so that code including brcmphy.h does not
have to resolve this inclusion dependency.
Fixes: 705314797b ("net: phy: broadcom: move shadow 0x1C register accessors to brcmphy.h")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the crash kernel is loaded above 4GiB in memory, the
first kernel allocates only 72MiB of low-memory for the DMA
requirements of the second kernel. On systems with many
devices this is not enough and causes device driver
initialization errors and failed crash dumps. Testing by
SUSE and Redhat has shown that 256MiB is a good default
value for now and the discussion has led to this value as
well. So set this default value to 256MiB to make sure there
is enough memory available for DMA.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
[ Reflow comment. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jörg Rödel <joro@8bytes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: kexec@lists.infradead.org
Link: http://lkml.kernel.org/r/1433500202-25531-4-git-send-email-joro@8bytes.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When we boot a kdump kernel in high memory, there is by
default only 72MB of low memory available. The swiotlb code
takes 64MB of it (by default) so that there are only 8MB
left to allocate from. On systems with many devices this
causes page allocator warnings from
dma_generic_alloc_coherent():
systemd-udevd: page allocation failure: order:0, mode:0x280d4
CPU: 0 PID: 197 Comm: systemd-udevd Tainted: G W
3.12.28-4-default #1 Hardware name: HP ProLiant DL980 G7, BIOS
P66 07/30/2012 ffff8800781335e0 ffffffff8150b1db 00000000000280d4 ffffffff8113af90
0000000000000000 0000000000000000 ffff88007efdbb00 0000000100000000
0000000000000000 0000000000000000 0000000000000000 0000000000000001
Call Trace:
dump_trace+0x7d/0x2d0
show_stack_log_lvl+0x94/0x170
show_stack+0x21/0x50
dump_stack+0x41/0x51
warn_alloc_failed+0xf0/0x160
__alloc_pages_slowpath+0x72f/0x796
__alloc_pages_nodemask+0x1ea/0x210
dma_generic_alloc_coherent+0x96/0x140
x86_swiotlb_alloc_coherent+0x1c/0x50
ttm_dma_pool_alloc_new_pages+0xab/0x320 [ttm]
ttm_dma_populate+0x3ce/0x640 [ttm]
ttm_tt_bind+0x36/0x60 [ttm]
ttm_bo_handle_move_mem+0x55f/0x5c0 [ttm]
ttm_bo_move_buffer+0x105/0x130 [ttm]
ttm_bo_validate+0xc1/0x130 [ttm]
ttm_bo_init+0x24b/0x400 [ttm]
radeon_bo_create+0x16c/0x200 [radeon]
radeon_ring_init+0x11e/0x2b0 [radeon]
r100_cp_init+0x123/0x5b0 [radeon]
r100_startup+0x194/0x230 [radeon]
r100_init+0x223/0x410 [radeon]
radeon_device_init+0x6af/0x830 [radeon]
radeon_driver_load_kms+0x89/0x180 [radeon]
drm_get_pci_dev+0x121/0x2f0 [drm]
local_pci_probe+0x39/0x60
pci_device_probe+0xa9/0x120
driver_probe_device+0x9d/0x3d0
__driver_attach+0x8b/0x90
bus_for_each_dev+0x5b/0x90
bus_add_driver+0x1f8/0x2c0
driver_register+0x5b/0xe0
do_one_initcall+0xf2/0x1a0
load_module+0x1207/0x1c70
SYSC_finit_module+0x75/0xa0
system_call_fastpath+0x16/0x1b
0x7fac533d2788
After these warnings the code enters a fall-back path and
allocates directly from the swiotlb aperture in the end.
So remove these warnings as this is not a fatal error.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
[ Simplify, reflow comment. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jörg Rödel <joro@8bytes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: kexec@lists.infradead.org
Link: http://lkml.kernel.org/r/1433500202-25531-3-git-send-email-joro@8bytes.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Print a warning when all allocation attempts have failed
and the function is about to return NULL.
This prepares for calling the function with __GFP_NOWARN to
suppress allocation failure warnings before all fall-backs
have failed - which we'll do to improve kdump behavior.
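The intended behaviour can be sketched as a chain of silent attempts
followed by a single warning. This is a Python model with hypothetical
names (alloc_with_fallbacks, a list used as the log), not the swiotlb
code.

```python
def alloc_with_fallbacks(allocators, size, log):
    """Try each allocator in order; warn once, only if all of them fail.

    Each allocator is a callable returning a buffer or None. Individual
    attempts stay silent, which models calling them with __GFP_NOWARN.
    """
    for alloc in allocators:
        buf = alloc(size)
        if buf is not None:
            return buf
    log.append("coherent allocation failed, size=%d" % size)
    return None
```

A failing first attempt followed by a successful fall-back leaves the
log empty; the warning appears only when NULL (None) is returned.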
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jörg Rödel <joro@8bytes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: kexec@lists.infradead.org
Link: http://lkml.kernel.org/r/1433500202-25531-2-git-send-email-joro@8bytes.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
dctcp_alpha can be read from dctcp_get_info() without
synchronization, so use WRITE_ONCE() to prevent the compiler from using
dctcp_alpha as a temporary variable.
Also, playing with a small dctcp_shift_g (like 1) can expose
an overflow with 32bit values shifted 9 times before divide.
Use a u64 field to avoid this problem, and perform the divide
only if acked_bytes_ecn is not zero.
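The overflow can be reproduced with a small model. This Python sketch
emulates the fixed-width shift with a mask; update_alpha() and its
"width" parameter are illustrative, and the 1024 scale for alpha is
assumed from the "shifted 9 times" remark (10 - dctcp_shift_g with
dctcp_shift_g = 1).

```python
DCTCP_MAX_ALPHA = 1024  # assumed fixed-point scale: alpha in [0, 1024]


def update_alpha(alpha, acked_bytes_ecn, acked_bytes_total, shift_g, width=64):
    """One alpha update: alpha = (1 - g) * alpha + g * F, with g = 2**-shift_g.

    width selects the emulated integer size used for the shift, so that
    the 32-bit overflow can be compared against the 64-bit behaviour.
    """
    mask = (1 << width) - 1
    # Decay term; if the shifted value rounds down to zero, drop alpha fully.
    alpha -= (alpha >> shift_g) or alpha
    if acked_bytes_ecn:  # divide only when there is something to add
        scaled = (acked_bytes_ecn << (10 - shift_g)) & mask
        alpha = min(alpha + scaled // max(1, acked_bytes_total),
                    DCTCP_MAX_ALPHA)
    return alpha
```

With 8 MiB of ECN-marked bytes and dctcp_shift_g = 1, the 32-bit
variant wraps to zero before the divide, while the 64-bit variant
yields the expected half-scale contribution.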
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'mac80211-next-for-davem-2015-06-10' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
For this round we mostly have fixes:
* mesh fixes from Alexis Green and Chun-Yeow Yeoh,
* a documentation fix from Jakub Kicinski,
* a missing channel release (from Michal Kazior),
* a fix for a signal strength reporting bug (from Sara Sharon),
* handle deauth while associating (myself),
* don't report mangled TX SKB back to userspace for status (myself),
* handle aggregation session timeouts properly in fast-xmit (myself)
However, there are also a few cleanups and one big change that
affects all drivers (and that required me to pull in your tree)
to change the mac80211 HW flags to use an unsigned long bitmap
so that we can extend them more easily - we're running out of
flags even with a cleanup to remove the two unused ones.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
SCM_SECURITY was originally only implemented for datagram sockets,
not for stream sockets. However, SCM_CREDENTIALS is supported on
Unix stream sockets. For consistency, implement Unix stream support
for SCM_SECURITY as well. Also clean up the existing code and get
rid of the superfluous UNIXSID macro.
Motivated by https://bugzilla.redhat.com/show_bug.cgi?id=1224211,
where systemd was using SCM_CREDENTIALS and assumed wrongly that
SCM_SECURITY was also supported on Unix stream sockets.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the timer API function setup_timer instead of structure field
assignments to initialize a timer.
A simplified version of the Coccinelle semantic patch that performs
this transformation is as follows:
@change@
expression e1, e2, a;
@@
-init_timer(&e1);
+setup_timer(&e1, a, 0UL);
... when != a = e2
-e1.function = a;
Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch (V8) adds support for Cavium Liquidio PCI Express
based 10Gig Ethernet adapters.
1) Consolidated all debug macros to either call dev_* or
netdev_* macros directly, feedback from previous patch.
2) Changed soft commands to avoid crash when running
in interrupt context.
3) Fixed link status not reflecting correct status when NetworkManager
is running. Added MODULE_FIRMWARE declarations.
Following were the previous patches.
Patch V7:
1) Minor comments from v6 release regarding debug statements.
2) Fix for large multicast lists.
3) Fixed lockup issue if port initialization fails.
4) Enabled MSI by default.
https://patchwork.ozlabs.org/patch/464441/
Patch V6:
1) Addressed the uint64 vs u64 issue, feedback from previous patch.
2) Consolidated some receive processing routines.
3) Removed link status polling method.
https://patchwork.ozlabs.org/patch/459514/
Patch V5:
Based on the feedback from earlier patches with regards to
consolidation of common functions like device init, register
programming for cn66xx and cn68xx devices.
https://patchwork.ozlabs.org/patch/438979/
Patch V4:
Following were the changes based on the feedback from earlier patch:
1) Added mmiowb while synchronizing queue updates and other hw
interactions.
2) Statistics will now be incremented non-atomically per each ring.
liquidio_get_stats will add stats of each ring while reporting the
total statistics counts.
3) Modified liquidio_ioctl to return proper return codes.
4) Modified device naming to use standard Ethernet naming.
5) Global function names in the driver will have lio_/liquidio_/octeon_
prefix.
6) Ethtool related changes for:
Removed redundant stats and jiffies.
Use default ethtool handler of link status.
Speed setting will make use of ethtool_cmd_speed_set.
7) Added checks for pci_map_* return codes.
8) Check for signals while waiting in interruptible mode
https://patchwork.ozlabs.org/patch/435073/
Patch v3:
Implemented feedback from previous patch like:
Removed NAPI Config and DEBUG config options, added BQL and xmit_more
support.
https://patchwork.ozlabs.org/patch/422749/
Patch V2:
Implemented feedback from previous patch.
https://patchwork.ozlabs.org/patch/413539/
First Patch:
https://patchwork.ozlabs.org/patch/412946/
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Satanand Burla <satananda.burla@caviumnetworks.com>
Signed-off-by: Felix Manlunas <felix.manlunas@caviumnetworks.com>
Signed-off-by: Robert Richter <Robert.Richter@caviumnetworks.com>
Signed-off-by: Aleksey Makarov <Aleksey.Makarov@caviumnetworks.com>
Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds create/remove window ioctls to create and remove DMA windows.
sPAPR defines a Dynamic DMA windows capability which allows
para-virtualized guests to create additional DMA windows on a PCI bus.
The existing linux kernels use this new window to map the entire guest
memory and switch to the direct DMA operations saving time on map/unmap
requests which would normally happen in large amounts.
This adds 2 ioctl handlers - VFIO_IOMMU_SPAPR_TCE_CREATE and
VFIO_IOMMU_SPAPR_TCE_REMOVE - to create and remove windows.
Up to 2 windows are supported now by the hardware and by this driver.
This changes VFIO_IOMMU_SPAPR_TCE_GET_INFO handler to return additional
information such as the number of supported windows and the maximum number
of levels of TCE tables.
DDW is added as a capability, not as a SPAPR TCE IOMMU v2 unique feature
as we still want to support v2 on platforms which cannot do DDW for
the sake of TCE acceleration in KVM (coming soon).
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for the vfio related changes]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The existing implementation accounts the whole DMA window in
the locked_vm counter. This is going to be worse with multiple
containers and huge DMA windows. Also, real-time accounting would require
additional tracking of accounted pages due to the page size difference -
IOMMU uses 4K pages and system uses 4K or 64K pages.
Another issue is that actual pages pinning/unpinning happens on every
DMA map/unmap request. This does not affect the performance much now as
we spend way too much time now on switching context between
guest/userspace/host but this will start to matter when we add in-kernel
DMA map/unmap acceleration.
This introduces a new IOMMU type for SPAPR - VFIO_SPAPR_TCE_v2_IOMMU.
New IOMMU deprecates VFIO_IOMMU_ENABLE/VFIO_IOMMU_DISABLE and introduces
2 new ioctls to register/unregister DMA memory -
VFIO_IOMMU_SPAPR_REGISTER_MEMORY and VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY -
which receive user space address and size of a memory region which
needs to be pinned/unpinned and counted in locked_vm.
New IOMMU splits physical pages pinning and TCE table update
into 2 different operations. It requires:
1) guest pages to be registered first
2) subsequent map/unmap requests to work only with pre-registered memory.
For the default single window case this means that the entire guest
(instead of 2GB) needs to be pinned before using VFIO.
When a huge DMA window is added, no additional pinning will be
required, otherwise it would be guest RAM + 2GB.
The new memory registration ioctls are not supported by
VFIO_SPAPR_TCE_IOMMU. Dynamic DMA window and in-kernel acceleration
will require memory to be preregistered in order to work.
The accounting is done per the user process.
This advertises v2 SPAPR TCE IOMMU and restricts what the userspace
can do with v1 or v2 IOMMUs.
In order to support memory pre-registration, we need a way to track
the use of every registered memory region and only allow unregistration
if a region is not in use anymore. So we need a way to tell which
region the just cleared TCE came from.
This adds a userspace view of the TCE table into iommu_table struct.
It contains userspace address, one per TCE entry. The table is only
allocated when the ownership over an IOMMU group is taken which means
it is only used from outside of the powernv code (such as VFIO).
As v2 IOMMU supports IODA2 and pre-IODA2 IOMMUs (which do not support
DDW API), this creates a default DMA window for IODA2 for consistency.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for the vfio related changes]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
We are adding support for DMA memory pre-registration to be used in
conjunction with VFIO. The idea is that the userspace which is going to
run a guest may want to pre-register a user space memory region so
it all gets pinned once and never goes away. Having this done,
a hypervisor will not have to pin/unpin pages on every DMA map/unmap
request. This is going to help with multiple pinning of the same memory.
Another use of it is in-kernel real mode (mmu off) acceleration of
DMA requests where real time translation of guest physical to host
physical addresses is non-trivial and may fail as linux ptes may be
temporarily invalid. Also, having cached host physical addresses
(compared to just pinning at the start and then walking the page table
again on every H_PUT_TCE), we can be sure that the addresses which we put
into TCE table are the ones we already pinned.
This adds a list of memory regions to mm_context_t. Each region consists
of a header and a list of physical addresses. This adds API to:
1. register/unregister memory regions;
2. do final cleanup (which puts all pre-registered pages);
3. do userspace to physical address translation;
4. manage usage counters; multiple registration of the same memory
is allowed (once per container).
This implements 2 counters per registered memory region:
- @mapped: incremented on every DMA mapping; decremented on unmapping;
initialized to 1 when a region is just registered; once it becomes zero,
no more mappings are allowed;
- @used: incremented on every "register" ioctl; decremented on
"unregister"; unregistration is allowed for DMA mapped regions unless
it is the very last reference. For the very last reference this checks
that the region is still mapped and returns -EBUSY so the userspace
gets to know that memory is still pinned and unregistration needs to
be retried; @used remains 1.
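The interplay of the two counters can be modelled as below. This is a
Python sketch of the rules listed above, not the kernel code: the class
and method names are hypothetical, and the -ENXIO returned for a
mapping attempt on a released region is an assumption (the text above
only specifies -EBUSY for unregistration).

```python
import errno


class Region:
    """Model of one pre-registered memory region."""

    def __init__(self):
        self.used = 1    # one reference per "register" call
        self.mapped = 1  # starts at 1; zero means no more mappings

    def register(self):
        self.used += 1

    def dma_map(self):
        if self.mapped == 0:
            return -errno.ENXIO  # assumed error code for this sketch
        self.mapped += 1
        return 0

    def dma_unmap(self):
        if self.mapped > 1:  # never drop the initial reference here
            self.mapped -= 1

    def unregister(self):
        if self.used > 1:
            self.used -= 1
            return 0
        # Very last reference: refuse while DMA mappings still exist.
        if self.mapped != 1:
            return -errno.EBUSY  # @used remains 1, caller retries later
        self.mapped = 0
        self.used = 0
        return 0
```

A region that is still DMA mapped rejects its last unregistration with
-EBUSY; once the mapping is gone, the retry succeeds and any later
mapping attempt fails.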
Host physical addresses are stored in vmalloc'ed array. In order to
access these in the real mode (mmu off), there is a real_vmalloc_addr()
helper. In-kernel acceleration patchset will move it from KVM to MMU code.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Previously, the IOMMU user (VFIO) would take control over the IOMMU table
belonging to a specific IOMMU group. This approach did not allow sharing
tables between IOMMU groups attached to the same container.
This introduces a new IOMMU ownership flavour where the user can not
only control the existing IOMMU table but also remove/create tables on demand.
If an IOMMU implements take/release_ownership() callbacks, this lets
the user have full control over the IOMMU group. When the ownership
is taken, the platform code removes all the windows so the caller must
create them.
Before returning the ownership back to the platform code, VFIO
unprograms and removes all the tables it created.
This changes IODA2's ownership handler to remove the existing table
rather than manipulating the existing one. From now on,
iommu_take_ownership() and iommu_release_ownership() are only called
from the vfio_iommu_spapr_tce driver.
Old-style ownership is still supported allowing VFIO to run on older
P5IOC2 and IODA IO controllers.
No change in userspace-visible behaviour is expected. Since it recreates
TCE tables on each ownership change, related kernel traces will appear
more often.
This adds a pnv_pci_ioda2_setup_default_config() which is called
when PE is being configured at boot time and when the ownership is
passed from VFIO to the platform code.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[aw: for the vfio related changes]
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This adds a way for the IOMMU user to know how much a new table will
use so it can be accounted in the locked_vm limit before allocation
happens.
This stores the allocated table size in pnv_pci_ioda2_get_table_size()
so the locked_vm counter can be updated correctly when a table is
being disposed.
This defines an iommu_table_group_ops callback to let VFIO know
how much memory will be locked if a table is created.
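To give a feel for the numbers involved, here is a simplified Python
estimate for a single-level table only. It assumes 8-byte TCE entries
and a minimum table size of 4K; the function name is hypothetical and
the real callback also has to account for multi-level tables, which
this sketch ignores.

```python
TCE_ENTRY_BYTES = 8  # assumption: one 64-bit entry per IOMMU page


def single_level_table_bytes(window_size, page_shift, min_bytes=4096):
    """Estimate the memory needed for a single-level TCE table.

    window_size is the DMA window size in bytes (a power of two),
    page_shift is log2 of the IOMMU page size.
    """
    if window_size <= 0 or window_size & (window_size - 1):
        raise ValueError("window_size must be a power of two")
    entries = window_size >> page_shift
    return max(min_bytes, entries * TCE_ENTRY_BYTES)
```

Under these assumptions a 2GB window with 4K IOMMU pages needs 4MB of
table memory, and the same window with 64K pages needs 256KB.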
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The existing code programmed TVT#0 with some address and then
immediately released that memory.
This makes use of pnv_pci_ioda2_unset_window() and
pnv_pci_ioda2_set_bypass() which do correct resource release and
TVT update.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>