linux

Author	SHA1	Message	Date
Vlad Yasevich	75a75ee46b	mlx4: Remove driver specific fdb handlers. Remove driver specific fdb hadlers since they are the same as the default ones. CC: Amir Vadai <amirv@mellanox.com> CC: Yan Burman <yanb@mellanox.com> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-07 15:29:46 -05:00
Sasha Levin	b67bfe0d42	hlist: drop the node parameter from iterators I'm not sure why, but the hlist for each entry iterators were conceived list_for_each_entry(pos, head, member) The hlist ones were greedy and wanted an extra parameter: hlist_for_each_entry(tpos, pos, head, member) Why did they need an extra pos parameter? I'm not quite sure. Not only they don't really need it, it also prevents the iterator from looking exactly like the list iterator, which is unfortunate. Besides the semantic patch, there was some manual work required: - Fix up the actual hlist iterators in linux/list.h - Fix up the declaration of other iterators based on the hlist ones. - A very small amount of places were using the 'node' parameter, this was modified to use 'obj->member' instead. - Coccinelle didn't handle the hlist_for_each_entry_safe iterator properly, so those had to be fixed up manually. The semantic patch which is mostly the work of Peter Senna Tschudin is here: @@ iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host; type T; expression a,c,d,e; identifier b; statement S; @@ -T b; <+... when != b ( hlist_for_each_entry(a, - b, c, d) S \| hlist_for_each_entry_continue(a, - b, c) S \| hlist_for_each_entry_from(a, - b, c) S \| hlist_for_each_entry_rcu(a, - b, c, d) S \| hlist_for_each_entry_rcu_bh(a, - b, c, d) S \| hlist_for_each_entry_continue_rcu_bh(a, - b, c) S \| for_each_busy_worker(a, c, - b, d) S \| ax25_uid_for_each(a, - b, c) S \| ax25_for_each(a, - b, c) S \| inet_bind_bucket_for_each(a, - b, c) S \| sctp_for_each_hentry(a, - b, c) S \| sk_for_each(a, - b, c) S \| sk_for_each_rcu(a, - b, c) S \| sk_for_each_from -(a, b) +(a) S + sk_for_each_from(a) S \| sk_for_each_safe(a, - b, c, d) S \| sk_for_each_bound(a, - b, c) S \| hlist_for_each_entry_safe(a, - b, c, d, e) S \| hlist_for_each_entry_continue_rcu(a, - b, c) S \| nr_neigh_for_each(a, - b, c) S \| nr_neigh_for_each_safe(a, - b, c, d) S \| nr_node_for_each(a, - b, c) S \| nr_node_for_each_safe(a, - b, c, d) S \| - for_each_gfn_sp(a, c, d, b) S + for_each_gfn_sp(a, c, d) S \| - for_each_gfn_indirect_valid_sp(a, c, d, b) S + for_each_gfn_indirect_valid_sp(a, c, d) S \| for_each_host(a, - b, c) S \| for_each_host_safe(a, - b, c, d) S \| for_each_mesh_entry(a, - b, c, d) S ) ...+> [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c] [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c] [akpm@linux-foundation.org: checkpatch fixes] [akpm@linux-foundation.org: fix warnings] [akpm@linux-foudnation.org: redo intrusive kvm changes] Tested-by: Peter Senna Tschudin <peter.senna@gmail.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-27 19:10:24 -08:00
Linus Torvalds	1cef9350cb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) ping_err() ICMP error handler looks at wrong ICMP header, from Li Wei. 2) TCP socket hash function on ipv6 is too weak, from Eric Dumazet. 3) netif_set_xps_queue() forgets to drop mutex on errors, fix from Alexander Duyck. 4) sum_frag_mem_limit() can deadlock due to lack of BH disabling, fix from Eric Dumazet. 5) TCP SYN data is miscalculated in tcp_send_syn_data(), because the amount of TCP option space was not taken into account properly in this code path. Fix from yuchung Cheng. 6) MLX4 driver allocates device queues with the wrong size, from Kleber Sacilotto. 7) sock_diag can access past the end of the sock_diag_handlers[] array, from Mathias Krause. 8) vlan_set_encap_proto() makes incorrect assumptions about where skb->data points, rework the logic so that it works regardless of where skb->data happens to be. From Jesse Gross. 9) Fix gianfar build failure with NET_POLL enabled, from Paul Gortmaker. 10) Fix Ipv4 ID setting and checksum calculations in GRE driver, from Pravin B Shelar. 11) bgmac driver does: int i; for (i = 0; ...; ...) { ... for (i = 0; ...; ...) { effectively corrupting the outer loop index, use a seperate variable for the inner loops. From Rafał Miłecki. 12) Fix suspend bugs in smsc95xx driver, from Ming Lei. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits) usbnet: smsc95xx: rename FEATURE_AUTOSUSPEND usbnet: smsc95xx: fix broken runtime suspend usbnet: smsc95xx: fix suspend failure bgmac: fix indexing of 2nd level loops b43: Fix lockdep splat on module unload Revert "ip_gre: propogate target device GSO capability to the tunnel device" IP_GRE: Fix GRE_CSUM case. VXLAN: Use tunnel_ip_select_ident() for tunnel IP-Identification. IP_GRE: Fix IP-Identification. net/pasemi: Fix missing coding style vmxnet3: fix ethtool ring buffer size setting vmxnet3: make local function static bnx2x: remove dead code and make local funcs static gianfar: fix compile fail for NET_POLL=y due to struct packing vlan: adjust vlan_set_encap_proto() for its callers sock_diag: Simplify sock_diag_handlers[] handling in __sock_diag_rcv_msg sock_diag: Fix out-of-bounds access to sock_diag_handlers[] vxlan: remove depends on CONFIG_EXPERIMENTAL mlx4_en: fix allocation of CPU affinity reverse-map mlx4_en: fix allocation of device tx_cq ...	2013-02-26 11:44:11 -08:00
Linus Torvalds	70a3a06d01	Main batch of InfiniBand/RDMA changes for 3.9: - SRP error handling fixes from Bart Van Assche - Implementation of memory windows for mlx4 from Shani Michaeli - Lots of cxgb4 HW driver fixes from Vipul Pandya - Make iSER work for virtual functions, other fixes from Or Gerlitz - Fix for bug in qib HW driver from Mike Marciniszyn - IPoIB fixes from me, Itai Garbi, Shlomo Pongratz, Yan Burman - Various cleanups and warning fixes from Julia Lawall, Paul Bolle, Wei Yongjun -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABCAAGBQJRLPNoAAoJEENa44ZhAt0h0bMP/1xqlgDR3DGdoUoV4yTd6O9G Uccdb7og5o5tedVo+8xZz01y4at99P8FWW4hJ4os6k8n6yKoBVwo7qjN+BOR+JG5 q8+O+ynUSIg4tGrb5sXcMnKXAXbw/vkftMWYNA41cbrM24DTYzB/2SLpvhwbFoTT tdQc2tgz5QaDqzWbagyCR4+k/IgO+Llrz/RvIdtz4dsTnTDogN7QCoSffX8n/Lpb DxtyXK4sdl3DAtd3CsIdsB/TSMb3RkHLCoSvmrWlLnqMdsbRxVnCVfBm4BOghW3J Y2K3joRoCjjIZSRNs/i0FMFkT/jbCXg1oXg9ek/a6YFNcgyk7z8iGyXrRY7fOnno 8U2SfxJ69YpVYeJr+DSjaeHcmjsaYU7NN7JPxzvPKcJOIsxQJ/euJDXAXau3lEQY o9/p4JsGty0WHi1NanyygvghvBAoP1C5/59Sl4bHH5gckPyJT1kinPSCTT76YXGS WkSHg2mDhiJHy7Pnuy85iZldPoy2/5z09/I4aGMeL+8kUZbD4iFqzXIJU0HTsAim EONoRXDhIcN5DNVSVH1ig6nJ2a7Vhov4Z0r/vB8P4KhslBcqFwf2leC0eCoe5mNt SzcKhqosZDXoL8AwzpntzGIOid8pWmHbUx/PgIcoVXPjtl0h2ULNIFoYYyMZ3cyU AyN2tSiUZVddTV1/aKGL =RAQw -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband update from Roland Dreier: "Main batch of InfiniBand/RDMA changes for 3.9: - SRP error handling fixes from Bart Van Assche - Implementation of memory windows for mlx4 from Shani Michaeli - Lots of cxgb4 HW driver fixes from Vipul Pandya - Make iSER work for virtual functions, other fixes from Or Gerlitz - Fix for bug in qib HW driver from Mike Marciniszyn - IPoIB fixes from me, Itai Garbi, Shlomo Pongratz, Yan Burman - Various cleanups and warning fixes from Julia Lawall, Paul Bolle, Wei Yongjun" * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (41 commits) IB/mlx4: Advertise MW support IB/mlx4: Support memory window binding mlx4: Implement memory windows allocation and deallocation mlx4_core: Enable memory windows in {INIT, QUERY}_HCA mlx4_core: Disable memory windows for virtual functions IPoIB: Free ipoib neigh on path record failure so path rec queries are retried IB/srp: Fail I/O requests if the transport is offline IB/srp: Avoid endless SCSI error handling loop IB/srp: Avoid sending a task management function needlessly IB/srp: Track connection state properly IB/mlx4: Remove redundant NULL check before kfree IB/mlx4: Fix compiler warning about uninitialized 'vlan' variable IB/mlx4: Convert is_xxx variables in build_mlx_header() to bool IB/iser: Enable iser when FMRs are not supported IB/iser: Avoid error prints on EAGAIN registration failures IB/iser: Use proper define for the commands per LUN value advertised to SCSI ML IB/uverbs: Implement memory windows support in uverbs IB/core: Add "type 2" memory windows support mlx4_core: Propagate MR deregistration failures to caller mlx4_core: Rename MPT-related functions to have mpt_ prefix ...	2013-02-26 11:41:08 -08:00
Shani Michaeli	804d6a89a5	mlx4: Implement memory windows allocation and deallocation Implement MW allocation and deallocation in mlx4_core and mlx4_ib. Pass down the enable bind flag when registering memory regions. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-25 10:44:32 -08:00
Shani Michaeli	e448834e35	mlx4_core: Enable memory windows in {INIT, QUERY}_HCA Add memory windows-related code to INIT_HCA and QUERY_HCA. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-25 10:44:31 -08:00
Shani Michaeli	cc1ade94ee	mlx4_core: Disable memory windows for virtual functions Do not enable memory windows allocation for virtual functions. In addition, add a few safety checks, such as: * Verifying the PD of a new MPT matches the VF. * Making sure binding memory window isn't enabled for FMRs, and that new memory windows are not FMR themselves. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-25 10:44:31 -08:00
Kleber Sacilotto de Souza	3770699675	mlx4_en: fix allocation of CPU affinity reverse-map The mlx4_en driver allocates the number of objects for the CPU affinity reverse-map based on the number of rx rings of the device. However, mlx4_assign_eq() calls irq_cpu_rmap_add() as many times as IRQ's are assigned to EQ's, which can be as large as mlx4_dev->caps.comp_pool. If caps.comp_pool is larger than rx_ring_num we will eventually hit the BUG_ON() in cpu_rmap_add(). Fix this problem by allocating space for the maximum number of CPU affinity reverse-map objects we might want to add. Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Acked-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-23 13:51:54 -05:00
Kleber Sacilotto de Souza	427a96252d	mlx4_en: fix allocation of device tx_cq The memory to hold the network device tx_cq is not being allocated with the correct size in mlx4_en_init_netdev(). It should use MAX_TX_RINGS instead of MAX_RX_RINGS. This can cause problems if the number of tx rings being used is greater than MAX_RX_RINGS. Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Acked-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-23 13:51:54 -05:00
Linus Torvalds	9afa3195b9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree from Jiri Kosina: "Assorted tiny fixes queued in trivial tree" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (22 commits) DocBook: update EXPORT_SYMBOL entry to point at export.h Documentation: update top level 00-INDEX file with new additions ARM: at91/ide: remove unsused at91-ide Kconfig entry percpu_counter.h: comment code for better readability x86, efi: fix comment typo in head_32.S IB: cxgb3: delay freeing mem untill entirely done with it net: mvneta: remove unneeded version.h include time: x86: report_lost_ticks doesn't exist any more pcmcia: avoid static analysis complaint about use-after-free fs/jfs: Fix typo in comment : 'how may' -> 'how many' of: add missing documentation for of_platform_populate() btrfs: remove unnecessary cur_trans set before goto loop in join_transaction sound: soc: Fix typo in sound/codecs treewide: Fix typo in various drivers btrfs: fix comment typos Update ibmvscsi module name in Kconfig. powerpc: fix typo (utilties -> utilities) of: fix spelling mistake in comment h8300: Fix home page URL in h8300/README xtensa: Fix home page URL in Kconfig ...	2013-02-21 17:40:58 -08:00
Shani Michaeli	6108372070	mlx4_core: Propagate MR deregistration failures to caller MR deregistration fails when memory windows are bound to the MR. Handle such failures by propagating them to the caller ULP. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-21 11:38:43 -08:00
Shani Michaeli	b20e519a81	mlx4_core: Rename MPT-related functions to have mpt_ prefix The MPT - Memory Protection Table - is used by both memory windows and memory regions. Hence, all MPT references are relevant for both types of memory objects. Rename the relevant functions to start with mpt_ instead of the current mr_ prefix. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-21 11:37:23 -08:00
Vlad Yasevich	1690be63a2	bridge: Add vlan support to static neighbors When a user adds bridge neighbors, allow him to specify VLAN id. If the VLAN id is not specified, the neighbor will be added for VLANs currently in the ports filter list. If no VLANs are configured on the port, we use vlan 0 and only add 1 entry. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-13 19:42:16 -05:00
David S. Miller	fd5023111c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Synchronize with 'net' in order to sort out some l2tp, wireless, and ipv6 GRE fixes that will be built on top of in 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-08 18:02:14 -05:00
Joe Perches	14f8dc4953	drivers: net: Remove remaining alloc/OOM messages alloc failures already get standardized OOM messages and a dump_stack. For the affected mallocs around these OOM messages: Converted kmallocs with multiplies to kmalloc_array. Converted a kmalloc/memcpy to kmemdup. Removed now unused stack variables. Removed unnecessary parentheses. Neatened alignment. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Arend van Spriel <arend@broadcom.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-08 17:44:39 -05:00
Tom Herbert	41b749201b	mlx4_en: Fix BQL reset TX queue call point Fix issue in Mellanox driver related to BQL. netdev_tx_reset_queue was not being called in certain situations where the device was being start and stopped. Moved netdev_tx_reset_queue from the reset device path to mlx4_en_free_tx_buf which is where the rings are cleaned in a reset (specifically from device being stopped). Signed-off-by: Tom Herbert <therbert@google.com> Acked-By: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:33:51 -05:00
Yan Burman	0ccddcd1c2	net/mlx4_en: Implement ndo fdb functionality Add support for setting embedded switch fdb in case of SRIOV, by implementing ndo_fdb_{add, del, dump}. This will allow to use bridged configuration with multi-function. In order to add VM MAC to the eSwitch fdb, the following command may be used over the relevant function interface: bridge fdb add <MAC> permanent self dev <IFACE> Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:26:13 -05:00
Yan Burman	cc5387f734	net/mlx4_en: Add unicast MAC filtering Implement and advertise unicast MAC filtering, such that setting macvlan instance over mlx4_en interfaces will not require the networking core to put mlx4_en devices in promiscuous mode. If for some reason adding a unicast address filter fails e.g as of missing space in the HW mac table, the device forces itself into promiscuous mode (and out of this forced state when enough space is available). Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:26:13 -05:00
Yan Burman	c07cb4b0ab	net/mlx4_en: Manage hash of MAC addresses per port As a preparation step for supporting multiple unicast addresses, store MAC addresses in hash table. Remove the radix tree for MAC addresses per QP, as it's not in use. Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:26:13 -05:00
Yan Burman	90bbb74af6	net/mlx4_en: Save previous MAC address of the port so we can replace it later In preparation to having more than one unicast MAC per port, we need to keep track of the previous MAC address in the flow of ndo_set_mac_address, so that mlx4_en_replace_mac will know what to replace. Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:26:13 -05:00
Yan Burman	0eb74fdda4	net/mlx4_en: Re-arrange ndo_set_rx_mode related code Currently, mlx4_en_do_set_multicast serves as the ndo_set_rx_mode entry for mlx4_en, doing all related work. Split it to few calls, one per required functionality (e.g multicast, promiscuous, etc) and rename some structures and calls to use rx_mode notation instead of multicast. Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:26:13 -05:00
Yan Burman	16a10ffd20	net/mlx4: Move Ethernet related functionality from mlx4_core to mlx4_en Move low level code that deals with management of Ethernet MACs and QPs from mlx4_core to mlx4_en. Also convert the new functions to deal with MACs in form of char array instead of u64. Actual functions moved: mlx4_replace_mac mlx4_get_eth_qp mlx4_put_eth_qp To conduct this change, some functionality had to be exported from the core, the following functions were added: mlx4_get_base_qp __mlx4_replace_mac (low level function for CX1/A0 compatibility) Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:26:12 -05:00
Yan Burman	48e551ff3d	net/mlx4_en: Cleanup multiline strings Make the code consistent in regard to error messages not spanning multiple lines. Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:26:12 -05:00
Yan Burman	6bbb6d99f3	net/mlx4_en: Optimize Rx fast path filter checks Currently, RX path code that does RX filtering is not optimized and does an expensive conversion. In order to use ether_addr_equal_64bits which is optimized for such cases, we need the MAC address kept by the device to be in the form of unsigned char array instead of u64. Store the MAC address as unsigned char array and convert to/from u64 out of the fast path when needed. Side effect of this is that we no longer need priv->mac, since it's the same as dev->dev_addr. This optimization was suggested by Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:26:12 -05:00
Yan Burman	79aeaccd91	net/mlx4_en: Optimize loopback related checks in data path Currently there are relatively complex conditional checks in the fast path, for TX loopback enabling and resulting RX filter logic. Move elaborate if's out of data path, replace them with a single flag for each state and update that state from appropriate places. Also, in native (non SRIOV) mode and not in loopback or in selftest, there is no need to try and filter out packets that HW loopback-ed, as in native mode we do not loopback packets anymore. Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-07 23:26:12 -05:00
Linus Torvalds	bb5204c2eb	IB regression fixes for 3.8: - Fix mlx4 VFs not working on old guests because of 64B CQE changes - Fix ill-considered sparse fix for qib - Fix IPoIB crash due to skb double destruct introduced in 3.8-rc1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABCAAGBQJREutcAAoJEENa44ZhAt0hjFwP/3SCQr/eboXyJV9GBlmbU9y2 X7t6JS1n9R5KxBj54XBL8ZA7qcaw7rQj8VgPC4qlMWTR1/fXHsrtiRtQU1VMBcBP Eh50iE5BEq4kpK93IYZZei+I7J/0O1Hpj1JwUuvGr7/hQltMWXItuPTlO4Ror5Kk Vvu/waQh9DDp1uQRSPbSqAhEx7cGbl27UT7BLPqszVla59GA8UOUcfit8I9CyTmk ySP2zrDC7JtPoOPYy6w32K4NSjp3KTR4EHWX0G3t/b0LvwEHARwQZ3RI4ZjNMqLl mtKfqaYjqCeSlaT6MAODlN0aTp38GFAU0RaGePL5GurxQExwGnVZfTRUJDkNGTGO vPDq6+L6XwPHgYTs1knafs3OT24nwv/vzZ/SLV7gcssbxdL8Cru16E4CO3Vpryrl 5B0w2+ld+L1lw/m4rSuqzQYpS6NpW35ATKzMhQNwk9cLCLNCOqv247WDvhBZDnpV lhLQ+RGs6DK7CQQ8w4rYLFBVk+1kPlZYILV0Rjni6vv7w9S/byVrshqE8eIkQwqE BEl0gMc83VZj5WH5s5MJEx+T5H2lZ80rIDKuamSz7wEduXWWENEqj5k7mBHa66Sn 0aHcrXDe26Cj1TUCGbrgeFPMlucVAK+fSjvEzZrzQwxLspnKXlFw5v0DvqmTqBok hO0iE4ajfXl9RfIC7KrK =bmKU -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull IB regression fixes from Roland Dreier: - Fix mlx4 VFs not working on old guests because of 64B CQE changes - Fix ill-considered sparse fix for qib - Fix IPoIB crash due to skb double destruct introduced in 3.8-rc1 * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/qib: Fix for broken sparse warning fix mlx4_core: Fix advertisement of wrong PF context behaviour IPoIB: Fix crash due to skb double destruct	2013-02-08 12:15:14 +11:00
Or Gerlitz	f97b4b5d46	mlx4_core: Fix advertisement of wrong PF context behaviour Commit `08ff32352d` ("mlx4: 64-byte CQE/EQE support") introduced a regression where older guest VF drivers failed to load even when 64-byte EQEs/CQEs are disabled, since the PF wrongly advertises the new context behaviour anyway. The failure looks like: mlx4_core 0000:00:07.0: Unknown pf context behaviour mlx4_core 0000:00:07.0: Failed to obtain slave caps mlx4_core: probe of 0000:00:07.0 failed with error -38 Fix this by basing this advertisement on dev->caps.flags, which is the operational capabilities used by the QUERY_FUNC_CAP command wrapper (dev_cap->flags holds the firmware capabilities). Reported-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2013-02-05 09:35:41 -08:00
Hadar Hen Zion	f9d96862ca	net/mlx4_en: Fix compilation error when CONFIG_INET isn't defined ip_eth_mc_map function can't be used when CONFIG_INET isn't defined. Fixed compilation error by adding CONFIG_INET define check before using the function. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-04 13:26:50 -05:00
Hadar Hen Zion	377d97393d	net/mlx4_en: Fix error propagation for ethtool helper function Propagate return value of mlx4_en_ethtool_add_mac_rule_by_ipv4 in case of failure. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-04 13:26:50 -05:00
Joe Perches	b2adaca92c	ethernet: Remove unnecessary alloc/OOM messages, alloc cleanups alloc failures already get standardized OOM messages and a dump_stack. Convert kzalloc's with multiplies to kcalloc. Convert kmalloc's with multiplies to kmalloc_array. Fix a few whitespace defects. Convert a constant 6 to ETH_ALEN. Use parentheses around sizeof. Convert vmalloc/memset to vzalloc. Remove now unused size variables. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-02-04 13:22:33 -05:00
Amir Vadai	3484aac161	net/mlx4_en: Fix transmit timeout when driver restarts port Under heavy CPU load, changing, ring size/mtu/etc. could result in transmit timeout, since stop-start port might take more than 10 seconds. Calling netif_detach_device to prevent tx queue transmit timeout. netif_detach_device() is not called under ndo_stop, because netif_carrier_off will prevent the timeout, and device should not be marked as not present, or else user won't be able to start it later on. CC: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-31 12:48:48 -05:00
Matan Barak	955154fa33	net/mlx4_en: Don't reassign port mac address on firmware that supports it Mac reassignments should only be done when not supported by the firmware. To accomplish that, checking firmware capability bit to know whether we should reassign macs in the driver. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-31 12:48:47 -05:00
Hadar Hen Zion	23537b732f	net/mlx4_core: Use firmware driven flow steering hash mode The Firmware dynamically changes flow steering hash configuration from covering L2 only to "full" L2/L3/L4 mode needed. The dynamic change allows the driver to set hard coded hash configuration which is changed by the firmware from L2 to L2/L3/L4 when attaching the first L3/L4 flow steering rule and back to L2 when there are no more such rules. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-31 12:48:47 -05:00
Hadar Hen Zion	0d256c0e93	net/mlx4_en: Fix ethtool rules leftovers after module unloaded As part of the driver unload flow, all steering rules must be deleted, make sure to remove the rules that were set through ethtool. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-31 12:48:47 -05:00
Hadar Hen Zion	280fce1e3e	net/mlx4_en: Block insertion of ethtool steering rules while the interface is down Attaching steering rules while the interface is down is an invalid operation, block it. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-31 12:48:47 -05:00
Hadar Hen Zion	8258bd2713	net/mlx4_en: Fix vlan mask for ethtool steering rules The vlan mask field should be validated and assigned according to the field size which is 12 bits. Also replace the numeric 0xfff mask with existing kernel macro. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-31 12:48:47 -05:00
Hadar Hen Zion	69d7126b7f	net/mlx4_en: Validate VLAN IDs provided in ethtool flow steering rules When attaching flow steering rules via Ethtool accept only valid vlans IDs e.g in the range: [0,4095]. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-31 12:48:46 -05:00
Hadar Hen Zion	f90a36734a	net/mlx4_en: Fix ip/udp steering rules multicast mac when attached via ethtool Destination mac is a mandatory specification for ip/udp steering rules. When attaching multicast steering rules via ethtool the unicast mac of the interface was added to the rule specification instead of the multicast mac. The following commit sets the corresponding multicast mac for the rule multicast ip. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-31 12:48:46 -05:00
Hadar Hen Zion	248c62aa12	net/mlx4_core: Set correctly allow_loopback flag The allow_loopback flag was wrongly set using arithmetic bit operation, change the code to use logical bit operation. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-31 12:48:46 -05:00
Hadar Hen Zion	015465f851	net/mlx4_core: Directly expose fields of HW flow steering rule control segment Some of the fields for struct mlx4_net_trans_rule_hw_ctrl were packed into u32 and accessed through bit field operations. Expose and access them directly as u8. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-31 12:48:46 -05:00
David S. Miller	f1e7b73acc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Bring in the 'net' tree so that we can get some ipv4/ipv6 bug fixes that some net-next work will build upon. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-29 15:32:13 -05:00
Jiri Kosina	617677295b	Merge branch 'master' into for-next Conflicts: drivers/devfreq/exynos4_bus.c Sync with Linus' tree to be able to apply patches that are against newer code (mvneta).	2013-01-29 10:48:30 +01:00
Amir Vadai	78fb2de711	net/mlx4_en: Initialize RFS filters lock and list in init_netdev filters_lock might have been used while it was re-initialized. Moved filters_lock and filters_list initialization to init_netdev instead of alloc_resources which is called every time the device is configured. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-28 00:14:28 -05:00
Amir Vadai	7225922558	net/mlx4_en: Fix a race when closing TX queue There is a possible race where the TX completion handler can clean the entire TX queue between the decision that the queue is full and actually closing it. To avoid this situation, check again if the queue is really full, if not, reopen the transmit and continue with sending the packet. CC: Eric Dumazet <edumazet@google.com> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-28 00:14:24 -05:00
Jack Morgenstein	f356fcbe12	net/mlx4_core: Return proper error code when __mlx4_add_one fails Returning 0 (success) when in fact we are aborting the load, leads to kernel panic when unloading the module. Fix that by returning the actual error code. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-28 00:13:57 -05:00
Eugenia Emantayev	dbd501a806	net/mlx4_en: Use the correct netif lock on ndo_set_rx_mode The device multicast list is protected by netif_addr_lock_bh in the networking core, we should use this locking practice in mlx4_en too. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-28 00:13:56 -05:00
Aviad Yehezkel	db0e7cba6d	net/mlx4_en: Fix traffic loss under promiscuous mode When port is stopped and flow steering mode is not device managed: promisc QP rule wasn't removed from MCG table. Added code to remove it in all flow steering modes. In addition, promsic rule removal should be in stop port and not in start port - moved it accordingly. Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-28 00:13:56 -05:00
Eugenia Emantayev	2d51837fa1	net/mlx4_en: Issue the dump eth statistics command under lock Performing the DUMP_ETH_STATS firmware command outside the lock leads to kernel panic when data structures such as RX/TX rings are freed in parallel, e.g when one changes the mtu or ring sizes. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-28 00:13:56 -05:00
Eric Dumazet	546bfedbe6	net/mlx4_en: remove redundant code remove redundant code from build_inline_wqe() Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-By: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-20 23:10:52 -05:00
Or Gerlitz	ca4c7b35f7	net/mlx4_core: Set number of msix vectors under SRIOV mode to firmware defaults The lines if (mlx4_is_mfunc(dev)) { nreq = 2; } else { which hard code the number of requested msi-x vectors under multi-function mode to two can be removed completely, since the firmware sets num_eqs and reserved_eqs appropriately Thus, the code line: nreq = min_t(int, dev->caps.num_eqs - dev->caps.reserved_eqs, nreq); is by itself sufficient and correct for all cases. Currently, for mfunc mode num_eqs = 32 and reserved_eqs = 28, hence four vectors will be enabled. This triples (one vector is used for the async events and commands EQ) the horse power provided for processing of incoming packets on netdev RSS scheme, IO initiators/targets commands processing flows, etc. Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-18 14:25:28 -05:00
Yan Burman	213815a1e6	net/mlx4_en: Fix bridged vSwitch configuration for non SRIOV mode Commit `5b4c4d3686` "mlx4_en: Allow communication between functions on same host" introduced a regression under which a bridge acting as vSwitch whose uplink is an mlx4 Ethernet device become non-operative in native (non sriov) mode. This happens since broadcast ARP requests sent by VMs were loopback-ed by the HW and hence the bridge learned VM source MACs on both the VM and the uplink ports. The fix is to place the DMAC in the send WQE only under SRIOV/eSwitch configuration or when the device is in selftest. Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-18 14:25:28 -05:00
Jiri Pirko	aaeb6cdfa5	remove init of dev->perm_addr in drivers perm_addr is initialized correctly in register_netdevice() so to init it in drivers is no longer needed. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-01-08 18:00:48 -08:00
Jorrit Schippers	d82603c6da	treewide: Replace incomming with incoming in all comments and strings Signed-off-by: Jorrit Schippers <jorrit@ncode.nl> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2013-01-03 16:15:49 +01:00
Jack Morgenstein	3c439b5586	mlx4_core: Allow choosing flow steering mode Device managed flow steering will be enabled only under administrator directive provided through setting the existing module parameter log_num_mgm_entry_size to -1 (if the device actually supports flow steering). If flow steering isn't requested or not available, the driver will use the value of log_num_mgm_entry_size and B0 steering. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-12-19 11:47:22 -08:00
Jack Morgenstein	7b8157bedc	mlx4_core: Adjustments to Flow Steering activation logic for SR-IOV Separate flow steering capability detection from the decision to activate. For the master (and for native), detect the flow steering capability in mlx4_dev_cap, but activate the appropriate steering type in a new function choose_flow_steering() based on detected data. For VFs, activate flow steering based on what was actually activated by the master, where that info is obtained via QUERY_HCA. This fixes the current VF detection which is wrongly based on QUERY_DEV_CAP. Also, for SR-IOV mode, if flow steering may be activated, do so only if the max number of QPs per rule is sufficient to satisfy one subscription per VF. If not, fall back to B0 mode. This is needed to serve registrations done by L2 network drivers such as mlx4_en and IPoIB when the network stack attempts to join to multicast groups such as all-hosts or the IPoIB broadcast group. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-12-19 11:47:14 -08:00
Hadar Hen Zion	2065b38bc2	mlx4_core: Fix error flow in the flow steering wrapper The error flow of the flow steering wrapper had a typo which caused the wrong firmware command to be called, fix it. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-12-19 09:44:00 -08:00
Hadar Hen Zion	a9c01e7ac1	mlx4_core: Add QPN enforcement for flow steering rules set by VFs Since VFs may be mapped to VMs which aren't trusted entities, flow steering rules attached through the wrapper on behalf of VFs must be checked to make sure that the specified QP number is assigned to that VF. Also, make sure to keep the QP busy till the end of the operation from the resource tracker point of view. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-12-19 09:43:59 -08:00
Linus Torvalds	f132c54e3a	First batch of InfiniBand/RDMA changes for the 3.8 merge window: - A good chunk of Bart Van Assche's SRP fixes - UAPI disintegration from David Howells - mlx4 support for "64-byte CQE" hardware feature from Or Gerlitz - Other miscellaneous fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABCAAGBQJQxstjAAoJEENa44ZhAt0hURUQAJd7HumReKTdRqzIzXPc+rgl pRR5eqplPY2anfJMqLDiFphVjfCiKyhudomdo+RUbBFFnUVLlBzk80A0/IZ3g3PZ MHOT+pX4PGDd+3FQxV2AaQCMwgGbvC0haInXyQDVZGm0fbMjRd699yGVWBiA8rOI VNhUi5WMmynSINYokM8UxrhfoUfy3QxsOvZBZ3XUD1zjJB0IMd5HRdiDUG7ur0q+ rfpWKv51DXT81ux36MXbdPBhLRbzx4B7EwuPWOFPqJe1KwK2cD8iA6DwEKC9KMxS Kj2+CxB5Bfpfz8bhLi2VZcMgAKiSIQDXUtiKz8h0yFVhvADYZLU7zdGN49mCqKcY 9dwX8+0aIVez6WB2jH+ir2FSG65NsnvqESwQ4LLQ9bhArgf9fapVGlypHwcKi5hh 3j2ipO/RyT56nLQeI0gz1P5mQneFSWlY96CD8WP+9OxO/mVnxViajzevSwT/cLE6 IOMks8DPhsQK88JXSx0XKVxn3zrJ9SXbYDhRWJ6f4w/fxraRXlFdQi0UfcsAajkX 5qmM4e8Oy97TJYiY1RkAmb7aV182xMWVjtDx2FFTQ5ukgDea/DklIM/JNQ475027 N7zMW1tP6+gnnDyMEkteVuPdbl1fzwI3RdXCh0mFZHZ5tvegkdxbw0XxERcevnQN LZfME8wCuC7+RtmE38Li =TQK2 -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband upate from Roland Dreier: "First batch of InfiniBand/RDMA changes for the 3.8 merge window: - A good chunk of Bart Van Assche's SRP fixes - UAPI disintegration from David Howells - mlx4 support for "64-byte CQE" hardware feature from Or Gerlitz - Other miscellaneous fixes" Fix up trivial conflict in mellanox/mlx4 driver. * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (33 commits) RDMA/nes: Fix for crash when registering zero length MR for CQ RDMA/nes: Fix for terminate timer crash RDMA/nes: Fix for BUG_ON due to adding already-pending timer IB/srp: Allow SRP disconnect through sysfs srp_transport: Document sysfs attributes srp_transport: Simplify attribute initialization code srp_transport: Fix attribute registration IB/srp: Document sysfs attributes IB/srp: send disconnect request without waiting for CM timewait exit IB/srp: destroy and recreate QP and CQs when reconnecting IB/srp: Eliminate state SRP_TARGET_DEAD IB/srp: Introduce the helper function srp_remove_target() IB/srp: Suppress superfluous error messages IB/srp: Process all error completions IB/srp: Introduce srp_handle_qp_err() IB/srp: Simplify SCSI error handling IB/srp: Keep processing commands during host removal IB/srp: Eliminate state SRP_TARGET_CONNECTING IB/srp: Increase block layer timeout RDMA/cm: Change return value from find_gid_port() ...	2012-12-13 19:19:09 -08:00
Linus Torvalds	a2013a13e6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial branch from Jiri Kosina: "Usual stuff -- comment/printk typo fixes, documentation updates, dead code elimination." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) HOWTO: fix double words typo x86 mtrr: fix comment typo in mtrr_bp_init propagate name change to comments in kernel source doc: Update the name of profiling based on sysfs treewide: Fix typos in various drivers treewide: Fix typos in various Kconfig wireless: mwifiex: Fix typo in wireless/mwifiex driver messages: i2o: Fix typo in messages/i2o scripts/kernel-doc: check that non-void fcts describe their return value Kernel-doc: Convention: Use a "Return" section to describe return values radeon: Fix typo and copy/paste error in comments doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c various: Fix spelling of "asynchronous" in comments. Fix misspellings of "whether" in comments. eisa: Fix spelling of "asynchronous". various: Fix spelling of "registered" in comments. doc: fix quite a few typos within Documentation target: iscsi: fix comment typos in target/iscsi drivers treewide: fix typo of "suport" in various comments and Kconfig treewide: fix typo of "suppport" in various comments ...	2012-12-13 12:00:02 -08:00
Yan Burman	520dfe3a36	net/mlx4_en: Add support for destination MAC in steering rules Implement destination MAC rule extension for L3/L4 rules in flow steering. Usefull for vSwitch/macvlan configurations. Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-12 13:02:30 -05:00
Yan Burman	c402b9477b	net/mlx4_en: Use generic etherdevice.h functions. Get rid of full_mac, zero_mac in favour of is_zero_ether_addr and is_broadcast_ether_addr. Signed-off-by: Yan Burman <yanb@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-12 13:02:30 -05:00
Greg Kroah-Hartman	1dd06ae8db	drivers/net: fix up function prototypes after __dev* removals The __dev* removal patches for the network drivers ended up messing up the function prototypes for a bunch of drivers. This patch fixes all of them back up to be properly aligned. Bonus is that this almost removes 100 lines of code, always a nice surprise. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-07 14:22:22 -05:00
Bill Pemberton	f57e68488e	mlx4_core: remove __dev* attributes CONFIG_HOTPLUG is going away as an option. As result the __dev* markings will be going away. Remove use of __devinit, __devexit_p, __devinitdata, __devinitconst, and __devexit. Signed-off-by: Bill Pemberton <wfp5p@virginia.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-12-03 11:16:44 -08:00
Amir Vadai	d317966bd3	net/mlx4_en: Set number of rx/tx channels using ethtool Add support to changing number of rx/tx channels using ethtool ('ethtool -[lL]'). Where the number of tx channels specified in ethtool is the number of rings per user priority - not total number of tx rings. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-02 20:22:59 -05:00
Amir Vadai	79c54b6bbf	net/mlx4_en: Fix TX moderation info loss after set_ringparam is called We need to re-set tx moderation information after calling set_ringparam else default tx moderation will be used. Also avoid related code duplication, by putting it in a utility function. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-12-02 20:22:58 -05:00
Jack Morgenstein	311f813a2d	mlx4_core: Fix potential deadlock in mlx4_eq_int() The slave_state_lock spinlock is used in both interrupt context and process context, hence irq locking must be used. Found by lockdep. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-29 12:14:54 -08:00
David S. Miller	8a2cf062b2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-29 12:51:17 -05:00
Amir Vadai	29bb8f4a8d	net/mlx4_en: Can set maxrate only for TC0 Had a typo in memcpy. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-28 11:15:32 -05:00
Or Gerlitz	08ff32352d	mlx4: 64-byte CQE/EQE support ConnectX-3 devices can use either 64- or 32-byte completion queue entries (CQEs) and event queue entries (EQEs). Using 64-byte EQEs/CQEs performs better because each entry is aligned to a complete cacheline. This patch queries the HCA's capabilities, and if it supports 64-byte CQEs and EQES the driver will configure the HW to work in 64-byte mode. The 32-byte vs 64-byte mode is global per HCA and not per CQ or EQ. Since this mode is global, userspace (libmlx4) must be updated to work with the configured CQE size, and guests using SR-IOV virtual functions need to know both EQE and CQE size. In case one of the 64-byte CQE/EQE capabilities is activated, the patch makes sure that older guest drivers that use the QUERY_DEV_FUNC command (e.g as done in mlx4_core of Linux 3.3..3.6) will notice that they need an update to be able to work with the PPF. This is done by changing the returned pf_context_behaviour not to be zero any more. In case none of these capabilities is activated that value remains zero and older guest drivers can run OK. The SRIOV related flow is as follows 1. the PPF does the detection of the new capabilities using QUERY_DEV_CAP command. 2. the PPF activates the new capabilities using INIT_HCA. 3. the VF detects if the PPF activated the capabilities using QUERY_HCA, and if this is the case activates them for itself too. Note that the VF detects that it must be aware to the new PF behaviour using QUERY_FUNC_CAP. Steps 1 and 2 apply also for native mode. User space notification is done through a new field introduced in struct mlx4_ib_ucontext which holds device capabilities for which user space must take action. This changes the binary interface so the ABI towards libmlx4 exposed through uverbs is bumped from 3 to 4 but only when needed i.e. only when the driver does use 64-byte CQEs or future device capabilities which must be in sync by user space. This practice allows to work with unmodified libmlx4 on older devices (e.g A0, B0) which don't support 64-byte CQEs. In order to keep existing systems functional when they update to a newer kernel that contains these changes in VF and userspace ABI, a module parameter enable_64b_cqe_eqe must be set to enable 64-byte mode; the default is currently false. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-11-26 10:19:17 -08:00
Ben Hutchings	ff33c0e188	net: Remove bogus dependencies on INET Various drivers depend on INET because they used to select INET_LRO, but they have all been converted to use GRO which has no such dependency. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-19 19:13:59 -05:00
Ben Hutchings	f1d29a3fa6	mlx4_en: Remove remnants of LRO support Commit `fa37a9586f` ('mlx4_en: Moving to work with GRO') left behind the Kconfig depends/select, some dead code and comments referring to LRO. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-19 19:13:59 -05:00
Adam Buchbinder	b3834be5c4	various: Fix spelling of "asynchronous" in comments. "Asynchronous" is misspelled in some comments. No code changes. Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-11-19 14:32:13 +01:00
David S. Miller	d4185bbf62	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c Minor conflict between the BCM_CNIC define removal in net-next and a bug fix added to net. Based upon a conflict resolution patch posted by Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-10 18:32:51 -05:00
Eric Dumazet	ecfd2ce1a9	mlx4: change TX coalescing defaults mlx4 currently uses a too high tx coalescing setting, deferring TX completion interrupts by up to 128 us. With the recent skb_orphan() removal in commit `8112ec3b87`, performance of a single TCP flow is capped to ~4 Gbps, unless we increase tcp_limit_output_bytes. I suggest using 16 us instead of 128 us, allowing a finer control. Performance of a single TCP flow is restored to previous levels, while keeping TCP small queues fully enabled with default sysctl. This patch is also a BQL prereq. Reported-by: Vimalkumar <j.vimal@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yevgeny Petrilin <yevgenyp@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-07 15:30:19 -05:00
Linus Torvalds	e657e078d3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "This is what we usually expect at this stage of the game, lots of little things, mostly in drivers. With the occasional 'oops didn't mean to do that' kind of regressions in the core code." 1) Uninitialized data in __ip_vs_get_timeouts(), from Arnd Bergmann 2) Reject invalid ACK sequences in Fast Open sockets, from Jerry Chu. 3) Lost error code on return from _rtl_usb_receive(), from Christian Lamparter. 4) Fix reset resume on USB rt2x00, from Stanislaw Gruszka. 5) Release resources on error in pch_gbe driver, from Veaceslav Falico. 6) Default hop limit not set correctly in ip6_template_metrics[], fix from Li RongQing. 7) Gianfar PTP code requests wrong kind of resource during probe, fix from Wei Yang. 8) Fix VHOST net driver on big-endian, from Michael S Tsirkin. 9) Mallenox driver bug fixes from Jack Morgenstein, Or Gerlitz, Moni Shoua, Dotan Barak, and Uri Habusha. 10) usbnet leaks memory on TX path, fix from Hemant Kumar. 11) Use socket state test, rather than presence of FIN bit packet, to determine FIONREAD/SIOCINQ value. Fix from Eric Dumazet. 12) Fix cxgb4 build failure, from Vipul Pandya. 13) Provide a SYN_DATA_ACKED state to complement SYN_FASTOPEN in socket info dumps. From Yuchung Cheng. 14) Fix leak of security path in kfree_skb_partial(). Fix from Eric Dumazet. 15) Handle RX FIFO overflows more resiliently in pch_gbe driver, from Veaceslav Falico. 16) Fix MAINTAINERS file pattern for networking drivers, from Jean Delvare. 17) Add iPhone5 IDs to IPHETH driver, from Jay Purohit. 18) VLAN device type change restriction is too strict, and should not trigger for the automatically generated vlan0 device. Fix from Jiri Pirko. 19) Make PMTU/redirect flushing work properly again in ipv4, from Steffen Klassert. 20) Fix memory corruptions by using kfree_rcu() in netlink_release(). From Eric Dumazet. 21) More qmi_wwan device IDs, from Bjørn Mork. 22) Fix unintentional change of SNAT/DNAT hooks in generic NAT infrastructure, from Elison Niven. 23) Fix 3.6.x regression in xt_TEE netfilter module, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (57 commits) tilegx: fix some issues in the SW TSO support qmi_wwan/cdc_ether: move Novatel 551 and E362 to qmi_wwan net: usb: Fix memory leak on Tx data path net/mlx4_core: Unmap UAR also in the case of error flow net/mlx4_en: Don't use vlan tag value as an indication for vlan presence net/mlx4_en: Fix double-release-range in tx-rings bas_gigaset: fix pre_reset handling vhost: fix mergeable bufs on BE hosts gianfar_ptp: use iomem, not ioports resource tree in probe ipv6: Set default hoplimit as zero. NET_VENDOR_TI: make available for am33xx as well pch_gbe: fix error handling in pch_gbe_up() b43: Fix oops on unload when firmware not found mwifiex: clean up scan state on error mwifiex: return -EBUSY if specific scan request cannot be honored brcmfmac: fix potential NULL dereference Revert "ath9k_hw: Updated AR9003 tx gain table for 5GHz" ath9k_htc: Add PID/VID for a Ubiquiti WiFiStation rt2x00: usb: fix reset resume rtlwifi: pass rx setup error code to caller ...	2012-10-26 15:00:48 -07:00
Dotan Barak	bfc0d8c3de	net/mlx4_core: Unmap UAR also in the case of error flow If a failure takes place during the EQ creation, we need to unmap the UAR memory block too. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Uri Habusha <urih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-26 03:34:15 -04:00
Moni Shoua	2b39a06198	net/mlx4_en: Don't use vlan tag value as an indication for vlan presence The vlan tag can be zero. This is why it can't serve as an indication that packet requires VLAN header in the TX flow. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-26 03:34:15 -04:00
Jack Morgenstein	7208ca3007	net/mlx4_en: Fix double-release-range in tx-rings The QP range is reserved as a single block. However, when freeing the en resources, the tx-ring QPs are released both in mlx4_en_destroy_tx_ring (one at a time) and in mlx4_en_free_resources (as a block release). Fix by eliminating the one-at-a-time release in mlx4_en_destroy_tx_ring. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-26 03:34:15 -04:00
Dotan Barak	41929ed265	mlx4_core: Perform correct resource cleanup if mlx4_QUERY_ADAPTER() fails Fixed the resource cleanup to act correctly and prevent a kernel oops when mlx4_QUERY_ADAPTER() fails. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-23 09:03:37 -07:00
Or Gerlitz	3cf164c8de	mlx4_core: Remove annoying debug messages from SR-IOV flow These debug prints left behind by commits `c82e9aa0a8` ("mlx4_core: resource tracking for HCA resources used by guests"), `54679e1482` ("mlx4: Implement QP paravirtualization and maintain phys_pkey_cache for smp_snoop") and `993c401e20` ("mlx4_core: Add IB port-state machine and port mgmt event propagation") make it pretty hard to actually use the mlx4_core debug messages when running in SRIOV/IB mode -- for example, the module load sequence of a device with one VF yielded 631 debug prints, with 408 of them being from this set. Let's just remove them. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-23 09:03:30 -07:00
Jack Morgenstein	60396683fe	mlx4_core: Adjust flow steering attach wrapper so that IB works on SR-IOV VFs Currently, the InfiniBand stack does not support flow steering at the verbs level -- the only usage of flow steering in the IB driver is for L2 multicast attaches. We need to add the IB case to procedure mlx4_QP_FLOW_STEERING_ATTACH_wrapper() to allow IPoIB to work on VFs on ConnectX-3 when flow steering is enabled. Currently, the IB case in mlx4_QP_FLOW_STEERING_ATTACH_wrapper() is missing, so the procedure returns -EINVAL and IPoIB on VFs breaks. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-03 14:11:20 -07:00
Linus Torvalds	7a9a2970b5	First batch of InfiniBand/RDMA changes for the 3.7 merge window: - mlx4 IB support for SR-IOV - A couple of SRP initiator fixes - Batch of nes hardware driver fixes - Fix for long-standing use-after-free crash in IPoIB - Other miscellaneous fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABCAAGBQJQav4jAAoJEENa44ZhAt0hmL0QAJTuMdSOzYFd/NB38owJCNM2 kz/N1GlBm3z98fIlGo8u+lzgV2qxqZSAzsJsouMeK38KiAX3CL8HKe44A1QvTM6v dXTNL4JFX24/YF+nlmMY8Av518I9Mkte3BZCnpYkBjVFBWe0ePwoRC/btfBXPDIV 0snq4OtjoBAn00dOOyuZ5PoyY9xf0z4UB0Gple9sM4mzEb8wVWdNDDPOiuPJc6fA L+gk6HLkZDg54+QswafdKYwpeTq45wIKLmCdS3oUNmppMLVhZY8rECOwzSa+KiTr /Yo19n+zl+IBlvjQHhmUqGHvdD17PaGlr+TckAsQqmVfXUH5qqpEnkF8FoEK59c5 YA3lVU8Sj4BPhJ7qX54CuN3767mZizakkxCr9iPRzABFTgzWVgcSgCrE8jjx4i0h Pam+L5bmANFStgmGR8PmXiNgcrCUcEqYHsOWDDAnHa5ekb2nyv1JL1c18hlY9hC3 Xb1YTMZFwvofGza89hBu7oHrMbLOUc5kW2lBpvUn2nlyf3i0F8ISlVbVbNjFA54p 60/jHa2VOQ2CcJUJKnJOk4ajOOEfHnPtMn2q96XJ69Dp8+eSYEO/G+0i1OlChq4h ClnG0Yp+NkT1o8WXMd7guDR+RsXt+DXIij5TiUWRIqnIlopIsMTRhNH28tMu4jQL fgN5n987wru91ewdX4gW =PAcy -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband updates from Roland Dreier: "First batch of InfiniBand/RDMA changes for the 3.7 merge window: - mlx4 IB support for SR-IOV - A couple of SRP initiator fixes - Batch of nes hardware driver fixes - Fix for long-standing use-after-free crash in IPoIB - Other miscellaneous fixes" This merge also removes a new use of __cancel_delayed_work(), and replaces it with the regular cancel_delayed_work() that is now irq-safe thanks to the workqueue updates. That said, I suspect the sequence in question should probably use "mod_delayed_work()". I just did the minimal "don't use deprecated functions" fixup, though. * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (45 commits) IB/qib: Fix local access validation for user MRs mlx4_core: Disable SENSE_PORT for multifunction devices mlx4_core: Clean up enabling of SENSE_PORT for older (ConnectX-1/-2) HCAs mlx4_core: Stash PCI ID driver_data in mlx4_priv structure IB/srp: Avoid having aborted requests hang IB/srp: Fix use-after-free in srp_reset_req() IB/qib: Add a qib driver version RDMA/nes: Fix compilation error when nes_debug is enabled RDMA/nes: Print hardware resource type RDMA/nes: Fix for crash when TX checksum offload is off RDMA/nes: Cosmetic changes RDMA/nes: Fix for incorrect MSS when TSO is on RDMA/nes: Fix incorrect resolving of the loopback MAC address mlx4_core: Fix crash on uninitialized priv->cmd.slave_sem mlx4_core: Trivial cleanups to driver log messages mlx4_core: Trivial readability fix: "0X30" -> "0x30" IB/mlx4: Create paravirt contexts for VFs when master IB driver initializes mlx4: Modify proxy/tunnel QP mechanism so that guests do no calculations mlx4: Paravirtualize Node Guids for slaves mlx4: Activate SR-IOV mode for IB ...	2012-10-02 17:20:40 -07:00
Linus Torvalds	aecdc33e11	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking changes from David Miller: 1) GRE now works over ipv6, from Dmitry Kozlov. 2) Make SCTP more network namespace aware, from Eric Biederman. 3) TEAM driver now works with non-ethernet devices, from Jiri Pirko. 4) Make openvswitch network namespace aware, from Pravin B Shelar. 5) IPV6 NAT implementation, from Patrick McHardy. 6) Server side support for TCP Fast Open, from Jerry Chu and others. 7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel Borkmann. 8) Increate the loopback default MTU to 64K, from Eric Dumazet. 9) Use a per-task rather than per-socket page fragment allocator for outgoing networking traffic. This benefits processes that have very many mostly idle sockets, which is quite common. From Eric Dumazet. 10) Use up to 32K for page fragment allocations, with fallbacks to smaller sizes when higher order page allocations fail. Benefits are a) less segments for driver to process b) less calls to page allocator c) less waste of space. From Eric Dumazet. 11) Allow GRO to be used on GRE tunnels, from Eric Dumazet. 12) VXLAN device driver, one way to handle VLAN issues such as the limitation of 4096 VLAN IDs yet still have some level of isolation. From Stephen Hemminger. 13) As usual there is a large boatload of driver changes, with the scale perhaps tilted towards the wireless side this time around. Fix up various fairly trivial conflicts, mostly caused by the user namespace changes. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits) hyperv: Add buffer for extended info after the RNDIS response message. hyperv: Report actual status in receive completion packet hyperv: Remove extra allocated space for recv_pkt_list elements hyperv: Fix page buffer handling in rndis_filter_send_request() hyperv: Fix the missing return value in rndis_filter_set_packet_filter() hyperv: Fix the max_xfer_size in RNDIS initialization vxlan: put UDP socket in correct namespace vxlan: Depend on CONFIG_INET sfc: Fix the reported priorities of different filter types sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP sfc: Fix loopback self-test with separate_tx_channels=1 sfc: Fix MCDI structure field lookup sfc: Add parentheses around use of bitfield macro arguments sfc: Fix null function pointer in efx_sriov_channel_type vxlan: virtual extensible lan igmp: export symbol ip_mc_leave_group netlink: add attributes to fdb interface tg3: unconditionally select HWMON support when tg3 is enabled. Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT" gre: fix sparse warning ...	2012-10-02 13:38:27 -07:00
Linus Torvalds	033d9959ed	Merge branch 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue changes from Tejun Heo: "This is workqueue updates for v3.7-rc1. A lot of activities this round including considerable API and behavior cleanups. * delayed_work combines a timer and a work item. The handling of the timer part has always been a bit clunky leading to confusing cancelation API with weird corner-case behaviors. delayed_work is updated to use new IRQ safe timer and cancelation now works as expected. * Another deficiency of delayed_work was lack of the counterpart of mod_timer() which led to cancel+queue combinations or open-coded timer+work usages. mod_delayed_work[_on]() are added. These two delayed_work changes make delayed_work provide interface and behave like timer which is executed with process context. * A work item could be executed concurrently on multiple CPUs, which is rather unintuitive and made flush_work() behavior confusing and half-broken under certain circumstances. This problem doesn't exist for non-reentrant workqueues. While non-reentrancy check isn't free, the overhead is incurred only when a work item bounces across different CPUs and even in simulated pathological scenario the overhead isn't too high. All workqueues are made non-reentrant. This removes the distinction between flush_[delayed_]work() and flush_[delayed_]_work_sync(). The former is now as strong as the latter and the specified work item is guaranteed to have finished execution of any previous queueing on return. * In addition to the various bug fixes, Lai redid and simplified CPU hotplug handling significantly. * Joonsoo introduced system_highpri_wq and used it during CPU hotplug. There are two merge commits - one to pull in IRQ safe timer from tip/timers/core and the other to pull in CPU hotplug fixes from wq/for-3.6-fixes as Lai's hotplug restructuring depended on them." Fixed a number of trivial conflicts, but the more interesting conflicts were silent ones where the deprecated interfaces had been used by new code in the merge window, and thus didn't cause any real data conflicts. Tejun pointed out a few of them, I fixed a couple more. * 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (46 commits) workqueue: remove spurious WARN_ON_ONCE(in_irq()) from try_to_grab_pending() workqueue: use cwq_set_max_active() helper for workqueue_set_max_active() workqueue: introduce cwq_set_max_active() helper for thaw_workqueues() workqueue: remove @delayed from cwq_dec_nr_in_flight() workqueue: fix possible stall on try_to_grab_pending() of a delayed work item workqueue: use hotcpu_notifier() for workqueue_cpu_down_callback() workqueue: use __cpuinit instead of __devinit for cpu callbacks workqueue: rename manager_mutex to assoc_mutex workqueue: WORKER_REBIND is no longer necessary for idle rebinding workqueue: WORKER_REBIND is no longer necessary for busy rebinding workqueue: reimplement idle worker rebinding workqueue: deprecate __cancel_delayed_work() workqueue: reimplement cancel_delayed_work() using try_to_grab_pending() workqueue: use mod_delayed_work() instead of __cancel + queue workqueue: use irqsafe timer for delayed_work workqueue: clean up delayed_work initializers and add missing one workqueue: make deferrable delayed_work initializer names consistent workqueue: cosmetic whitespace updates for macro definitions workqueue: deprecate system_nrt[_freezable]_wq workqueue: deprecate flush[_delayed]_work_sync() ...	2012-10-02 09:54:49 -07:00
Roland Dreier	d172f5a4ab	Merge branches 'cma', 'cxgb4', 'ipoib', 'mlx4', 'mlx4-sriov', 'nes', 'qib' and 'srp' into for-linus	2012-10-02 07:43:59 -07:00
Eric Dumazet	8112ec3b87	mlx4: dont orphan skbs in mlx4_en_xmit() After commit `e22979d96a` (mlx4_en: Moving to Interrupts for TX completions) we no longer need to orphan skbs in mlx4_en_xmit() since skb wont stay a long time in TX ring before their release. Orphaning skbs in ndo_start_xmit() should be avoided as much as possible, since it breaks TCP Small Queue or other flow control mechanisms (per socket limits) Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-10-01 17:01:57 -04:00
Linus Torvalds	fdb2f9c2eb	PCI changes for the 3.7 merge window: Host bridge hotplug - Protect acpi_pci_drivers and acpi_pci_roots (Taku Izumi) - Clear host bridge resource info to avoid issue when releasing (Yinghai Lu) - Notify acpi_pci_drivers when hot-plugging host bridges (Jiang Liu) - Use standard list ops for acpi_pci_drivers (Jiang Liu) Device hotplug - Use pci_get_domain_bus_and_slot() to close hotplug races (Jiang Liu) - Remove fakephp driver (Bjorn Helgaas) - Fix VGA ref count in hotplug remove path (Yinghai Lu) - Allow acpiphp to handle PCIe ports without native hotplug (Jiang Liu) - Implement resume regardless of pciehp_force param (Oliver Neukum) - Make pci_fixup_irqs() work after init (Thierry Reding) Miscellaneous - Add pci_pcie_type(dev) and remove pci_dev.pcie_type (Yijing Wang) - Factor out PCI Express Capability accessors (Jiang Liu) - Add pcibios_window_alignment() so powerpc EEH can use generic resource assignment (Gavin Shan) - Make pci_error_handlers const (Stephen Hemminger) - Cleanup drivers/pci/remove.c (Bjorn Helgaas) - Improve Vendor-Specific Extended Capability support (Bjorn Helgaas) - Use standard list ops for bus->devices (Bjorn Helgaas) - Avoid kmalloc in pci_get_subsys() and pci_get_class() (Feng Tang) - Reassign invalid bus number ranges (Intel DP43BF workaround) (Yinghai Lu) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) iQIcBAABAgAGBQJQac4hAAoJEPGMOI97Hn6zjZYP/iaqU9kjmgTsBbSyzB4oApv/ RRxo3I+ad9GF6XlMQfVAtyx1pgCD1gdGAtoDgGSCTqgdYD3CO10AxKU+yleAk1wo dbMxLifJNTrT3G1mZ/NL16yEGhCwvhfwzRtB1VoZmCT4lSApO/7cJkXl2DzHfA/i pmltOOiQCN8kbUcJbVPtUyTVPi2zl/8bsyCyTkS7YG0VXeGRM+ZUvPWZJ7MnWYYB 5qoCdrw5ENCCiDQ9yw5SAfgL23b+0p6OI/x3Lkex0QQOWwSqGSiaHt4b7eitrC5b 2eAJg32f/AzZke1YbKLMfdsL0VJP3GAswhDVHlgmo63rZkOZChm+97dgZ35Mcv5v kEXkWyBb1xJ3t8rZir6Qer9Iv2wOB+MkZ5qtU/Vf+l0wLQLYTrRVsKngrEDREONk dXbokp6iVSPeA1sTSdH9MmHlTUIj82ZLSGcxcjTsN8NWZjxx6g3rNx1uay+5MYOW 4ET9zNu5snrAqN6N4Tb81gvtG8qYfxzdvVfrA9AaGKI6xxB7jkqgFJRp55JiEcFc x4cmWkhvdlhVsG2TQwFxYNfswOqD+7NCs6V4kSVZX6ezpDrH7I5VvcnnhstF7C8l KZul0EV7OW+kDK23pNe24lVP2xtOv6G8eK/3PmeKIXWl1V83nqre/oLufRzTfs+Z SxkILwY/MFpuCFteKE1t =haBu -----END PGP SIGNATURE----- Merge tag 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI changes from Bjorn Helgaas: "Host bridge hotplug - Protect acpi_pci_drivers and acpi_pci_roots (Taku Izumi) - Clear host bridge resource info to avoid issue when releasing (Yinghai Lu) - Notify acpi_pci_drivers when hot-plugging host bridges (Jiang Liu) - Use standard list ops for acpi_pci_drivers (Jiang Liu) Device hotplug - Use pci_get_domain_bus_and_slot() to close hotplug races (Jiang Liu) - Remove fakephp driver (Bjorn Helgaas) - Fix VGA ref count in hotplug remove path (Yinghai Lu) - Allow acpiphp to handle PCIe ports without native hotplug (Jiang Liu) - Implement resume regardless of pciehp_force param (Oliver Neukum) - Make pci_fixup_irqs() work after init (Thierry Reding) Miscellaneous - Add pci_pcie_type(dev) and remove pci_dev.pcie_type (Yijing Wang) - Factor out PCI Express Capability accessors (Jiang Liu) - Add pcibios_window_alignment() so powerpc EEH can use generic resource assignment (Gavin Shan) - Make pci_error_handlers const (Stephen Hemminger) - Cleanup drivers/pci/remove.c (Bjorn Helgaas) - Improve Vendor-Specific Extended Capability support (Bjorn Helgaas) - Use standard list ops for bus->devices (Bjorn Helgaas) - Avoid kmalloc in pci_get_subsys() and pci_get_class() (Feng Tang) - Reassign invalid bus number ranges (Intel DP43BF workaround) (Yinghai Lu)" * tag 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (102 commits) PCI: acpiphp: Handle PCIe ports without native hotplug capability PCI/ACPI: Use acpi_driver_data() rather than searching acpi_pci_roots PCI/ACPI: Protect acpi_pci_roots list with mutex PCI/ACPI: Use acpi_pci_root info rather than looking it up again PCI/ACPI: Pass acpi_pci_root to acpi_pci_drivers' add/remove interface PCI/ACPI: Protect acpi_pci_drivers list with mutex PCI/ACPI: Notify acpi_pci_drivers when hot-plugging PCI root bridges PCI/ACPI: Use normal list for struct acpi_pci_driver PCI/ACPI: Use DEVICE_ACPI_HANDLE rather than searching acpi_pci_roots PCI: Fix default vga ref_count ia64/PCI: Clear host bridge aperture struct resource x86/PCI: Clear host bridge aperture struct resource PCI: Stop all children first, before removing all children Revert "PCI: Use hotplug-safe pci_get_domain_bus_and_slot()" PCI: Provide a default pcibios_update_irq() PCI: Discard __init annotations for pci_fixup_irqs() and related functions PCI: Use correct type when freeing bus resource list PCI: Check P2P bridge for invalid secondary/subordinate range PCI: Convert "new_id"/"remove_id" into generic pci_bus driver attributes xen-pcifront: Use hotplug-safe pci_get_domain_bus_and_slot() ...	2012-10-01 12:05:36 -07:00
Roland Dreier	aadf4f3f66	mlx4_core: Disable SENSE_PORT for multifunction devices In the current driver, the SENSE_PORT firmware command is issued as a "wrapped" command, but the command handling code doesn't have a wrapper, so it will never do anything other than log an error message. The latest ConnectX-3 2.11.500 firmware reports the SENSE_PORT capability even in multi-function (SR-IOV) mode, so the driver will try to issue the command. At least until the driver has a proper wrapper for SENSE_PORT, make sure we disable the command for multi-function devices. Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-01 02:10:45 -07:00
Roland Dreier	ca3e57a599	mlx4_core: Clean up enabling of SENSE_PORT for older (ConnectX-1/-2) HCAs Instead of having a hard-coded "PCI device ID != 0x1003" (which obviously breaks as newer devices with ID != 0x1003 become available), instead let's set a flag in our PCI device table for the older devices where we're supposed to force using SENSE_PORT. This also avoids enabling SENSE_PORT for virtual functions by mistake. Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-01 02:10:44 -07:00
Roland Dreier	839f12434c	mlx4_core: Stash PCI ID driver_data in mlx4_priv structure That way we can check flags later on, when we've finished with the pci_device_id structure. Also convert the "is VF" flag to an enum: "Never do in the preprocessor what can be done in C." Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-10-01 02:10:39 -07:00
Roland Dreier	f3d4c89ee4	mlx4_core: Fix crash on uninitialized priv->cmd.slave_sem On an SR-IOV master device, __mlx4_init_one() calls mlx4_init_hca() before mlx4_multi_func_init(). However, for unlucky configurations, mlx4_init_hca() might call mlx4_SENSE_PORT() (via mlx4_dev_cap()), and that calls mlx4_cmd_imm() with MLX4_CMD_WRAPPED set. However, on a multifunction device with MLX4_CMD_WRAPPED, __mlx4_cmd() calls into mlx4_slave_cmd(), and that immediately tries to do down(&priv->cmd.slave_sem); but priv->cmd.slave_sem isn't initialized until mlx4_multi_func_init() (which we haven't called yet). The next thing it tries to do is access priv->mfunc.vhcr, but that hasn't been allocated yet. Fix this by moving the initialization of slave_sem and vhcr up into mlx4_cmd_init(). Also, since slave_sem is really just being used as a mutex, convert it into a slave_cmd_mutex. Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:46 -07:00
Roland Dreier	84b1f1538f	mlx4_core: Trivial cleanups to driver log messages Also put format string onto one line. Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:46 -07:00
Roland Dreier	69612b9ff4	mlx4_core: Trivial readability fix: "0X30" -> "0x30" Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:45 -07:00
Jack Morgenstein	47605df953	mlx4: Modify proxy/tunnel QP mechanism so that guests do no calculations Previously, the structure of a guest's proxy QPs followed the structure of the PPF special qps (qp0 port 1, qp0 port 2, qp1 port 1, qp1 port 2, ...). The guest then did offset calculations on the sqp_base qp number that the PPF passed to it in QUERY_FUNC_CAP(). This is now changed so that the guest does no offset calculations regarding proxy or tunnel QPs to use. This change frees the PPF from needing to adhere to a specific order in allocating proxy and tunnel QPs. Now QUERY_FUNC_CAP provides each port individually with its proxy qp0, proxy qp1, tunnel qp0, and tunnel qp1 QP numbers, and these are used directly where required (with no offset calculations). To accomplish this change, several fields were added to the phys_caps structure for use by the PPF and by non-SR-IOV mode: base_sqpn -- in non-sriov mode, this was formerly sqp_start. base_proxy_sqpn -- the first physical proxy qp number -- used by PPF base_tunnel_sqpn -- the first physical tunnel qp number -- used by PPF. The current code in the PPF still adheres to the previous layout of sqps, proxy-sqps and tunnel-sqps. However, the PPF can change this layout without affecting VF or (paravirtualized) PF code. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:43 -07:00
Jack Morgenstein	afa8fd1db9	mlx4: Paravirtualize Node Guids for slaves This is necessary in order to support > 1 VF/PF in a VM for software that uses the node guid as a discriminator, such as librdmacm. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:43 -07:00
Jack Morgenstein	026149cbaa	mlx4: Activate SR-IOV mode for IB Remove the error returns for IB ports from mlx4_ib_add, mlx4_INIT_PORT_wrapper, and mlx4_CLOSE_PORT_wrapper. Currently, SRIOV is supported only for devices for which the link layer is IB on all ports; RoCE support will be added later. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:42 -07:00
Jack Morgenstein	992e8e6e87	IB/mlx4: Miscellaneous adjustments for SR-IOV IB support 1. Allow only master to change node description. 2. Prevent AH leakage in send mads. 3. Take device part number from PCI structure, so that guests see the VF part number (and not the PF part number). 4. Place the device revision ID into caps structure at startup. 5. SET_PORT in update_gids_task needs to go through wrapper on master. 6. In mlx4_ib_event(), PORT_MGMT_EVENT needs be handled in a work queue on the master, since it propagates events to slaves using GEN_EQE. 7. Do not support FMR on slaves. 8. Add spinlock to slave_event(), since it is called both in interrupt context and in process context (due to 6 above, and also if smp_snoop is used). This fix was found and implemented by Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:41 -07:00
Jack Morgenstein	980e90010f	mlx4_core: INIT/CLOSE port logic for IB ports in SR-IOV mode Normally, INIT_PORT and CLOSE_PORT are invoked when special QP0 transitions to RTR, or transitions to ERR/RESET respectively. In SR-IOV mode, however, the master is also paravirtualized. This in turn requires that we not do INIT_PORT until the entire QP0 path (real QP0 and proxy QP0) is ready to receive. When the real QP0 goes down, we should indicate that the port is not active. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:40 -07:00
Jack Morgenstein	efcd235d73	net/mlx4_core: Adjustments to SET_PORT for IB SR-IOV 1. Slaves may not set the IS_SM capability for the port. 2. DEV_MGMT may not be set in multifunction mode. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:40 -07:00
Jack Morgenstein	2a4fae148c	IB/mlx4: Propagate P_Key and guid change port management events to slaves P_Key change and guid change events are not of interest to all slaves, but only to those slaves which "see" the table slots whose contents have change. For example, if the guid at port 1, index 5 has changed in the PPF, we wish to propagate the gid-change event only to the function which has that guid index mapped to its port/guid table (in this case it is slave #5). Other functions should not get the event, since the event does not affect them. Similarly with P_Keys -- P_Key change events are forwarded only to slaves which have that P_Key index mapped to their virtual P_Key table. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:38 -07:00
Jack Morgenstein	a0c64a17ab	mlx4: Add alias_guid mechanism For IB ports, we paravirtualize the GUID at index 0 on slaves. The GUID at index 0 seen by a slave is the actual GUID occupying the GUID table at the slave-id index. The driver, by default, requests at startup time that subnet manager populate its entire guid table with GUIDs. These guids are then mapped (paravirtualized) to the slaves, and appear for each slave as its GUID at index 0. Until each slave has such a guid, its port status is DOWN. The guid table is cached to support special QP paravirtualization, and event propagation to slaves on guid change (we test to see if the guid really changed before propagating an event to the slave). To support this caching, add capability to __mlx4_ib_query_gid() to obtain the network view (i.e., physical view) gid at index X, not just the host (paravirtualized) view. Based on a patch from Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:37 -07:00
Jack Morgenstein	993c401e20	mlx4_core: Add IB port-state machine and port mgmt event propagation For an IB port, a slave should not show port active until that slave has a valid alias-guid (provided by the subnet manager). Therefore the port-up event should be passed to a slave only after both the port is up, and the slave's alias-guid has been set. Also, provide the infrastructure for propagating port-management events (client-reregister, etc) to slaves. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:37 -07:00
Jack Morgenstein	0a9a01884d	mlx4: MAD_IFC paravirtualization The MAD_IFC firmware command fulfills two functions. First, it is used in the QP0/QP1 MAD-handling flow to obtain information from the FW (for answering queries), and for setting variables in the HCA (MAD SET packets). For this, MAD_IFC should provide the FW (physical) view of the data. This is the view that OpenSM needs. We call this the "network view". In the second case, MAD_IFC is used by various verbs to obtain data regarding the local HCA (e.g., ib_query_device()). We call this the "host view". This data needs to be paravirtualized. MAD_IFC therefore needs a wrapper function, and also needs another flag indicating whether it should provide the network view (when it is called by ib_process_mad in special-qp packet handling), or the host view (when it is called while implementing a verb). There are currently 2 flag parameters in mlx4_MAD_IFC already: ignore_bkey and ignore_mkey. These two parameters are replaced by a single "mad_ifc_flags" parameter, with different bits set for each flag. A third flag is added: "network-view/host-view". Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:34 -07:00
Jack Morgenstein	54679e1482	mlx4: Implement QP paravirtualization and maintain phys_pkey_cache for smp_snoop This requires: 1. Replacing the paravirtualized P_Key index (inserted by the guest) with the real P_Key index. 2. For UD QPs, placing the guest's true source GID index in the address path structure mgid field, and setting the ud_force_mgid bit so that the mgid is taken from the QP context and not from the WQE when posting sends. 3. For UC and RC QPs, placing the guest's true source GID index in the address path structure mgid field. 4. For tunnel and proxy QPs, setting the Q_Key value reserved for that proxy/tunnel pair. Since not all the above adjustments occur in all the QP transitions, the QP transitions require separate wrapper functions. Secondly, initialize the P_Key virtualization table to its default values: Master virtualized table is 1-1 with the real P_Key table, guest virtualized table has P_Key index 0 mapped to the real P_Key index 0, and all the other P_Key indices mapped to the reserved (invalid) P_Key at index 127. Finally, add logic in smp_snoop for maintaining the phys_P_Key_cache. and generating events on the master only if a P_Key actually changed. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:33 -07:00
Jack Morgenstein	fc06573dfa	IB/mlx4: Initialize SR-IOV IB support for slaves in master context Allocate SR-IOV paravirtualization resources and MAD demuxing contexts on the master. This has two parts. The first part is to initialize the structures to contain the contexts. This is done at master startup time in mlx4_ib_init_sriov(). The second part is to actually create the tunneling resources required on the master to support a slave. This is performed the master detects that a slave has started up (MLX4_DEV_EVENT_SLAVE_INIT event generated when a slave initializes its comm channel). For the master, there is no such startup event, so it creates its own tunneling resources when it starts up. In addition, the master also creates the real special QPs. The ib_core layer on the master causes creation of proxy special QPs, since the master is also paravirtualized at the ib_core layer. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:32 -07:00
Jack Morgenstein	e2c76824ca	mlx4_core: Add proxy and tunnel QPs to the reserved QP area In addition, pass the proxy and tunnel QP numbers to slaves so the driver can perform special QP paravirtualization. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:31 -07:00
Or Gerlitz	a084feebd2	mlx4_core: Remove annoying debug message in the resource tracker This innocent print makes it very hard to actually use the mlx4_core debug messages -- for example, the module load sequence of a device with two VFs yielded 3200 debug prints, with 2800 of them being this one. Let's just remove it. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:09 -07:00
Dotan Barak	426dd00d76	mlx4_core: Fix wrong offset in parsing query device caps response The wrong offset was used when parsing the number of XRCs in mlx4_QUERY_DEV_CAP(). Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-30 20:33:08 -07:00
Linus Torvalds	2ade0b7f79	Last set of InfiniBand/RDMA fixes for 3.6: - Couple more IPoIB fixes for regressions introduced by path database conversion - Minor other fixes to low-level drivers (cxgb4, mlx4, qib, ocrdma) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABCAAGBQJQV0eiAAoJEENa44ZhAt0hTScP/3+Y8v4w3Nyop7iNj9uvnsCn PhgfY9XUp9d0zOHM7XK+pTGLJheLLiyXfcCFTwgoC6tIyj7J1ulhYv+RBr7GbS+e K7aG7QtLDyM17ETgOBVIAvkz1CFKwdI843xRQKo0/oVMQJ9XLHg0nvzd+XaxZuud C+7Lfl5RH9Bl1tHJuOZmdyJinSQyLO/uV+tqwL5upF5YMTCJ2XsmVa1HQKhOAkBE bH4zXtFViNEKVuKXkpGc1yJFKZiwysirluYXedDkkEdr2jf9Mysw5JvZKVUOCh+4 qYMjEAWQaUBia/tFUs9k6iCwE8ws7tRC4OSisb3cVzCUaqkfmx1Sp7vAQcDqx4xB r2dy4oc+vEFdrmjerOVEgmdgTwnDrWMjUlPuGyRX9935I7ybKA7NQPGlImKU8oiq 8vyptsTN6r7AYc7kw6aYaE9kqGPnST7I3cZg71lHfwCuHKMNYfkxd1Mqxx/W6rMZ G1VnstVdioQK2HvcFIeArXQ6H6QjUAY92WMoyhIsrU4KwEx2Zpr0pfKWCD0eIOuB f4s9v0f5/6DqSwJ9uhhpAW/5bmV+RUCDjm8Ep9i70bGk7SliwT0MNPhHbaMfm7aU c+awy3HWpWKGCtod9zUzJH/4NsHzpggdHUlyAOuU5nqsh9GarYb0cvLjTNfsfdyV haQtMR+fSUK7Durd9MQl =nqvR -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull InfiniBand/RDMA fixes from Roland Dreier: - A couple more IPoIB fixes for regressions introduced by path database conversion - Minor other fixes to low-level drivers (cxgb4, mlx4, qib, ocrdma) * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/qib: Fix failure of compliance test C14-024#06_LocalPortNum RDMA/ocrdma: Fix CQE expansion of unsignaled WQE mlx4_core: Fix integer overflows so 8TBs of memory registration works IPoIB: Fix AB-BA deadlock when deleting neighbours IPoIB: Fix memory leak in the neigh table deletion flow RDMA/cxgb4: Move dereference below NULL test	2012-09-17 13:21:02 -07:00
Yishai Hadas	dd03e73481	mlx4_core: Fix integer overflows so 8TBs of memory registration works This patch adds on the fixes done in commits `89dd86db78` ("mlx4_core: Allow large mlx4_buddy bitmaps") and `3de819e6b6` ("mlx4_core: Fix integer overflow issues around MTT table") so that memory registration of up to 8TB (log_num_mtt=31) finally works. It fixes integer overflows in a few mlx4_table_yyy routines in icm.c by using a u64 intermediate variable, and int/uint issues that caused table indexes to become nagive by setting some variables to be u32 instead of int. These problems cause crashes when a user attempted to register > 512GB of RAM. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-09-13 17:52:02 -07:00
Bjorn Helgaas	78890b5989	Merge commit 'v3.6-rc5' into next * commit 'v3.6-rc5': (1098 commits) Linux 3.6-rc5 HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured Remove user-triggerable BUG from mpol_to_str xen/pciback: Fix proper FLR steps. uml: fix compile error in deliver_alarm() dj: memory scribble in logi_dj Fix order of arguments to compat_put_time[spec\|val] xen: Use correct masking in xen_swiotlb_alloc_coherent. xen: fix logical error in tlb flushing xen/p2m: Fix one-off error in checking the P2M tree directory. powerpc: Don't use __put_user() in patch_instruction powerpc: Make sure IPI handlers see data written by IPI senders powerpc: Restore correct DSCR in context switch powerpc: Fix DSCR inheritance in copy_thread() powerpc: Keep thread.dscr and thread.dscr_inherit in sync powerpc: Update DSCR on all CPUs when writing sysfs dscr_default powerpc/powernv: Always go into nap mode when CPU is offline powerpc: Give hypervisor decrementer interrupts their own handler powerpc/vphn: Fix arch_update_cpu_topology() return value ARM: gemini: fix the gemini build ... Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c drivers/rapidio/devices/tsi721.c	2012-09-13 08:41:01 -06:00
Bjorn Helgaas	1959ec5f82	Merge branch 'pci/stephen-const' into next * pci/stephen-const: make drivers with pci error handlers const scsi: make pci error handlers const netdev: make pci_error_handlers const PCI: Make pci_error_handlers const	2012-09-12 13:54:10 -06:00
Stephen Hemminger	3646f0e5c9	netdev: make pci_error_handlers const Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-09-07 16:35:00 -06:00
Eugenia Emantayev	521130d11f	net/mlx4_core: Return the error value in case of command initialization failure If mlx4_cmd_init() failed, the init_one function returned success, although no resources were opened. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 12:55:59 -04:00
Aviad Yehezkel	bef772eb06	net/mlx4_core: Fixing error flow in case of QUERY_FW failure The order of operations was wrong on the teardown flow. Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 12:55:59 -04:00
Aviad Yehezkel	60d31c1475	net/mlx4_core: Looking for promiscuous entries on the correct port The search for promisc entries was always done on the first port, While the addition is done on the correct port. This lead to resource leackage of promisc entries on the second port and brought to a state where we could no longer enter to promiscuous mode after enough iterations of "ifconfig promisc" on the second port. Fix that by using the correct port when searching. Reported-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 12:55:59 -04:00
Hadar Hen Zion	7fb40f87c4	net/mlx4_core: Add security check / enforcement for flow steering rules set for VMs Since VFs may be mapped to VMs which aren't trusted entities, flow steering rules attached through the wrapper on behalf of VFs must be checked to make sure that their L2 specification relate to MAC address assigned to that VF, and add L2 specification if its missing. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 12:55:59 -04:00
Hadar Hen Zion	a8edc3bf05	net/mlx4_core: Put Firmware flow steering structures in common header files To allow for usage of the flow steering Firmware structures in more locations over the driver, such as the resource tracker, move them from mcg.c to common header files. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-07 12:55:59 -04:00
Jiang Liu	fadd1daa0b	mlx4: Use PCI Express Capability accessors Use PCI Express Capability access functions to simplify mlx4 driver. Signed-off-by: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-08-23 10:11:13 -06:00
Linus Torvalds	8f8ba75ee2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking update from David Miller: "A couple weeks of bug fixing in there. The largest chunk is all the broken crap Amerigo Wang found in the netpoll layer." 1) netpoll and it's users has several serious bugs: a) uses GFP_KERNEL with locks held b) interfaces requiring interrupts disabled are called with them enabled c) and vice versa d) VLAN tag demuxing, as per all other RX packet input paths, is not applied All from Amerigo Wang. 2) Hopefully cure the ipv4 mapped ipv6 address TCP early demux bugs for good, from Neal Cardwell. 3) Unlike AF_UNIX, AF_PACKET sockets don't set a default credentials when the user doesn't specify one explicitly during sendmsg(). Instead we attach an empty (zero) SCM credential block which is definitely not what we want. Fix from Eric Dumazet. 4) IPv6 illegally invokes netdevice notifiers with RCU lock held, fix from Ben Hutchings. 5) inet_csk_route_child_sock() checks wrong inet options pointer, fix from Christoph Paasch. 6) When AF_PACKET is used for transmit, packet loopback doesn't behave properly when a socket fanout is enabled, from Eric Leblond. 7) On bluetooth l2cap channel create failure, we leak the socket, from Jaganath Kanakkassery. 8) Fix all the netprio file handling bugs found by Al Viro, from John Fastabend. 9) Several error return and NULL deref bug fixes in networking drivers from Julia Lawall. 10) A large smattering of struct padding et al. kernel memory leaks to userspace found of Mathias Krause. 11) Conntrack expections in netfilter can access an uninitialized timer, fix from Pablo Neira Ayuso. 12) Several netfilter SIP tracker bug fixes from Patrick McHardy. 13) IPSEC ipv6 routes are not initialized correctly all the time, resulting in an OOPS in inet_putpeer(). Also from Patrick McHardy. 14) Bridging does rcu_dereference() outside of RCU protected area, from Stephen Hemminger. 15) Fix routing cache removal performance regression when looking up output routes that have a local destination. From Zheng Yan. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits) af_netlink: force credentials passing [CVE-2012-3520] ipv4: fix ip header ident selection in __ip_make_skb() ipv4: Use newinet->inet_opt in inet_csk_route_child_sock() tcp: fix possible socket refcount problem net: tcp: move sk_rx_dst_set call after tcp_create_openreq_child() net/core/dev.c: fix kernel-doc warning netconsole: remove a redundant netconsole_target_put() net: ipv6: fix oops in inet_putpeer() net/stmmac: fix issue of clk_get for Loongson1B. caif: Do not dereference NULL in chnl_recv_cb() af_packet: don't emit packet on orig fanout group drivers/net/irda: fix error return code drivers/net/wan/dscc4.c: fix error return code drivers/net/wimax/i2400m/fw.c: fix error return code smsc75xx: add missing entry to MAINTAINERS net: qmi_wwan: new devices: UML290 and K5006-Z net: sh_eth: Add eth support for R8A7779 device netdev/phy: skip disabled mdio-mux nodes dt: introduce for_each_available_child_of_node, of_get_next_available_child net: netprio: fix cgrp create and write priomap race ...	2012-08-21 16:46:08 -07:00
Tejun Heo	203b42f731	workqueue: make deferrable delayed_work initializer names consistent Initalizers for deferrable delayed_work are confused. * __DEFERRED_WORK_INITIALIZER() * DECLARE_DEFERRED_WORK() * INIT_DELAYED_WORK_DEFERRABLE() Rename them to * __DEFERRABLE_WORK_INITIALIZER() * DECLARE_DEFERRABLE_WORK() * INIT_DEFERRABLE_WORK() This patch doesn't cause any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org>	2012-08-21 13:18:23 -07:00
Linus Torvalds	846b99964a	Grab bag of InfiniBand/RDMA fixes: - IPoIB fixes for regressions introduced by path database conversion - mlx4 fixes for bugs with large memory systems and regressions from SR-IOV patches - RDMA CM fix for passing bad event up to userspace - Other minor fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABCAAGBQJQLo9cAAoJEENa44ZhAt0hPeUP/Ag1i164UE2e64sTfbVUfbEM 7a+acsBo2ByQhVaXd65PZqm6+aYjosSwkStl0u4cb9Q3qQvYiZeLJj/qmInVV1hv qHuNytbnXl6W2f6vyz391TKx+X6uastlH7O4f4v0zS3zkrlkN9kCrN5dI1Fxm9ba 5hluyggs7pVmHPX7KDGnQNZEjCsKoJF2kzB0hXwqGo1JxB2sEGq3DV/uMTHPgKKM HlrTJEEb7pOeHHf2Zt8MQiHMC0YBJ6MmpVIidR02TdQdCgZQcpoYnn20odmNLK5D HTT4jbCSsbOiHPpdAn0YkgzXzL58a2/b1Tt/pNq+Rud7VKwFVuFhNElPyAxR1/1K tPGUbDA6eOfosU++QbDJ0QHxDTgBTyXwh4yMkwPjmdWJ2jwTQev74iiDXgDf3QbS tMmWSwfC38hBxvfomNA6QTPvY1c3NNvj3uO+xq4/q6HloGqSlnIrPBL9hF0BiVQ6 KteZ6gF+R1rHEIQraEnbNmh5Rxvi2RdgN0Xa0U9w/yr0LCowmUbFK35XoyTFyo/8 rF7xp85s1wH5OuaA0NQeivQEKbuBxl7nvJlC2RBdRT/B9U5now+Y2XKOX0+FfRsg Qm0By/K8ss2XftX+lNfip+ScSmu9p+kCBME0X8Ra590VDt0IcPiJsC9ozhKyywRy KfSGj3iZG/nDaec6i1aN =hN/H -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband/rdma fixes from Roland Dreier: "Grab bag of InfiniBand/RDMA fixes: - IPoIB fixes for regressions introduced by path database conversion - mlx4 fixes for bugs with large memory systems and regressions from SR-IOV patches - RDMA CM fix for passing bad event up to userspace - Other minor fixes" * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/mlx4: Check iboe netdev pointer before dereferencing it mlx4_core: Clean up buddy bitmap allocation mlx4_core: Fix integer overflow issues around MTT table mlx4_core: Allow large mlx4_buddy bitmaps IB/srp: Fix a race condition IB/qib: Fix error return code in qib_init_7322_variables() IB: Fix typos in infiniband drivers IB/ipoib: Fix RCU pointer dereference of wrong object IB/ipoib: Add missing locking when CM object is deleted RDMA/ucma.c: Fix for events with wrong context on iWARP RDMA/ocrdma: Don't call vlan_dev_real_dev() for non-VLAN netdevs IB/mlx4: Fix possible deadlock on sm_lock spinlock	2012-08-17 11:45:58 -07:00
Roland Dreier	96f17d5900	mlx4_core: Clean up buddy bitmap allocation - Use kcalloc() / vzalloc() instead of an extra bitmap_zero(). - Add __GFP_NOWARN to kcalloc() since we'll try vzalloc() if it fails. Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-08-15 21:05:27 -07:00
Yishai Hadas	3de819e6b6	mlx4_core: Fix integer overflow issues around MTT table Fix some issues around int variables used in data structures related to memory registration. Handle int overflow in mlx4_init_icm_table by using a u64 intermediate variable and changing struct mlx4_icm_table num_obj field to be u32. Change some more fields/variables to use u32 instead of int to prevent a case where the variable becomes negative when bit 31 is set. Also subtract log_mtts_per_seg from the exponent when computing num_mtt, since its added later on in that very same code area. This and the previous commit fixes some issues which actually prevent commit `db5a7a65c0` ("mlx4_core: Scale size of MTT table with system RAM") from working. Now, when the number of MTTs is scaled with the size of the RAM we can map up to 8TB. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-08-15 21:05:26 -07:00
Yishai Hadas	89dd86db78	mlx4_core: Allow large mlx4_buddy bitmaps mlx4_buddy_init uses kmalloc() to allocate bitmaps, which fails when the required size is beyond the max supported value (or when memory is too fragmented to handle a huge allocation). Extend this to use use vmalloc() if kmalloc() fails, and take that into account when freeing the bitmaps as well. This fixes a driver load failure when log num mtt is 26 or higher, and is a step in the direction of allowing to register huge amounts of memory on large memory systems. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-08-15 21:05:20 -07:00
Julia Lawall	499b95f6b3	drivers/net/ethernet/mellanox/mlx4/mcg.c: fix error return code Convert a 0 error return code to a negative one, as returned elsewhere in the function. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ identifier ret; expression e,e1,e2,e3,e4,x; @@ ( if (\(ret != 0\\|ret < 0\) \|\| ...) { ... return ...; } \| ret = 0 ) ... when != ret = e1 x = \(kmalloc\\|kzalloc\\|kcalloc\\|devm_kzalloc\\|ioremap\\|ioremap_nocache\\|devm_ioremap\\|devm_ioremap_nocache\)(...); ... when != x = e2 when != ret = e3 if (x == NULL \|\| ...) { ... when != ret = e4 * return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-14 17:00:57 -07:00
Yevgeny Petrilin	2207b60ffb	net/mlx4_core: Remove port type restrictions Port1=Eth, Port2=IB restriction is no longer required. Having RoCE, there will always rdma port initialized over ConnectX physical port, no matter whether the link layer is IB or Ethernet. So we always have dual port IB device. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-03 16:49:40 -07:00
Yevgeny Petrilin	c18520bd1b	net/mlx4_en: Fixing TX queue stop/wake flow Removing the ring->blocked flag, it is redundant and leads to a race: We close the TX queue and then set the "blocked" flag. Between those 2 operations the completion function can check the "blocked" flag, sees that it is 0, and wouldn't open the TX queue. Using netif_tx_queue_stopped to check the state of the queue to avoid this race. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-03 16:49:02 -07:00
Amir Vadai	c8c40b7f32	net/mlx4_en: loopbacked packets are dropped when SMAC=DMAC Should NOT check SMAC=DMAC when: 1. loopback is turned on 2. validate_loopback is true. Fixed it accordingly. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-08-03 16:49:02 -07:00
Linus Torvalds	1e30c1b386	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking updates and fixes from David Miller: 1) Reinstate the no-ref optimization for input route lookups in ipv4 to fix some routing cache removal perf regressions. 2) Make TCP socket pre-demux work on ipv6 side too, from Eric Dumazet. 3) Get RX hash value from correct place in be2net driver, from Sarveshwar Bandi. 4) Validation of FIB cached routes missing critical check, from Eric Dumazet. 5) EEH support in mlx4 driver, from Kleber Sacilotto de Souza. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (23 commits) ipv6: Early TCP socket demux ipv4: Fix input route performance regression. pch_gbe: vlan skb len fix pch_gbe: add extra clean tx pch_gbe: fix transmit watchdog timeout ixgbe: fix panic while dumping packets on Tx hang with IOMMU be2net: Fix to parse RSS hash from Receive completions correctly. net/mlx4_en: Limit the RFS filter IDs to be < RPS_NO_FILTER hyperv: Add error handling to rndis_filter_device_add() hyperv: Add a check for ring_size value ipv4: rt_cache_valid must check expired routes net/pch_gpe: Cannot disable ethernet autonegation qeth: repair crash in qeth_l3_vlan_rx_kill_vid() netiucv: cleanup attribute usage net: wiznet add missing HAS_IOMEM dependency be2net: Missing byteswap in be_get_fw_log_level causes oops on PowerPC mlx4: Add support for EEH error recovery cdc-ncm: tag Ericsson WWAN devices (eg F5521gw) with FLAG_WWAN wanmain: comparing array with NULL caif: fix NULL pointer check ...	2012-07-26 18:09:01 -07:00
Amir Vadai	ee64c0ee51	net/mlx4_en: Limit the RFS filter IDs to be < RPS_NO_FILTER RFS filter id can't have the special value RPS_NO_FILTER, need to skip it when allocating id's. CC: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-26 00:23:55 -07:00
Kleber Sacilotto de Souza	57dbf29a54	mlx4: Add support for EEH error recovery Currently the mlx4 drivers don't have the necessary callbacks to implement EEH errors detection and recovery, so the PCI layer uses the probe and remove callbacks to try to recover the device after an error on the bus. However, these callbacks have race conditions with the internal catastrophic error recovery functions, which will also detect the error and this can cause the system to crash if both EEH and catas functions try to reset the device. This patch adds the necessary error recovery callbacks and makes sure that the internal catastrophic error functions will not try to reset the device in such scenarios. It also adds some calls to pci_channel_offline() to suppress reads/writes on the bus when the slot cannot accept I/O operations so we prevent unnecessary accesses to the bus and speed up the device removal. Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Acked-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-25 15:24:13 -07:00
Linus Torvalds	5dedb9f3bd	InfiniBand/RDMA changes for the 3.6 merge window: - Updates to the qib low-level driver - First chunk of changes for SR-IOV support for mlx4 IB - RDMA CM support for IPv6-only binding - Other misc cleanups and fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABCAAGBQJQDXhUAAoJEENa44ZhAt0hZxwQAJydS9f9me31AGa45SAq8rA8 6e3LgnQ6jS38he7LZZdSsT7g25jROCKOcTj6VUkNSVAVSkK8zcp7wngjDxw50IK6 FF9hNeljMkWBflOhxzB34DRGCW4b4J+Yt1o7v1RtawiG/Mhri/6imKL0Aqjt4GwX a1MPZn+xI2osLujfdHJtATPWWB9jCaXdFe4DJUNPdqJhS6TN7s8OP3XMiqoJtdaV ptHeSSbjdUR1mg/h2LU2FVmWHXNSBxn7MEsrBBRkQVyiEkXieFBLwPTp4DqmAUJf xugm6Hf4sbiQ+QuU0baJODt56wveuYVQ4HKUzE0urUFXyU4TUB9blrehWZnKsRte wo/w4nvVlnXgGhHH0Igq76RDX8aCwc/6uQJ/29oChWjrei3HE0LjmIlPAu0vAhyw ViLe02/r2gKXQv1NxIqhPmGsJTZizg2mUk2eEJHHPQb/NGL7iNo6b+141AfURoqQ goGmlGdzffCRpmo6FXFZ57RPVRS4gwMunCY/Pmvq5a2t4oZh8899l2+V3N7bCfvH +JdavxAjia9U4IlgPsAqVaz8z8TyHOY/lEd75Wnw8q9www3kLx7hASvHsdwrS4LL ihzECzsaXOSoIXQzghs+6iyp1pzmzE9ve8OqbIGJStzlrOLyn7Gjo+Ixwm0SLsTO I7h9iJ1OMKFZ8bzmHsWb =f09j -----END PGP SIGNATURE----- Merge tag 'rdma-for-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull InfiniBand/RDMA changes from Roland Dreier: - Updates to the qib low-level driver - First chunk of changes for SR-IOV support for mlx4 IB - RDMA CM support for IPv6-only binding - Other misc cleanups and fixes Fix up some add-add conflicts in include/linux/mlx4/device.h and drivers/net/ethernet/mellanox/mlx4/main.c * tag 'rdma-for-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (30 commits) IB/qib: checkpatch fixes IB/qib: Add congestion control agent implementation IB/qib: Reduce sdma_lock contention IB/qib: Fix an incorrect log message IB/qib: Fix QP RCU sparse warnings mlx4: Put physical GID and P_Key table sizes in mlx4_phys_caps struct and paravirtualize them mlx4_core: Allow guests to have IB ports mlx4_core: Implement mechanism for reserved Q_Keys net/mlx4_core: Free ICM table in case of error IB/cm: Destroy idr as part of the module init error flow mlx4_core: Remove double function declarations IB/mlx4: Fill the masked_atomic_cap attribute in query device IB/mthca: Fill in sq_sig_type in query QP IB/mthca: Warning about event for non-existent QPs should show event type IB/qib: Fix sparse RCU warnings in qib_keys.c net/mlx4_core: Initialize IB port capabilities for all slaves mlx4: Use port management change event instead of smp_snoop IB/qib: RCU locking for MR validation IB/qib: Avoid returning EBUSY from MR deregister IB/qib: Fix UC MR refs for immediate operations ...	2012-07-24 13:56:26 -07:00
Roland Dreier	089117e1ad	Merge branches 'cma', 'cxgb4', 'misc', 'mlx4-sriov', 'mlx-cleanups', 'ocrdma' and 'qib' into for-linus	2012-07-22 23:26:17 -07:00
Thadeu Lima de Souza Cascardo	4cce66cdd1	mlx4_en: map entire pages to increase throughput In its receive path, mlx4_en driver maps each page chunk that it pushes to the hardware and unmaps it when pushing it up the stack. This limits throughput to about 3Gbps on a Power7 8-core machine. One solution is to map the entire allocated page at once. However, this requires that we keep track of every page fragment we give to a descriptor. We also need to work with the discipline that all fragments will be released (in the sense that it will not be reused by the driver anymore) in the order they are allocated to the driver. This requires that we don't reuse any fragments, every single one of them must be reallocated. We do that by releasing all the fragments that are processed and only after finished processing the descriptors, we start the refill. We also must somehow guarantee that we either refill all fragments in a descriptor or none at all, without resorting to giving up a page fragment that we would have already given. Otherwise, we would break the discipline of only releasing the fragments in the order they were allocated. This has passed page allocation fault injections (restricted to the driver by using required-start and required-end) and device hotplug while 16 TCP streams were able to deliver more than 9Gbps. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 10:53:13 -07:00
Amir Vadai	1eb8c695bd	net/mlx4_en: Add accelerated RFS support Use RFS infrastructure and flow steering in HW to keep CPU affinity of rx interrupts and application per TCP stream. A flow steering filter is added to the HW whenever the RFS ndo callback is invoked by core networking code. Because the invocation takes place in interrupt context, the actual setup of HW is done using workqueue. Whenever new filter is added, the driver checks for expiry of existing filters. Since there's window in time between the point where the core RFS code invoked the ndo callback, to the point where the HW is configured from the workqueue context, the 2nd, 3rd etc packets from that stream will cause the net core to invoke the callback again and again. To prevent inefficient/double configuration of the HW, the filters are kept in a database which is indexed using hash function to enable fast access. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 08:34:37 -07:00
Amir Vadai	d9236c3f10	{NET,IB}/mlx4: Add rmap support to mlx4_assign_eq Enable callers of mlx4_assign_eq to supply a pointer to cpu_rmap. If supplied, the assigned IRQ is tracked using rmap infrastructure. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 08:34:37 -07:00
Amir Vadai	af22d9de45	net/mlx4: Move MAC_MASK to a common place Define this macro is one common place instead of duplicating it over the code Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 08:34:37 -07:00
Dan Carpenter	9c64508af2	net/mlx4_en: dereferencing freed memory We dereferenced "mclist" after the kfree(). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-16 22:58:07 -07:00
Dan Carpenter	447458c01f	net/mlx4: off by one in parse_trans_rule() This should be ">=" here instead of ">". MLX4_NET_TRANS_RULE_NUM is 6. We use "spec->id" as an array offset into the __rule_hw_sz[] and __sw_id_hw[] arrays which have 6 elements. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-16 22:57:43 -07:00
Jack Morgenstein	6634961c14	mlx4: Put physical GID and P_Key table sizes in mlx4_phys_caps struct and paravirtualize them To allow easy paravirtualization of P_Key and GID table sizes, keep paravirtualized sizes in mlx4_dev->caps, but save the actual physical sizes from FW in struct: mlx4_dev->phys_cap. In addition, in SR-IOV mode, do the following: 1. Reduce reported P_Key table size by 1. This is done to reserve the highest P_Key index for internal use, for declaring an invalid P_Key in P_Key paravirtualization. We require a P_Key index which always contain an invalid P_Key value for this purpose (i.e., one which cannot be modified by the subnet manager). The way to do this is to reduce the P_Key table size reported to the subnet manager by 1, so that it will not attempt to access the P_Key at index #127. 2. Paravirtualize the GID table size to 1. Thus, each guest sees only a single GID (at its paravirtualized index 0). In addition, since we are paravirtualizing the GID table size to 1, we add paravirtualization of the master GID event here (i.e., we do not do ib_dispatch_event() for the GUID change event on the master, since its (only) GUID never changes). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-07-11 11:52:23 -07:00
Jack Morgenstein	105c320f6a	mlx4_core: Allow guests to have IB ports Modify mlx4_dev_cap to allow IB support when SR-IOV is active. Modify mlx4_slave_cap to set the "rdma-supported" bit in its flags area, and pass that to the guests (this is done in QUERY_FUNC_CAP and its wrapper). However, we don't activate IB support quite yet -- we leave the error return at the start of mlx4_ib_add in the mlx4_ib driver. In addition, set "protected fmr supported" bit to zero in the QUERY_FUNC_CAP wrapper. Finally, in the QUERY_FUNC_CAP wrapper, we needed to add code which checks for the port type (IB or Ethernet). Previously, this was not an issue, since only Ethernet ports were supported. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-07-11 11:51:37 -07:00
Jack Morgenstein	396f2feb05	mlx4_core: Implement mechanism for reserved Q_Keys The SR-IOV special QP tunneling mechanism uses proxy special QPs (instead of the real special QPs) for MADs on guests. These proxy QPs send their packets to a "tunnel" QP owned by the master. The master then forwards the MAD (after any required paravirtualization) to the real special QP, which sends out the MAD. For security reasons (i.e., to prevent guests from sending MADs to tunnel QPs belonging to other guests), each proxy-tunnel QP pair is assigned a unique, reserved, Q_Key. These Q_Keys are available only for proxy and tunnel QPs -- if the guest tries to use these Q_Keys with other QPs, it will fail. This patch introduces a mechanism for reserving a block of 64K Q_Keys for proxy/tunneling use. The patch introduces also two new fields into mlx4_dev: base_sqpn and base_tunnel_sqpn. In SR-IOV mode, the QP numbers for the "real," proxy, and tunnel sqps are added to the reserved QPN area (so that they will not change). There are 8 special QPs per port in the HCA, and each of them is assigned both a proxy and a tunnel QP, for each VF and for the PF as well in SR-IOV mode. The QPNs for these QPs are arranged as follows: 1. The real SQP numbers (8) 2. The proxy SQPs (8 * (max number of VFs + max number of PFs) 3. The tunnel SQPs (8 * (max number of VFs + max number of PFs) To support these QPs, two new fields are added to struct mlx4_dev: base_sqp: this is the QP number of the first of the real SQPs base_tunnel_sqp: this is the qp number of the first qp in the tunnel sqp region. (On guests, this is the first tunnel sqp of the 8 which are assigned to that guest). In addition, in SR-IOV mode, sqp_start is the number of the first proxy SQP in the proxy SQP region. (In guests, this is the first proxy SQP of the 8 which are assigned to that guest) Note that in non-SR-IOV mode, there are no proxies and no tunnels. In this case, sqp_start is set to sqp_base -- which minimizes code changes. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-07-11 11:35:55 -07:00
Dotan Barak	240a9207aa	net/mlx4_core: Free ICM table in case of error In mlx4_init_icm_table(), free the allocated table if we failed to allocate memory to its entries. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-07-11 09:22:58 -07:00
Dotan Barak	f457ce471c	mlx4_core: Remove double function declarations Spotted four duplicate declarations in icm.h, remove them. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-07-11 09:22:58 -07:00
Jack Morgenstein	2aca1172c2	net/mlx4_core: Initialize IB port capabilities for all slaves With IB SR-IOV, each slave has its own separate copy of the port capabilities flags. For example, the master can run a subnet manager (which causes the IsSM bit to be set in the master's port capabilities) without affecting the port capabilities seen by the slaves (the IsSM bit will be seen as cleared in the slaves). Also add a static inline mlx4_master_func_num() to enhance readability of the code. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-07-10 09:57:06 -07:00
Jack Morgenstein	00f5ce99dc	mlx4: Use port management change event instead of smp_snoop The port management change event can replace smp_snoop. If the capability bit for this event is set in dev-caps, the event is used (by the driver setting the PORT_MNG_CHG_EVENT bit in the async event mask in the MAP_EQ fw command). In this case, when the driver passes incoming SMP PORT_INFO SET mads to the FW, the FW generates port management change events to signal any changes to the driver. If the FW generates these events, smp_snoop shouldn't be invoked in ib_process_mad(), or duplicate events will occur (once from the FW-generated event, and once from smp_snoop). In the case where the FW does not generate port management change events smp_snoop needs to be invoked to create these events. The flow in smp_snoop has been modified to make use of the same procedures as in the fw-generated-event event case to generate the port management events (LID change, Client-rereg, Pkey change, and/or GID change). Port management change event handling required changing the mlx4_ib_event and mlx4_dispatch_event prototypes; the "param" argument (last argument) had to be changed to unsigned long in order to accomodate passing the EQE pointer. We also needed to move the definition of struct mlx4_eqe from net/mlx4.h to file device.h -- to make it available to the IB driver, to handle port management change events. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-07-10 09:47:10 -07:00
Jack Morgenstein	752a50cab6	mlx4_core: Pass an invalid PCI id number to VFs Currently, VFs have 0 in their dev->caps.function field. This is a valid pci id (usually of the PF). Instead, pass an invalid PCI id to the VF via QUERY_FW, so that if the value gets accessed in the VF driver, we'll catch the problem. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-07-08 18:05:05 -07:00
Hadar Hen Zion	cabdc8ee37	net/mlx4_en: Add support for drop action through ethtool The drop action is implemented by allocating a QP and keeping it in a reset state such that the HW drops any packets which are steered to that QP. When a drop action is requested, we attach the relevant flow to that QP. Sign-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:06 -07:00
Hadar Hen Zion	820672812f	net/mlx4_en: Manage flow steering rules with ethtool Implement the ethtool APIs for attaching L2/L3/L4 based flow steering rules to the netdevice RX rings. Added set_rxnfc callback and enhanced the existing get_rxnfc callback. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:06 -07:00
Hadar Hen Zion	592e49dda8	net/mlx4: Implement promiscuous mode with device managed flow-steering The device managed flow steering API has three promiscuous modes: 1. Uplink - captures all the packets that arrive to the port. 2. Allmulti - captures all multicast packets arriving to the port. 3. Function port - for future use, this mode is not implemented yet. Use these modes with the flow_attach and flow_detach firmware commands according to the promiscuous state of the netdevice. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:06 -07:00
Hadar Hen Zion	1b9c6b064e	net/mlx4_core: Add resource tracking for device managed flow steering rules As with other device resources, the resource tracker is needed for supporting device managed flow steering rules under SRIOV: make sure virtual functions delete only rules created by them, and clean all rules attached by a crashed VF. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:06 -07:00
Hadar Hen Zion	0ff1fb654b	{NET, IB}/mlx4: Add device managed flow steering firmware API The driver is modified to support three operation modes. If supported by firmware use the device managed flow steering API, that which we call device managed steering mode. Else, if the firmware supports the B0 steering mode use it, and finally, if none of the above, use the A0 steering mode. When the steering mode is device managed, the code is modified such that L2 based rules set by the mlx4_en driver for Ethernet unicast and multicast, and the IB stack multicast attach calls done through the mlx4_ib driver are all routed to use the device managed API. When attaching rule using device managed flow steering API, the firmware returns a 64 bit registration id, which is to be provided during detach. Currently the firmware is always programmed during HCA initialization to use standard L2 hashing. Future work should be done to allow configuring the flow-steering hash function with common, non proprietary means. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:05 -07:00
Hadar Hen Zion	8fcfb4db74	net/mlx4_core: Add firmware commands to support device managed flow steering Add support for firmware commands to attach/detach a new device managed steering mode. Such network steering rules allow the user to provide an L2/L3/L4 flow specification to the firmware and have the device to steer traffic that matches that specification to the provided QP. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:05 -07:00
Hadar Hen Zion	c96d97f4d1	net/mlx4: Set steering mode according to device capabilities Instead of checking the firmware supported steering mode in various places in the code, add a dedicated field in the mlx4 device capabilities structure which is written once during the initialization flow and read across the code. This also set the grounds for add new steering modes. Currently two modes are supported, and are named after the ConnectX HW versions A0 and B0. A0 steering uses mac_index, vlan_index and priority to steer traffic into pre-defined range of QPs. B0 steering uses Ethernet L2 hashing rules and is enabled only if the firmware supports both unicast and multicast B0 steering, The current steering modes are relevant for Ethernet traffic only, such that Infiniband steering remains untouched. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:05 -07:00
Yevgeny Petrilin	6d19993788	net/mlx4_en: Re-design multicast attachments flow Currently, for every change in the net device multicast list, the driver detaches all the addresses from the HW device, and then attaches the updated list. This behavior is wrong from two aspects: first, it causes a load of firmware commands and second, there is period of time where the correct addresses are not attached, which turned into packet loss. To improve - a copy of the multicast list is saved by the driver. For every change in the multicast list, the multicast list copy is used to find the delta between those two lists and add or remove multicast addresses as needed. Reported-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Cc: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:05 -07:00
Hadar Hen Zion	aa1ec3dde1	net/mlx4_core: Change resource tracking ID to be 64 bit Currently the IDs used by the resource tracker are of type u32, so far this was ok since all the different resources we were tracking could be encoded in 32bit. As a preparation step for tracking of resources whose IDs need > 32 bits such as network flow steering rules, who are 64 bit in size, move to use 64 bit based resource IDs. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:05 -07:00
Hadar Hen Zion	4af1c0488d	net/mlx4_core: Change resource tracking mechanism to use red-black tree Change the data structure used for managing the SRIOV resource tracking mechanism from radix tree to red-black tree. This is preparation step for supporting resource IDs which are 64bit long, such as network flow steering rules. Such IDs can't be used as radix-tree keys on 32bit architectures and hence the reason for the change. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:05 -07:00
Yuval Mintz	90b1ebe7af	mlx4: set maximal number of default RSS queues Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-05 03:06:44 -07:00
David S. Miller	b26d344c6b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/caif/caif_hsi.c drivers/net/usb/qmi_wwan.c The qmi_wwan merge was trivial. The caif_hsi.c, on the other hand, was not. It's a conflict between `1c385f1fdf` ("caif-hsi: Replace platform device with ops structure.") in the net-next tree and commit `39abbaef19` ("caif-hsi: Postpone init of HIS until open()") in the net tree. I did my best with that one and will ask Sjur to check it out. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-28 17:37:00 -07:00
Yevgeny Petrilin	044ca2a5f2	net/mlx4_en: Release QP range in free_resources Add a missing resource release in ring cleanup. Not doing this leaves a range of QPs that are being reserved, and no one can use them. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-25 16:30:12 -07:00
Yevgeny Petrilin	9858d2d1ac	net/mlx4: Use single completion vector after NOP failure Fix a crash at the error flow of NOP command which caused the driver to try and use a completion vector which wasn't allocated. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-25 16:30:12 -07:00
Yevgeny Petrilin	5c8e904666	net/mlx4_en: Set correct port parameters during device initialization Set valid port parameters: MTU and flow control configuration when configuring the port during HW device initialization, prior to the net device open() being called. Using invalid parameters (such as all zeros) could lead to bad firmware behavior. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-25 16:30:12 -07:00
David S. Miller	7e52b33bd5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/ipv6/route.c This deals with a merge conflict between the net-next addition of the inetpeer network namespace ops, and Thomas Graf's bug fix in `2a0c451ade` which makes sure we don't register /proc/net/ipv6_route before it is actually safe to do so. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-15 15:51:55 -07:00
Linus Torvalds	ae501be0f6	InfiniBand/RDMA fixes for 3.5-rc2, all in hardware drivers: - Fix crash in cxgb4 - Fixes to new ocrdma driver - Regression fixes for mlx4 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABCAAGBQJPz5V+AAoJEENa44ZhAt0hsXoQAIFBNbsLWLft4J9jpyDAFj5z SpspcZAVgJW+K7I6j38SdQXMsR/XmcwM8Yh3iJIa/YG0P/6bWXDNU20itN62mAcI D+LphECUv/T8PGxEcAFlwjLYy5kcgFBTs/t7AgVS57OL4z8YuDLDr7uhdcnPdq7t AMIptqhXZgSnM/peIkf2SWBYxHSUMhukrlu5uNB9uc9GC9VRGhioXxT3eBPwr54O 7BwHxQGDYtFZ4fYKmFxN9sJJmFf9nl/WhYM2HwCMPEe4UzlxxfbP7/c17QNS7YGo e072Jvs+U2ttb4J7J2yBRcpiiSjYDAVi+7fE9OtAdWyfPDa9MmcajVq7PUnohZLc muNSv1RGOebj2mSjE+oc1oWZKkM/nSEwbIE7PJOvWX2RdqOs8xNNC0RjgvsXDvyf +lARPcyolxpJXtvIlLY/Ua1KhJf52zu+Kp1QCbEnloU/NuGcc09It6Wq6X1yU/Gs N29qjuFOBoHQxDzfWPc3Uhp1/eNI8UJiTfO9CSYHv2ZHJzW8BqoblbKlE2l43TDg frH1l1/9RYY7gTGoCVvfJByvSWM7rBJdNfpu+yJDm5en86xcF6HB7GG0nJ/okpw2 dUqQsLu+nTKZn8KCM3jzgP/ANT+PWk1UZlS1fIaMFMwa1SLWYqbaW/KEbRrY5IeP H0q/TFhJH8PccBjPbcsT =zl8u -----END PGP SIGNATURE----- Merge tag 'rdma-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull InfiniBand/RDMA fixes from Roland Dreier: "All in hardware drivers: - Fix crash in cxgb4 - Fixes to new ocrdma driver - Regression fixes for mlx4" * tag 'rdma-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/mlx4: Fix max_wqe capacity reported from query device mlx4_core: Fix setting VL_cap in mlx4_SET_PORT wrapper flow IB/mlx4: Fix EQ deallocation in legacy mode RDMA/cxgb4: Fix crash when peer address is 0.0.0.0 RDMA/ocrdma: Remove unnecessary version.h includes RDMA/ocrdma: Fix signaled event for SRQ_LIMIT_REACHED RDMA/ocrdma: Correct queue free count math	2012-06-06 10:45:21 -07:00
Jack Morgenstein	edc4a67e15	mlx4_core: Fix setting VL_cap in mlx4_SET_PORT wrapper flow Commit `096335b3f9` ("mlx4_core: Allow dynamic MTU configuration for IB ports") modifies the port VL setting. This exposes a bug in mlx4_common_set_port(), where the VL cap value passed in (inside the command mailbox) is incorrectly zeroed-out: mlx4_SET_PORT modifies the VL_cap field (byte 3 of the mailbox). Since the SET_PORT command is paravirtualized on the master as well as on the slaves, mlx4_SET_PORT_wrapper() is invoked on the master. This calls mlx4_common_set_port() where mailbox byte 3 gets overwritten by code which should only set a single bit in that byte (for the reset qkey counter flag) -- but instead overwrites the entire byte. The result is that when running in SR-IOV mode, the VL_cap will be set to zero -- fix this. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-06-06 10:07:54 -07:00
Joe Perches	6469933605	ethernet: Remove casts to same type Adding casts of objects to the same type is unnecessary and confusing for a human reader. For example, this cast: int y; int p = (int )&y; I used the coccinelle script below to find and remove these unnecessary casts. I manually removed the conversions this script produces of casts with __force, __iomem and __user. @@ type T; T p; @@ - (T )p + p A function in atl1e_main.c was passed a const pointer when it actually modified elements of the structure. Change the argument to a non-const pointer. A function in stmmac needed a __force to avoid a sparse warning. Added it. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-06 09:31:33 -07:00
Jack Morgenstein	401453a31e	net/mlx4_core: Fix obscure mlx4_cmd_box parameter in QUERY_DEV_CAP The "!mlx4_is_slave" is totally confusing. Fix with constant MLX4_CMD_NATIVE, which is the intended behavior. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-31 18:18:16 -04:00
Jack Morgenstein	6230bb234d	net/mlx4_core: Check port out-of-range before using in mlx4_slave_cap The range check was performed after using the port number. Reverse this to prevent a potential array overflow. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-31 18:18:16 -04:00
Jack Morgenstein	b91cb3ebcd	net/mlx4_core: Fixes for VF / Guest startup flow - pass the following parameters: - firmware version (added QUERY_FW paravirtualization for that) - disable Blueflame on slaves. KVM disables write combining on guests, and we get better performance without BF in this case. (This requires QUERY_DEV_CAP paravirtualization, also in this commit) - max qp rdma as destination - get rid of a chunk of "if (0)" dead code Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-31 18:18:16 -04:00
Jack Morgenstein	13bf58b760	net/mlx4_en: Fix improper use of "port" parameter in mlx4_en_event Port is used as an array index before we know if that is proper. For example, in the catas event case, port is zero; however, the port index should lie in the range (1..2). Fix this by using 'port' only in the events where it is of interest. Test for port out of range in the default (unhandled event) case, and do not output a message if it is not an ethernet port. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-31 18:18:16 -04:00
Marcel Apfelbaum	3fc929e2d6	net/mlx4_core: Fix number of EQs used in ICM initialisation In SRIOV mode, the number of EQs used when computing the total ICM size was incorrect. To fix this, we do the following: 1. We add a new structure to mlx4_dev, mlx4_phys_caps, to contain physical HCA capabilities. The PPF uses the phys capabilities when it computes things like ICM size. The dev_caps structure will then contain the paravirtualized values, making bookkeeping much easier in SRIOV mode. We add a structure rather than a single parameter because there will be other fields in the phys_caps. The first field we add to the mlx4_phys_caps structure is num_phys_eqs. 2. In INIT_HCA, when running in SRIOV mode, the "log_num_eqs" parameter passed to the FW is the number of EQs per VF/PF; each function (PF or VF) has this number of EQs available. However, the total number of EQs which must be allowed for in the ICM is (1 << log_num_eqs) * (#VFs + #PFs). Rather than compute this quantity, we allocate ICM space for 1024 EQs (which is the device maximum number of EQs, and which is the value we place in the mlx4_phys_caps structure). For INIT_HCA, however, we use the per-function number of EQs as described above. Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-31 18:18:16 -04:00
Jack Morgenstein	30f7c73bed	net/mlx4_core: Fix the slave_id out-of-range test in mlx4_eq_int Ths fixes the comparison in the FLR (Function Level Reset) event case. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-31 18:18:15 -04:00
Linus Torvalds	c23ddf7857	InfiniBand/RDMA changes for the 3.5 merge window: - Add ocrdma hardware driver for Emulex IB-over-Ethernet adapters - Add generic and mlx4 support for "raw" QPs: allow suitably privileged applications to send and receive arbitrary packets directly to/from the hardware - Add "doorbell drop" handling to the cxgb4 driver - A fairly large batch of qib hardware driver changes - A few fixes for lockdep-detected issues - A few other miscellaneous fixes and cleanups -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABCAAGBQJPumbZAAoJEENa44ZhAt0h8LgP/0fXe7Szm3n6P6UvMAVqkagM 4PpreH3mpWUFpzqeQE1JPDtgx700R6aPipbHqgIN+k61RWMpLjICGcNx7iwxn1I+ zqdquGygWgjceLz+BLVlk+iBmJt3vZ3fPRAXc7fdP+jhIarWkNIOy1pXWTUuRvED jL8jIaxhCcgAVzm/zNyt6IPxkaHvCz7K9wqmpyU0dsO9OyPdGvWA9+CkGXwmOCPq mxSVhWnfGsMkPBsL7EgTC5KP/ox2PKq6rFgysmVVS+rKCpP0L8BEVQyGX3Gf8KA8 yV+KdTi9ofDnFrv6R7Wz0v7HRUih8GRssakzBu7Y7HLfK1M/QwMG0GUAibXGZObc vUXuQ3uRJ/cIzMPXqKeGYwpb5t+TmxyjhWu44OjCUQkNau91+9BSbA69S88KXc49 aTJiCZlhPuGf4uGMWJJuPLcE2xO2QCZj+8ckL2STYrIip6GWlCH02kJaQmRkuWH2 UhMOeJDBC4nvh4EQT/WwHpGzyhkavE2ayfo5YemxBJXo+P5Mdbf7WIDRQDLUEeQH F8sPoccH4hDiAorN/SkTsm14jVTP7oWW1M40Ont59Nhbgm88MsVkvjoneHnfBvbD HjK92soCWnYTAoREfj0G4xUxZgMdOZcezWrX0rx5LJ8Ju9y4zAi3cKGr7lg6hs4X syKfN0VjiDRtJ+pxayi3 =yWfr -----END PGP SIGNATURE----- Merge tag 'rdma-for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull InfiniBand/RDMA changes from Roland Dreier: - Add ocrdma hardware driver for Emulex IB-over-Ethernet adapters - Add generic and mlx4 support for "raw" QPs: allow suitably privileged applications to send and receive arbitrary packets directly to/from the hardware - Add "doorbell drop" handling to the cxgb4 driver - A fairly large batch of qib hardware driver changes - A few fixes for lockdep-detected issues - A few other miscellaneous fixes and cleanups Fix up trivial conflict in drivers/net/ethernet/emulex/benet/be.h. * tag 'rdma-for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (53 commits) RDMA/cxgb4: Include vmalloc.h for vmalloc and vfree IB/mlx4: Fix mlx4_ib_add() error flow IB/core: Fix IB_SA_COMP_MASK macro IB/iser: Fix error flow in iser ep connection establishment IB/mlx4: Increase the number of vectors (EQs) available for ULPs RDMA/cxgb4: Add query_qp support RDMA/cxgb4: Remove kfifo usage RDMA/cxgb4: Use vmalloc() for debugfs QP dump RDMA/cxgb4: DB Drop Recovery for RDMA and LLD queues RDMA/cxgb4: Disable interrupts in c4iw_ev_dispatch() RDMA/cxgb4: Add DB Overflow Avoidance RDMA/cxgb4: Add debugfs RDMA memory stats cxgb4: DB Drop Recovery for RDMA and LLD queues cxgb4: Common platform specific changes for DB Drop Recovery cxgb4: Detect DB FULL events and notify RDMA ULD RDMA/cxgb4: Drop peer_abort when no endpoint found RDMA/cxgb4: Always wake up waiters in c4iw_peer_abort_intr() mlx4_core: Change bitmap allocator to work in round-robin fashion RDMA/nes: Don't call event handler if pointer is NULL RDMA/nes: Fix for the ORD value of the connecting peer ...	2012-05-21 17:54:55 -07:00
Amir Vadai	bc6a4744b8	net/mlx4_en: num cores tx rings for every UP Change the TX ring scheme such that the number of rings for untagged packets and for tagged packets (per each of the vlan priorities) is the same, unlike the current situation where for tagged traffic there's one ring per priority and for untagged rings as the number of core. Queue selection is done as follows: If the mqprio qdisc is operates on the interface, such that the core networking code invoked the device setup_tc ndo callback, a mapping of skb->priority => queue set is forced - for both, tagged and untagged traffic. Else, the egress map skb->priority => User priority is used for tagged traffic, and all untagged traffic is sent through tx rings of UP 0. The patch follows the convergence of discussing that issue with John Fastabend over this thread http://comments.gmane.org/gmane.linux.network/229877 Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Liran Liss <liranl@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-17 16:17:50 -04:00
Jack Morgenstein	eb71d0d63f	net/mlx4_core: Fixed error flow in rem_slave_eqs Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 00:56:59 -04:00
Jack Morgenstein	ba062d5219	net/mlx4_core: Add XRC domains and counters to resource tracker Add missing resource tracking for XRC domains and complete the tracking for HCA network flow counters. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 00:56:59 -04:00
Jack Morgenstein	b8924951f6	net/mlx4_core: Fix potential kernel Oops in res tracker during Dom0 driver unload Currently the slave and master resources are deleted after master freed all bitmaps. If any resources were not properly cleaned up during the shutdown process, an Oops would result. Fix so that delete slave (only) resources during cleanup. Master resources are cleaned up during unload process, and need not separately be cleaned. Note that during cleanup, we need to split the resource-tracker freeing functionality. Before removing all the bitmaps, we free any leftover slave resources. However, we can only remove the resource tracker linked list after all bitmap frees, since some of the freeing functions (e.g., mlx4_cleanup_eq_table) use paravirtualized FW commands which expect the resource tracker linked list to be present. Found-by: Aviad Yehezkel <aviadye@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 00:56:58 -04:00
Jack Morgenstein	681372a7a3	net/mlx4_core: Do not reset module-parameter num_vfs when fail to enable sriov Consider the following scenario: 2 HCAs, where only one of which can run SRIOV. If we reset the module parameter, all the VFs of the SRIOV HCA will be claimed by the PPF host (-- the code relies on num_vfs being non-zero to avoid this claiming, and num_vfs was reset when pci_enable_sriov failed for the non-SRIOV HCA). The solution is not to touch the num_vfs parameter. Also, eliminate the unneeded check of num_vfs when disabling sriov (the dev flag bit is sufficient). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 00:56:58 -04:00
Jack Morgenstein	b9985f410a	net/mlx4_core: Remove unused _str functions from the resource tracker Removed unsued _str helper functions from resource_tracker.c Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 00:56:58 -04:00
Jack Morgenstein	5e92d803bf	net/mlx4_core: Change SYNC_TPT to be native (not wrapped) The "wrapped" was incorrect, since no wrapper function was defined. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 00:56:58 -04:00
Jack Morgenstein	8bac9ede68	net/mlx4_core: Fix init_port mask state for slaves In function mlx4_INIT_PORT_wrapper, the port state mask for the slave is only set if we are invoking the INIT_PORT fw command. However, the reference count for the (initialized) port is incremented anyway. This creates a problem in that when we have multiple slaves, then the CLOSE_PORT command will never be invoked. The reason is that in the CLOSE_PORT wrapper, if the port-state mask is zero for the slave (which it is), the wrapper returns without doing anything. The only slave which will not return immediately in the CLOSE_PORT wrapper is that slave for which INIT_PORT was invoked. The fix is to not have the port-state mask setting depend on the logic for calling the INIT_PORT fw command. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 00:56:58 -04:00
Or Gerlitz	162344ed2c	net/mlx4: Address build warnings on set but not used variables Handle the compiler warnings on variables which are set but not used by removing the relevant variable or casting a return value which is ignored on purpose to void. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-16 00:56:58 -04:00
Jack Morgenstein	f4ec9e9531	mlx4_core: Change bitmap allocator to work in round-robin fashion Under most circumstances, the bitmap allocator does not allocate the same full 24-bit QP number immediately after a QP is destroyed. This works by using the upper bits of a 24-bit QP number, beyond the number of QPs that are actually available in the low level driver. For example, say that the HCA is willing to allocate a maximum of 64K qps. We use the bits 23..16 as a "counter" which is incremented by 1 at each allocation so that even if the same physical QP is re-allocated, it will not receive the same 24-bit QP number. However, we have seen the following scenario: 1. Allocate, say, 255 QPs in succession. This will cause a wrap of the "counter". 2. Destroy the first QP allocated, then allocate a new QP. The new QP, because of the counter wraparound, will get the same FULL QP number as the QP just destroyed! This is a problem because packets in transit can be erroneously delivered to the new QP when they were meant for the old (destroyed) QP, because the full QP number of the new QP is identical to the destroyed QP. (The "counter" mechanism is meant to prevent this by having the full 24-bit QP numbers differ even if the physical QP on the HCA is the same. As we see above, however, this mechanism does not always work). The best fix for this problem is to allocate QPs in round-robin mode, so that the physical QP numbers are not immediately re-used. Found-by: Matthew Finlay <matt@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-05-14 13:44:38 -07:00
Shlomo Pongratz	b3416f4476	mlx4_core: Add second capabilities flags field This patch adds a 64-bit flags2 features member to struct mlx4_dev to export further features of the hardware. The original flags field tracks features whose support bits are advertised by the firmware in offsets 0x40 and 0x44 of the query device capabilities command. flags2 will track features whose support bits are scattered at various offsets. RSS support is the first feature to be exported through flags2. RSS capabilities are located at offset 0x2e. The size of the RSS indirection table is also given in this offset. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-05-08 11:54:32 -07:00
Yevgeny Petrilin	5b263f5374	mlx4_en: Byte Queue Limit support Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 22:34:02 -04:00
Yevgeny Petrilin	e22979d96a	mlx4_en: Moving to Interrupts for TX completions Moving to interrupts instead of polling fpr TX completions Avoiding situations where skb can be held in by the driver for a long time (till timer expires). The change is also necessary for supporting BQL. Removing comp_lock that was required because we could handle TX completions from several contexts: Interrupts, timer, polling. Now there is only interrupts Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 22:34:02 -04:00
Yevgeny Petrilin	a19a848a45	mlx4_en: Added Ethtool support for TX Interrupt coalescing Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 22:34:02 -04:00
Paul Gortmaker	43c880dff3	drivers/net: fix unresolved 64bit math in mellanox/mlx4/en_dcb_nl.c Commit `109d244605` "net/mlx4_en: Set max rate-limit for a TC" introduced 64 bit math operations into mlx4_en_dcbnl_ieee_setmaxrate() causing the following final link failure on an x86_32 allmodconfig ERROR: "__udivdi3" [drivers/net/ethernet/mellanox/mlx4/mlx4_en.ko] undefined! Convert it to use div_u64() instead. Cc: Amir Vadai <amirv@mellanox.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-16 02:12:11 -04:00
Masanari Iida	fd9071ec61	net: Fix spelling typo in net Correct spelling typo within drivers/net. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-14 15:29:02 -04:00
David S. Miller	06eb4eafbd	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-04-10 14:30:45 -04:00
Amir Vadai	109d244605	net/mlx4_en: Set max rate-limit for a TC This patch is using the DCB netlink to set rate limit per ETS TC Values are accepted in Kbps and rounded up to the nearest multiply of 100Mbps. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-05 05:08:04 -04:00
Amir Vadai	897d7846b4	net/mlx4_en: sk_prio <=> UP for untagged traffic Since vlan egress map is only good for tagged traffic, need to have other mapping to be used by untagged traffic. For that, the driver uses sch_mqprio mapping. This mapping could be set by using tc tool from iproute2 package. Mapped UP will be used by the HW for QoS purposes, but won't go out on the wire. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-05 05:08:04 -04:00
Amir Vadai	564c274c3d	net/mlx4_en: DCB QoS support Set TSA, promised BW and PFC using IEEE 802.1qaz netlink commands. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-05 05:08:04 -04:00
Amir Vadai	e5395e92a4	net/mlx4_core: set port QoS attributes Adding QoS firmware commands: - mlx4_en_SET_PORT_PRIO2TC - set UP <=> TC - mlx4_en_SET_PORT_SCHEDULER - set promised BW, max BW and PG number Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-05 05:08:03 -04:00
Amir Vadai	0e98b523c4	net/mlx4_en: Force user priority by QP attribute Instead of relying on HW to change schedule queue by UP, schedule queue is fixed for a tx_ring, and UP in WQE is ignored in this aspect. This resolves two issues with untagged traffic: 1. untagged traffic has no UP in packet which is needed for QoS. The change above allows setting the schedule queue (and by that the UP) of such a stream. 2. BlueFlame uses the same field used by vlan tag. So forcing UP from QPC allows using BF for untagged but prioritized traffic. In old firmware that force UP is not supported, untagged traffic will not subject to QoS. Because UP is set by QP, need to always have a tx ring per UP, even if pfcrx module paramter is false. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-05 05:08:03 -04:00
Thadeu Lima de Souza Cascardo	117980c4c9	mlx4: allocate just enough pages instead of always 4 pages The driver uses a 2-order allocation, which is too much on architectures like ppc64, which has a 64KiB page. This particular allocation is used for large packet fragments that may have a size of 512, 1024, 4096 or fill the whole allocation. So, a minimum size of 16384 is good enough and will be the same size that is used in architectures of 4KiB sized pages. This will avoid allocation failures that we see when the system is under stress, but still has plenty of memory, like the one below. This will also allow us to set the interface MTU to higher values like 9000, which was not possible on ppc64 without this patch. Node 1 DMA: 73764kB 37128kB 0256kB 0512kB 01024kB 02048kB 04096kB 08192kB 0*16384kB = 51904kB 83137 total pagecache pages 0 pages in swap cache Swap cache stats: add 0, delete 0, find 0/0 Free swap = 10420096kB Total swap = 10420096kB 107776 pages RAM 1184 pages reserved 147343 pages shared 28152 pages non-shared netstat: page allocation failure. order:2, mode:0x4020 Call Trace: [c0000001a4fa3770] [c000000000012f04] .show_stack+0x74/0x1c0 (unreliable) [c0000001a4fa3820] [c00000000016af38] .__alloc_pages_nodemask+0x618/0x930 [c0000001a4fa39a0] [c0000000001a71a0] .alloc_pages_current+0xb0/0x170 [c0000001a4fa3a40] [d00000000dcc3e00] .mlx4_en_alloc_frag+0x200/0x240 [mlx4_en] [c0000001a4fa3b10] [d00000000dcc3f8c] .mlx4_en_complete_rx_desc+0x14c/0x250 [mlx4_en] [c0000001a4fa3be0] [d00000000dcc4eec] .mlx4_en_process_rx_cq+0x62c/0x850 [mlx4_en] [c0000001a4fa3d20] [d00000000dcc5150] .mlx4_en_poll_rx_cq+0x40/0x90 [mlx4_en] [c0000001a4fa3dc0] [c0000000004e2bb8] .net_rx_action+0x178/0x450 [c0000001a4fa3eb0] [c00000000009c9b8] .__do_softirq+0x118/0x290 [c0000001a4fa3f90] [c000000000031df8] .call_do_softirq+0x14/0x24 [c000000184c3b520] [c00000000000e700] .do_softirq+0xf0/0x110 [c000000184c3b5c0] [c00000000009c6d4] .irq_exit+0xb4/0xc0 [c000000184c3b640] [c00000000000e964] .do_IRQ+0x144/0x230 Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Tested-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-04 20:34:29 -04:00
Linus Torvalds	0c2fe82a9b	InfiniBand/RDMA changes for the 3.4 merge window. Nothing big really stands out; by patch count lots of fixes to the mlx4 driver plus some cleanups and fixes to the core and other drivers. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABCAAGBQJPZ2THAAoJEENa44ZhAt0hMgkP/AwXjwzz6lYlIizytQ5KOH11 CT/VoHLIGIwY31aRhZRHggWLEbJi8akgfh04K9PiSh/muFRPYk5KJDoQEoJV5Odv 12krqtpZeL41YcAPq0wiyP3Ki5AZDCDumzWbOtmxE9m93mVfi3yFTTWDHFo/7SOx A2kUITdKuZhgBpCFYS23Gs8QTWcMPUlwNlM76edG90BjSySHsK15tbqTUaEOC6Ux SPG0p0c2YjlZCPmRObrCL65o5LFRVajinZ4rXN7rre4LEU+IuXgW+mQU2Kwy8z6a oMPE2l4OMSqj270r2PlVK087gGH69nKfLgLQLJiP0Id+KwdYUk7vQcA+Fu0jwjP3 Ci/ahllvB76IiaNbax0vTy2ohCJmilLLotAP46NW0vPR2/KnmVy0Mqk8pXQQddw4 tnAFNnafn/MjTty8GwqyXR/enJZs29ePcSxcYnKjXYRIgG4ldejL2wktgSra8+gb 3/qFag1HRgZzcICkXGpHj7uWJ2KDe++1m6KTb0THUmrjuiel6ATFZpYoeRed3GfS ZMqpjZvuXM8nA9CrKMkzm5vIiUFF8Qq0hUTLR38F8FiNEhmC08JqH5PqjJ3Z18if R1yG4W4xmp/0ICHg4zyg6LWV3YsweDD+jSCJpW+KNBvu3HtYb41HIlK+i2TTmtPD 3etEZZRwROAyYKcwN01p =DbLx -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull InfiniBand/RDMA changes for the 3.4 merge window from Roland Dreier: "Nothing big really stands out; by patch count lots of fixes to the mlx4 driver plus some cleanups and fixes to the core and other drivers." * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (28 commits) mlx4_core: Scale size of MTT table with system RAM mlx4_core: Allow dynamic MTU configuration for IB ports IB/mlx4: Fix info returned when querying IBoE ports IB/mlx4: Fix possible missed completion event mlx4_core: Report thermal error events mlx4_core: Fix one more static exported function IB: Change CQE "csum_ok" field to a bit flag RDMA/iwcm: Reject connect requests if cmid is not in LISTEN state RDMA/cxgb3: Don't pass irq flags to flush_qp() mlx4_core: Get rid of redundant ext_port_cap flags RDMA/ucma: Fix AB-BA deadlock IB/ehca: Fix ilog2() compile failure IB: Use central enum for speed instead of hard-coded values IB/iser: Post initial receive buffers before sending the final login request IB/iser: Free IB connection resources in the proper place IB/srp: Consolidate repetitive sysfs code IB/srp: Use pr_fmt() and pr_err()/pr_warn() IB/core: Fix SDR rates in sysfs mlx4: Enforce device max FMR maps in FMR alloc IB/mlx4: Set bad_wr for invalid send opcode ...	2012-03-21 10:33:42 -07:00
Eugenia Emantayev	58a3de0592	mlx4_core: fix race on comm channel Prevent race condition between commands on comm channel. Happened while unloading the driver when switching from event to polling mode. VF got completion on the last command before switching to polling mode, but toggle was not changed. After the fix - VF will not write the next command before toggle is updated. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-19 18:02:05 -04:00
Roland Dreier	42872c7a5e	Merge branches 'misc' and 'mlx4' into for-next Conflicts: drivers/infiniband/hw/mlx4/main.c drivers/net/ethernet/mellanox/mlx4/main.c include/linux/mlx4/device.h	2012-03-12 16:25:28 -07:00
Roland Dreier	db5a7a65c0	mlx4_core: Scale size of MTT table with system RAM The current driver defaults to 1M MTT segments, where each segment holds 8 MTT entries. This limits the total memory registered to 8M * PAGE_SIZE which is 32GB with 4K pages. Since systems that have much more memory are pretty common now (at least among systems with InfiniBand hardware), this limit ends up getting hit in practice quite a bit. Handle this by having the driver allocate at least enough MTT entries to cover 2 * totalram pages. Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-03-12 16:24:59 -07:00
Or Gerlitz	096335b3f9	mlx4_core: Allow dynamic MTU configuration for IB ports Set the MTU for IB ports in the driver instead of using the firmware default of 2KB (the driver defaults to 4KB). Allow for dynamic mtu configuration through a new, per-port sysfs entry. Since there's a dependency between the port MTU and the max number of HW VLs the port can support, apply a mim/max approach, using a loop that goes down from the highest possible number of VLs to the lowest, using the firmware return status to know whether the requested number of VLs is possible with a given MTU. For now, as with the dynamic link type change / VPI support, the sysfs entry to change the mtu is exposed only when NOT running in SR-IOV mode. To allow changing the MTU for the master in SR-IOV mode, primary-function-initiated FLR (Function Level Reset) needs to be implemented. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-03-12 16:24:59 -07:00
Jack Morgenstein	5984be9004	mlx4_core: Report thermal error events Print an error message when a thermal error async event is reported by the HW. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Dotan Barak <dotanb@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-03-12 16:24:59 -07:00
Roland Dreier	e10903b087	mlx4_core: Fix one more static exported function Commit `22c8bff6fa` ("mlx4_core: Exported functions can't be static") fixed most of this up, but forgot about mlx4_is_slave_active(). Fix this one too. Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-03-12 16:24:58 -07:00
David S. Miller	b2d3298e09	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-03-09 14:34:20 -08:00
Jack Morgenstein	dcf353b170	mlx4_core: fix bug in modify_cq wrapper for resize flow. The actual FW command is called in procedure "handle_resize". Code incorrectly invoked the FW command again (in good flow), in the modify_cq wrapper function. Fix by skipping second FW invocation unconditionally for resize. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-08 00:28:01 -08:00
Or Gerlitz	8154c07fe1	mlx4_core: Get rid of redundant ext_port_cap flags While doing the work for commit `a6f7feae6d` ("IB/mlx4: pass SMP vendor-specific attribute MADs to firmware") we realized that the firmware would respond on all sorts of vendor-specific MADs. Therefore commit `97285b7817` ("mlx4_core: Add extended port capabilities support") adds redundant code into the driver, since there's no real reaon to maintain the extended capabilities of the port, as they can be queried on demand (e.g the FDR10 capability). This patch reverts commit `97285b7817` and removes the check for extended caps from the mlx4_ib driver port query flow. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-03-06 17:25:18 -08:00
Yevgeny Petrilin	66431a7d45	net/mlx4: defining functions as static Fixing sparse warnings, the 2 functions are only used in same file. Defining them as static and not exporting them. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:18 -05:00
Yevgeny Petrilin	be6736ba1f	net/mlx4: remove unused functions Removing functions that are no longer in use, but still exist Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:18 -05:00
Yevgeny Petrilin	9a9a232a92	net/mlx4: fixing sparse warnings for not declared, functions The SET_PORT functions are implemented in port.c, which is part of mlx4_core, these functions are exported. The functions are in use by the mlx4_en module (were originally part of mlx4_en). Their declaration remained in mlx4_en module, moving the declaration to the right location. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:18 -05:00
Yevgeny Petrilin	2ab573c586	net/mlx4: fixing sparse warnings when copying mac, address to gid entry The mac should be written as __be64 the gid. The warning was because we changed the mac parameter, which is u64, by writing result of cpu_to_be64 into it. Fixing by using new variable of type __be64. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:17 -05:00
Yevgeny Petrilin	39b2c4ebb4	net/mlx4: fix sparse warnings on wrong type for RSS keys The keys used for the hardware RSS topelitz hash are of type __be32 where the values provided by the driver are from array of u32, this triggered sparse warning on incorrect type in assignment as of different base types. Since these values are picked randomly, the fix is to transform the key to __be32 by executing cpu_to_be_32() Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:17 -05:00
Or Gerlitz	966684d581	net/mlx4: fix sparse warnings on TX blue flame buffer The blue flame buffer is defined to be of type void __iomem * but was passed to mlx4_bf_copy which gets unsigned long * . This triggered sparse warning on different address spaces, fix that by changing mlx4_bf_copy first param to be of type void __iomem * . Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:17 -05:00
Or Gerlitz	4ef2a435be	net/mlx4: fix sparse warnings on TX control flags, endianess Fix sparse warnings on incompatibility between the endianess of the ctrl_flags field of struct mlx4_en_priv to the srcrb_flags field of struct mlx4_wqe_ctrl_seg by changing the former to be __be32 instead of u32. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:17 -05:00
Yevgeny Petrilin	ebf8c9aa03	net/mlx4_en: Saving mem access on data path Localized the pdev->dev, and using dma_map instead of pci_map There are multiple map/unmap operations on data path, optimizing those by saving redundant pointer access. Those places were identified as hot-spots when running kernel profiling during some benchmarks. The fixes had most impact when testing packet rate with small packets, reducing several % from CPU load, and in some case being the difference between reaching wire speed or being CPU bound. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:17 -05:00
Yevgeny Petrilin	1d4526e037	mlx4_core: remove buggy sched_queue masking Fixes a bug introduced by commit `fe9a2603c`, where the priority bits in the schedule queue field were masked out. Signed-off-by: Amir Vadai <amirv@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 14:43:50 -05:00
Eric Dumazet	18f973af3e	mlx4_en: remove sparse errors Fix new sparse errors introduced in commit `6221217199` (mlx4_en: dont change mac_header on xmit) Reported-by: Or Gerlitz <or.gerlitz@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-05 16:42:45 -05:00
David S. Miller	ff4783ce78	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/sfc/rx.c Overlapping changes in drivers/net/ethernet/sfc/rx.c, one to change the rx_buf->is_page boolean into a set of u16 flags, and another to adjust how ->ip_summed is initialized. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-26 21:55:51 -05:00
Linus Torvalds	203738e548	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 1) ICMP sockets leave err uninitialized but we try to return it for the unsupported MSG_OOB case, reported by Dave Jones. 2) Add new Zaurus device ID entries, from Dave Jones. 3) Pointer calculation in hso driver memset is wrong, from Dan Carpenter. 4) ks8851_probe() checks unsigned value as negative, fix also from Dan Carpenter. 5) Fix crashes in atl1c driver due to TX queue handling, from Eric Dumazet. I anticipate some TX side locking fixes coming in the near future for this driver as well. 6) The inline directive fix in Bluetooth which was breaking the build only with very new versions of GCC, from Johan Hedberg. 7) Fix crashes in the ATP CLIP code due to ARP cleanups this merge window, reported by Meelis Roos and fixed by Eric Dumazet. 8) JME driver doesn't flush RX FIFO correctly, from Guo-Fu Tseng. 9) Some ip6_route_output() callers test the return value for NULL, but this never happens as the convention is to return a dst entry with dst->error set. Fixes from RonQing Li. 10) Logitech Harmony 900 should be handled by zaurus driver not cdc_ether, update white lists and black lists accordingly. From Scott Talbert. 11) Receiving from certain kinds of devices there won't be a MAC header, so there is no MAC header to fixup in the IPSEC code, and if we try to do it we'll crash. Fix from Eric Dumazet. 12) Port type array indexing off-by-one in mlx4 driver, fix from Yevgeny Petrilin. 13) Fix regression in link-down handling in davinci_emac which causes all RX descriptors to be freed up and therefore RX to wedge completely, from Christian Riesch. 14) It took two attempts, but ctnetlink soft lockups seem to be cured now, from Pablo Neira Ayuso. 15) Endianness bug fix in ENIC driver, from Santosh Nayak. 16) The long ago conversion of the PPP fragmentation code over to abstracted SKB list handling wasn't perfect, once we get an out of sequence SKB we don't flush the rest of them like we should. From Ben McKeegan. 17) Fix regression of ->ip_summed initialization in sfc driver. From Ben Hutchings. 18) Bluetooth timeout mistakenly using msecs instead of jiffies, from Andrzej Kaczmarek. 19) Using _sync variant of work cancellation results in deadlocks, use the non _sync variants instead. From Andre Guedes. 20) Bluetooth rfcomm code had reference counting problems leading to crashes, fix from Octavian Purdila. 21) The conversion of netem over to classful qdisc handling added two bugs to netem_dequeue(), fixes from Eric Dumazet. 22) Missing pci_iounmap() in ATM Solos driver. Fix from Julia Lawall. 23) b44_pci_exit() should not have __exit tag since it's invoked from non-__exit code. From Nikola Pajkovsky. 24) The conversion of the neighbour hash tables over to RCU added a race, fixed here by adding the necessary reread of tbl->nht, fix from Michel Machado. 25) When we added VF (virtual function) attributes for network device dumps, this potentially bloats up the size of the dump of one network device such that the dump size is too large for the buffer allocated by properly written netlink applications. In particular, if you add 255 VFs to a network device, parts of GLIBC stop working. To fix this, we add an attribute that is used to turn on these extended portions of the network device dump. Sophisticaed applications like 'ip' that want to see this stuff will be changed to set the attribute, whereas things like GLIBC that don't care about VFs simply will not, and therefore won't be busted by the mere presence of VFs on a network device. Thanks to the tireless work of Greg Rose on this fix. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits) sfc: Fix assignment of ip_summed for pre-allocated skbs ppp: fix 'ppp_mp_reconstruct bad seq' errors enic: Fix endianness bug. gre: fix spelling in comments netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2) Revert "netfilter: ctnetlink: fix soft lockup when netlink adds new entries" davinci_emac: Do not free all rx dma descriptors during init mlx4_core: Fixing array indexes when setting port types phy: IC+101G and PHY_HAS_INTERRUPT flag netdev/phy/icplus: Correct broken phy_init code ipsec: be careful of non existing mac headers Move Logitech Harmony 900 from cdc_ether to zaurus hso: memsetting wrong data in hso_get_count() netfilter: ip6_route_output() never returns NULL. ethernet/broadcom: ip6_route_output() never returns NULL. ipv6: ip6_route_output() never returns NULL. jme: Fix FIFO flush issue atm: clip: remove clip_tbl ipv4: ping: Fix recvmsg MSG_OOB error handling. rtnetlink: Fix problem with buffer allocation ...	2012-02-26 12:47:17 -08:00
Eric Dumazet	6221217199	mlx4_en: dont change mac_header on xmit A driver xmit function is not allowed to change skb without special care. mlx4_en_xmit() should not call skb_reset_mac_header() and instead should use skb->data to access ethernet header. This removes a dumb test : if (ethh && ethh->h_dest) Also remove this slow mlx4_en_mac_to_u64() call, we can use get_unaligned() to get faster code. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-26 14:22:05 -05:00
Eli Cohen	a5bbe892da	mlx4: Enforce device max FMR maps in FMR alloc ConnectX devices have a limit on the number of mappings that can be done on an FMR before having to call sync_tpt. The current mlx4_ib driver reports the limit correctly in max_map_per_fmr in .query_device(), but mlx4_core doesn't check it when actually allocating FMRs. Add a max_fmr_maps field to struct mlx4_caps and enforce this maximum value on FMR allocations. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-02-26 01:43:37 -08:00
Linus Torvalds	b52b80023f	One InfiniBand/RDMA regression fix for 3.3: - mlx4 SR-IOV changes added static exported functions, which doesn't build on powerpc at least. Fix from Doug Ledford for this. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABCAAGBQJPSEe5AAoJEENa44ZhAt0hIoUP/jwpxojruN8c7PuJesp3sne2 45xxH0OsPSFdYTBnNZGtL3MWnzp88YjoggRvrKGmiEc/vitDTveL8/j+jFatl/d5 o5yBq9wsSKgU3jJiczT21WBOIg2I8KGgdWozNSO8rUl++fHJGgH1sCSmf8biomnk dlHZbA4ZszH1bxHh406GK/+cP5jjKlTLqkixz/156fsopMxzHBaJycOmzSPpHl9s ykrVv3n/mhrc3zBgx5y9aU+LhcchZH6CBQOzLBks1c4w8AFXTxIMAQRhkBVnDksi zZhg5E05zRkynr27zebNnu6Y6hfWamfCHkM6krJLXL/QRfN0LmoLvdtW2HRffrHW 9eOzEdmh9i5wHSeH5zhQAE17yttpWQCNN1thGv3oe2uGj7ItmtA1yc/5q8opwBU3 bl4/saNVdK8DGtpGZjCmPBnsOl9IgMmYMn5haRmOhDvD4/B93OHl+2iQi50vPJHS bBvYG03OTHYs9QexDANDv3Q9VmxGRCBXcZkQblAqofgO3dLgCQdddwn0VHt9xDfK 5Hp31vPs5pGfT6AkPHREHNNvj27Ve1erjrODNVI/Qr2CIkaoiQ/6oR+9ZMrAFsnb uyhNDzt4yuMrfA3T/7sAITL6hFLkVHYL2lSorFr5BXZ4cusYlzUYHRbtiR17Vsye 1KNTElxLI9fmD5xJSY6N =mHhD -----END PGP SIGNATURE----- Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband One InfiniBand/RDMA regression fix for 3.3: - mlx4 SR-IOV changes added static exported functions, which doesn't build on powerpc at least. Fix from Doug Ledford for this. * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: mlx4_core: Exported functions can't be static	2012-02-24 20:03:14 -08:00
Yevgeny Petrilin	1e0f03d57d	mlx4_core: Fixing array indexes when setting port types Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-24 01:52:45 -05:00
Doug Ledford	22c8bff6fa	mlx4_core: Exported functions can't be static At least on powerpc, it breaks the build if exported functions are static. Fix some static exported functions introduced with the mlx4 SR-IOV support added in 3.3-rc1. Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2012-02-22 23:00:38 -08:00
Yevgeny Petrilin	3d8f93083b	mlx4: Setting new port types after all interfaces unregistered In port type change flow, need to set the new port types only after all interfaces have finished the unregister process. Otherwise, during unregister, one of the interfaces might issue a SET_PORT command with wrong port types, it can cause bad FW behavior. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-21 15:27:24 -05:00
Yevgeny Petrilin	730c41d5ba	mlx4: Replacing pool_lock with mutex Under the spinlock we call request_irq(), which allocates memory with GFP_KERNEL, This causes the following trace when DEBUG_SPINLOCK is enabled, it can cause the following trace: BUG: spinlock wrong CPU on CPU#2, ethtool/2595 lock: ffff8801f9cbc2b0, .magic: dead4ead, .owner: ethtool/2595, .owner_cpu: 0 Pid: 2595, comm: ethtool Not tainted 3.0.18 #2 Call Trace: spin_bug+0xa2/0xf0 do_raw_spin_unlock+0x71/0xa0 _raw_spin_unlock+0xe/0x10 mlx4_assign_eq+0x12b/0x190 [mlx4_core] mlx4_en_activate_cq+0x252/0x2d0 [mlx4_en] ? mlx4_en_activate_rx_rings+0x227/0x370 [mlx4_en] mlx4_en_start_port+0x189/0xb90 [mlx4_en] mlx4_en_set_ringparam+0x29a/0x340 [mlx4_en] dev_ethtool+0x816/0xb10 ? dev_get_by_name_rcu+0xa4/0xe0 dev_ioctl+0x2b5/0x470 handle_mm_fault+0x1cd/0x2d0 sock_do_ioctl+0x5d/0x70 sock_ioctl+0x79/0x2f0 do_vfs_ioctl+0x8c/0x340 sys_ioctl+0xa1/0xb0 system_call_fastpath+0x16/0x1b Replacing with mutex, which is enough in this case. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-21 15:27:23 -05:00
Jack Morgenstein	3d7474734b	mlx4_core: Do not map BF area if capability is 0 BF can be disabled in some cases, the capability field, bf_reg_size is set to zero in this case. Don't map the BF area in this case, it would cause failures. In addition, leaving the BF area unmapped also alerts the ETH driver to not use BF. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-20 19:26:34 -05:00
David S. Miller	32efe08d77	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c Small minor conflict in bnx2x, wherein one commit changed how statistics were stored in software, and another commit fixed endianness bugs wrt. reading the values provided by the chip in memory. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-19 16:03:15 -05:00
Eugenia Emantayev	9f5b6c632e	mlx4: add unicast steering entries to resource_tracker Add unicast steering entries to resource tracker. Do qp_detach also for these entries when VF doesn't shut down gracefully. Otherwise there is leakage of these resources, since they are not tracked. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-15 14:50:16 -05:00
Eugenia Emantayev	2531188b47	mlx4: fix QP tree trashing When adding new unicast steer entry, before moving qp to state ready, actually before calling mlx4_RST2INIT_QP_wrapper(), there were added a lot of entries with local_qpn=0 into radix tree. This fact impacted the get_res() function and proper functioning of resource tracker in addition to adding trash entries into radix tree. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@melllanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-15 14:50:16 -05:00
Eugenia Emantayev	75c6062cb7	mlx4: fix buffer overrun When passing MLX4_UC_STEER=1 it was translated to value 2 after mlx4_QP_ATTACH_wrapper. Therefore in new_steering_entry() unicast steer entries were added to index 2 of array of size 2. Fixing this bug by shift right to one position. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-15 14:50:15 -05:00
Eugenia Emantayev	eb40d89276	mlx4: add unicast steering entries to resource_tracker Add unicast steering entries to resource tracker. Do qp_detach also for these entries when VF doesn't shut down gracefully. Otherwise there is leakage of these resources, since they are not tracked. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-14 14:11:59 -05:00
Eugenia Emantayev	f1f75f0e2b	mlx4: attach multicast with correct flag mlx4_multicast_attach/detach() should use always MLX4_MC_STEER flag Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-14 14:11:58 -05:00
Eugenia Emantayev	de9b43dbb8	mlx4: remove redundant adding of steering type to gid mlx4_uc_steer_add/release() should not add MLX4_UC_STEER flag to gid. It is added in mlx4_unicast_attach/detach(). Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-14 14:11:58 -05:00
Eugenia Emantayev	deb8b3e849	mlx4: remove unnecessary variables and arguments mlx4_qp_attach/detach_common() don't use hash variable, move it to find_entry() static find_entry() in mcg.c doesn't use steer argument Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-14 14:11:58 -05:00
Eugenia Emantayev	45b5136551	mlx4: remove unused field high_prios Remove unnecessary field high_prios from mlx4_steer struct and initialization Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-14 14:11:58 -05:00
Eugenia Emantayev	0ee9f1dd7b	mlx4: fix QP tree trashing When adding new unicast steer entry, before moving qp to state ready, actually before calling mlx4_RST2INIT_QP_wrapper(), there were added a lot of entries with local_qpn=0 into radix tree. This fact impacted the get_res() function and proper functioning of resource tracker in addition to adding trash entries into radix tree. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@melllanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-14 14:11:58 -05:00
Eugenia Emantayev	58a30d6a3c	mlx4_core: fix buffer overrun When passing MLX4_UC_STEER=1 it was translated to value 2 after mlx4_QP_ATTACH_wrapper. Therefore in new_steering_entry() unicast steer entries were added to index 2 of array of size 2. Fixing this bug by shift right to one position. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-14 14:11:57 -05:00
Axel Lin	758ff235b3	mlx4: Fix kcalloc parameters swapped The first parameter should be "number of elements" and the second parameter should be "element size". Signed-off-by: Axel Lin <axel.lin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-13 16:00:58 -05:00
David S. Miller	d5ef8a4d87	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/infiniband/hw/nes/nes_cm.c Simple whitespace conflict. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-10 23:32:28 -05:00
Thadeu Lima de Souza Cascardo	7e2eb99cc6	mlx4: fix DMA mapping leak when allocation fails mlx4_en_prepare_rx_desc does not correctly clean up after it finds an allocation failure. It should unmap a page before calling put_page, but it only calls the later. This bug would prevent a device removal using hotplug after setting the device MTU to 9000 and opening the network interface. After the fix, we still see the allocation failure with MTU 9000, but we are able to remove the device. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-06 14:42:28 -05:00
Thadeu Lima de Souza Cascardo	68355f7113	mlx4: allow device removal by fixing dma unmap size After opening the network interface, Mellanox ConnectX device cannot be removed by hotplug because it has not properly unmapped all DMA memory. It happens that mlx4_en_activate_rx_rings overrides the variable that keeps the size of the memory mapped. This is fixed by passing to mlx4_en_destroy_rx_ring the same size that is given to mlx4_en_create_rx_ring. After applying this patch, hot unplugging the device works after opening the interface. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-06 14:42:28 -05:00
Eugenia Emantayev	4c41b36737	mlx4_core: use correct port for steering Use port number for correct steering (list per port). Before the fix all steering entries (for both physical ports) were managed in first port structures, so we had leakage of resources for port 2. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-06 12:10:11 -05:00
Eugenia Emantayev	4df9950406	mlx4_core: use correct flag for unicast_promisc Use MLX4_DEV_CAP_FLAG_VEP_UC_STEER for unicast_promisc_add/remove Unicast entries were managed in wrong data structures. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-06 12:10:11 -05:00
Eugenia Emantayev	f08ad06c05	mlx4_core: fix memory leak at multi_func_cleanup Perform cleanup also in non-master flow. The VFs use communication channel as well. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-06 12:10:11 -05:00
Pradeep A Dalvi	c056b734e5	netdev: ethernet dev_alloc_skb to netdev_alloc_skb Replaced deprecating dev_alloc_skb with netdev_alloc_skb in drivers/net/ethernet - Removed extra skb->dev = dev after netdev_alloc_skb Signed-off-by: Pradeep A Dalvi <netdev@pradeepdalvi.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-06 11:52:27 -05:00
Masanari Iida	8d9eb069ea	mlx4: Fix typo in cmd.c Correct spelling "reseting" to "resetting" in drivers/net/ethernet/mellanox/mlx4/cmd.c Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-04 16:31:47 -05:00
Joe Perches	41de8d4cff	drivers/net: Remove alloc_etherdev error messages alloc_etherdev has a generic OOM/unable to alloc message. Remove the duplicative messages after alloc_etherdev calls. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-31 16:20:48 -05:00
Joe Perches	e404decb0f	drivers/net: Remove unnecessary k.alloc/v.alloc OOM messages alloc failures use dump_stack so emitting an additional out-of-memory message is an unnecessary duplication. Remove the allocation failure messages. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-31 16:20:21 -05:00
Marcel Apfelbaum	803143fbda	mlx4_core: map async events to arbitrary slave eqs Slave async events were mapped to single eq. This patch fixes this issue, so the slaves can map the async events to any eq. Signed-off-by: Marcel Apfelbaum <marcela@dev.mellanox.co.il> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-22 15:08:44 -05:00

... 3 4 5 6 7 ...

528 Commits