linux

Author	SHA1	Message	Date
Jacob Keller	d8ec92f2cd	fm10k: fix a minor typo in some comments s/funciton/function to resolve a typo, and cleanup grammar on a few comments regarding processing the VF mailboxes. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:38 -07:00
Jacob Keller	4be37c42a4	fm10k: correctly clean up when init_queueing_scheme fails Fix a kernel panic that occurs during surprise removal. Clear the interface queue counts upon fm10k_init_msix_capability failure. This prevents further code (fm10k_update_stats etc.) from attempting to access unallocated queue vector or ring memory. [ 628.692648] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068 [ 628.692805] IP: [<ffffffffa0475caf>] fm10k_update_stats+0x7f/0x2c0 [fm10k] [ 628.693173] PGD 0 [ 628.693759] Oops: 0000 [#1] SMP [ 628.699321] CPU: 10 PID: 8164 Comm: kworker/10:0 Tainted: G OE ------------ 3.10.0-327.el7.x86_64 #1 [ 628.700096] Hardware name: Supermicro X9DAi/X9DAi, BIOS 3.2 05/09/2015 [ 628.700894] Workqueue: pciehp-1 pciehp_power_thread [ 628.701686] task: ffff88086559c500 ti: ffff8808593c0000 task.ti: ffff8808593c0000 [ 628.702493] RIP: 0010:[<ffffffffa0475caf>] [<ffffffffa0475caf>] fm10k_update_stats+0x7f/0x2c0 [fm10k] [ 628.703310] RSP: 0018:ffff8808593c3b00 EFLAGS: 00010282 [ 628.704132] RAX: 0000000000000000 RBX: ffff880860760000 RCX: 0000000000000000 [ 628.704963] RDX: ffff880860760b08 RSI: 0000000000000000 RDI: 0000000000000000 [ 628.705794] RBP: ffff8808593c3b40 R08: 0000000000000000 R09: 0000000000000000 [ 628.706604] R10: 0000000000000000 R11: ffff880860760c40 R12: 0000000000000080 [ 628.707420] R13: ffff8808607608c0 R14: ffff880860779ec0 R15: ffff880860779f40 [ 628.708238] FS: 0000000000000000(0000) GS:ffff88086f000000(0000) knlGS:0000000000000000 [ 628.709071] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 628.709923] CR2: 0000000000000068 CR3: 000000000194a000 CR4: 00000000001407e0 [ 628.710752] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 628.711596] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 628.712438] Stack: [ 628.713255] ffff880860764458 ffff8808607608c0 ffff880860760000 ffff880860760000 [ 628.714088] 0000000000000080 ffff8808607608c0 ffff880860779ec0 ffff880860779f40 [ 628.714925] ffff8808593c3b88 ffffffffa04780c5 ffff880860764458 0000000a8163cb5b [ 628.715752] Call Trace: [ 628.716560] [<ffffffffa04780c5>] fm10k_down+0x155/0x1f0 [fm10k] [ 628.717367] [<ffffffffa0479958>] fm10k_close+0x28/0xd0 [fm10k] [ 628.718184] [<ffffffff81526365>] __dev_close_many+0x85/0xd0 [ 628.718986] [<ffffffff815264d8>] dev_close_many+0x98/0x120 [ 628.719764] [<ffffffff81527ab8>] rollback_registered_many+0xa8/0x230 [ 628.720527] [<ffffffff81527c80>] rollback_registered+0x40/0x70 [ 628.721294] [<ffffffff81529198>] unregister_netdevice_queue+0x48/0x80 [ 628.722052] [<ffffffff815291ec>] unregister_netdev+0x1c/0x30 [ 628.722816] [<ffffffffa04762b8>] fm10k_remove+0xd8/0xe0 [fm10k] [ 628.723581] [<ffffffff81328c7b>] pci_device_remove+0x3b/0xb0 [ 628.724340] [<ffffffff813f5fbf>] __device_release_driver+0x7f/0xf0 [ 628.725088] [<ffffffff813f6053>] device_release_driver+0x23/0x30 [ 628.725814] [<ffffffff81321fe4>] pci_stop_bus_device+0x94/0xa0 [ 628.726535] [<ffffffff813220d2>] pci_stop_and_remove_bus_device+0x12/0x20 [ 628.727249] [<ffffffff8133de40>] pciehp_unconfigure_device+0xb0/0x1b0 [ 628.727964] [<ffffffff8133d822>] pciehp_disable_slot+0x52/0xd0 [ 628.728664] [<ffffffff8133d98a>] pciehp_power_thread+0xea/0x150 [ 628.729358] [<ffffffff8109d5fb>] process_one_work+0x17b/0x470 [ 628.730036] [<ffffffff8109e3cb>] worker_thread+0x11b/0x400 [ 628.730730] [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400 [ 628.731385] [<ffffffff810a5aef>] kthread+0xcf/0xe0 [ 628.732036] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140 [ 628.732674] [<ffffffff81645858>] ret_from_fork+0x58/0x90 [ 628.733289] [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140 [ 628.733883] Code: 83 e8 01 48 8d 97 40 02 00 00 45 31 c0 4c 8d 9c c7 48 02 0 [ 628.735202] RIP [<ffffffffa0475caf>] fm10k_update_stats+0x7f/0x2c0 [fm10k] [ 628.735732] RSP <ffff8808593c3b00> [ 628.736285] CR2: 0000000000000068 [ 628.736846] ---[ end trace 9156088b311aff42 ]--- Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:34 -07:00
Bruce Allan	c4114e3db6	fm10k: prevent possibly uninitialized variable If 'attr_flag < (1 << (2 * FM10K_TEST_MSG_NESTED))' is ever false, err will be used uninitialized. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:31 -07:00
Jacob Keller	d2e0721b18	fm10k: add helper functions to set strings and data for ethtool stats Reduce duplicate code and the amount of indentation by adding fm10k_add_stat_strings and fm10k_add_ethtool_stats functions which help add fm10k_stat structures to the ethtool stats callbacks. This helps increase ease of use for future stat additions, and increases code readability. Skip handling of the per-queue stats as these will be reworked in a following patch. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:27 -07:00
Jacob Keller	c8ed563beb	fm10k: free MBX IRQ before clearing interrupt scheme During fm10k_io_error_detected we were clearing the interrupt scheme before we freed the MBX IRQ. This causes a kernel panic because the MBX IRQ are assigned after MSI-X initialization. Clearing the interrupt scheme results in removing the MSI-X entry table. Fix this by freeing the MBX IRQ before we clear the interrupt scheme, as we do elsewhere in the driver. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:22 -07:00
Jacob Keller	61e0217e83	fm10k: print error message when stop_hw fails fm10k_stop_hw_generic calls fm10k_disable_queues_generic, which may return an error code indicating that the queues were not stopped within the time limit. Notify the user by displaying a message in the kernel message ring, in a similar way to how we notify the user when reset_hw fails. There isn't much we can do to recover from this error, so currently nothing else is done. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:14 -07:00
Jacob Keller	b3525696ad	fm10k: base queue scheme covered by RSS In fm10k_set_num_queues, we previously assigned the base template. This would always be overwritten by either fm10k_set_qos_queues or fm10k_set_rss_queues. In either case, we don't need the base values, so we can just remove them. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:07 -07:00
Jacob Keller	e72319bba8	fm10k: don't initialize service task until later in probe Delay initialization of the service timer and service task until late probe. If we don't wait, failures in probe do not properly cleanup the service timer or service task items, which results in the kernel panic below, potentially freezing the whole system. In addition, ensure that the SERVICE_DISABLE bit is set before we request the MBX IRQ since the MBX interrupt attempts to schedule the service task otherwise. This prevents a similar trace from occurring after this change. We didn't notice this issue before because probe almost always completes successfully. I discovered it due to a mis-ordered mailbox handler array, which resulted in the following failure when requesting mailbox interrupt. [ 555.325619] ------------[ cut here ]------------ [ 555.325628] WARNING: CPU: 0 PID: 4941 at lib/list_debug.c:33 __list_add+0xa0/0xd0() [ 555.325631] list_add corruption. prev->next should be next (ffffffff81f46648), but was (null). (prev=ffff8807fad5d0e8). <snip> [ 555.325722] CPU: 0 PID: 4941 Comm: insmod Tainted: G OE 4.0.4-303.fc22.x86_64 #1 [ 555.325725] Hardware name: Intel Corporation S2600CO/S2600CO, BIOS SE5C600.86B.02.03.8x23.060520140825 06/05/2014 [ 555.325727] 0000000000000000 00000000b4f161b3 ffff88081a21f8e8 ffffffff81783124 [ 555.325734] 0000000000000000 ffff88081a21f940 ffff88081a21f928 ffffffff8109c66a [ 555.325740] 0000000064000000 ffff8807fad5d0e8 ffff8807fad5d0e8 ffffffff81f46648 [ 555.325746] Call Trace: [ 555.325752] [<ffffffff81783124>] dump_stack+0x45/0x57 [ 555.325757] [<ffffffff8109c66a>] warn_slowpath_common+0x8a/0xc0 [ 555.325759] [<ffffffff8109c6f5>] warn_slowpath_fmt+0x55/0x70 [ 555.325763] [<ffffffff813ba270>] __list_add+0xa0/0xd0 [ 555.325768] [<ffffffff81102d1d>] __internal_add_timer+0x9d/0x110 [ 555.325771] [<ffffffff81102dbf>] internal_add_timer+0x2f/0xc0 [ 555.325774] [<ffffffff81104e5a>] mod_timer+0x12a/0x230 [ 555.325782] [<ffffffffa03d54ca>] fm10k_probe+0x69a/0xc80 [fm10k] [ 555.325787] [<ffffffff813e8355>] local_pci_probe+0x45/0xa0 [ 555.325791] [<ffffffff8129cf42>] ? sysfs_do_create_link_sd.isra.2+0x72/0xc0 [ 555.325794] [<ffffffff813e96b9>] pci_device_probe+0xf9/0x150 [ 555.325799] [<ffffffff814d7e73>] driver_probe_device+0xa3/0x400 [ 555.325802] [<ffffffff814d82ab>] __driver_attach+0x9b/0xa0 [ 555.325805] [<ffffffff814d8210>] ? __device_attach+0x40/0x40 [ 555.325808] [<ffffffff814d5bd3>] bus_for_each_dev+0x73/0xc0 [ 555.325811] [<ffffffff814d78ce>] driver_attach+0x1e/0x20 [ 555.325815] [<ffffffff814d7480>] bus_add_driver+0x180/0x250 [ 555.325819] [<ffffffffa03b2000>] ? 0xffffffffa03b2000 [ 555.325823] [<ffffffff814d8aa4>] driver_register+0x64/0xf0 [ 555.325826] [<ffffffff813e7bec>] __pci_register_driver+0x4c/0x50 [ 555.325832] [<ffffffffa03d6ca3>] fm10k_register_pci_driver+0x23/0x30 [fm10k] [ 555.325838] [<ffffffffa03b2080>] fm10k_init_module+0x80/0x1000 [fm10k] [ 555.325843] [<ffffffff81002128>] do_one_initcall+0xb8/0x200 [ 555.325848] [<ffffffff811e10d2>] ? __vunmap+0xa2/0x100 [ 555.325852] [<ffffffff811fe239>] ? kmem_cache_alloc_trace+0x1b9/0x240 [ 555.325855] [<ffffffff8178230e>] ? do_init_module+0x28/0x1cb [ 555.325858] [<ffffffff81782346>] do_init_module+0x60/0x1cb [ 555.325862] [<ffffffff8112168e>] load_module+0x205e/0x26b0 [ 555.325866] [<ffffffff8111d110>] ? store_uevent+0x70/0x70 [ 555.325870] [<ffffffff812234b0>] ? kernel_read+0x50/0x80 [ 555.325873] [<ffffffff81121f3e>] SyS_finit_module+0xbe/0xf0 [ 555.325878] [<ffffffff81789749>] system_call_fastpath+0x12/0x17 [ 555.325880] ---[ end trace 9e0f58d071eafd2a ]--- Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:49:02 -07:00
Jacob Keller	de66c610a6	fm10k: prevent null pointer dereference of msix_entries table According to the C standard dereferencing a variable before it is checked invokes undefined behavior, and thus compilers are free to assume the check for NULL isn't necessary. Prevent this by re-ordering the NULL check of msix_entries in fm10k_free_mbx_irq. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:48:55 -07:00
Bruce Allan	11c49f79b2	fm10k: use ether_addr_copy to copy MAC address Cleanup the remaining instances of using memcpy() instead of the preferred ether_addr_copy(). Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:44:49 -07:00
Bruce Allan	1905add427	fm10k: cleanup SPACE_BEFORE_TAB checkpatch warning Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:44:44 -07:00
Bruce Allan	838e610292	fm10k: demote BUG_ON() to WARN_ON() where appropriate We don't need to crash the kernel in this instance so just warn about the condition and play on. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:44:36 -07:00
Bruce Allan	fcdb0a9951	fm10k: cleanup remaining right-bit-shifted 1 Use BIT() macro instead. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:44:30 -07:00
Bruce Allan	1aab144c50	fm10k: Move constants to the right of binary operators The semantic patch that makes this change is available in scripts/coccinelle/misc/compare_const_fl.cocci. More information about semantic patching is available at http://coccinelle.lip6.fr/ Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Krishneil Singh <Krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-05 12:39:55 -07:00
David S. Miller	265bee7290	Merge branch 'mlxsw-next' Jiri Pirko says: ==================== mlxsw: small driver update, including switchdev doc update Ido Schimmel (3): mlxsw: spectrum: Reduce number of supported 802.1D bridges switchdev: Use switch ID in suggested udev rule mlxsw: spectrum: Add support for physical port names ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-05 15:07:55 -04:00
Ido Schimmel	2bf9a58675	mlxsw: spectrum: Add support for physical port names Export to userspace the front panel name of the port, so that udev can rename the ports accordingly. The convention suggested by switchdev documentation is used: 1) Non-split: pX 2) Split: pXsY Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-05 15:07:54 -04:00
Ido Schimmel	75f3a1018f	switchdev: Use switch ID in suggested udev rule Since there can be multiple switch ASICs on the same system we should use the switch ID in order to differentiate between them and set the switch name (e.g. swX) accordingly. Also, replace the order of the "Switch ID" and "Port Netdev Naming" sections following the above change. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-05 15:07:54 -04:00
Ido Schimmel	b555cf4a50	mlxsw: spectrum: Reduce number of supported 802.1D bridges Resources allocated for these bridges at init time cannot be later used for other purposes. While current number is supported by the device, it's mostly theoretical with regards to any real use case, which leads to poor utilization of device's resources. Solve that by reducing the number. The long term plan is to make this value (along with others) user configurable via devlink and write it to NVRAM, so that it can be used during the next init. Until then we must hardcode such values. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-05 15:07:54 -04:00
David S. Miller	15f41e2ba1	Merge branch 'tcp-udp-misc' Eric Dumazet says: ==================== net: various udp/tcp changes First round of patches for linux-4.7 Add a generic facility for sockets to be freed after an RCU grace period, if they need to. Then UDP stack is changed to no longer use SLAB_DESTROY_BY_RCU, in order to speedup rx processing for traffic encapsulated in UDP. It gives a 17 % speedup for normal UDP reception in stress conditions. Then TCP listeners are changed to use SOCK_RCU_FREE as well to avoid touching sk_refcnt in synflood case : I got up to 30 % performance increase for a mono listener. Then three patches add SK_MEMINFO_DROPS to sock_diag and add per socket rx drops accounting to TCP. Last patch adds rate limiting on ACK sent on behalf of SYN_RECV to better resist to SYNFLOOD targeting one or few flows. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 22:11:21 -04:00
Eric Dumazet	4ce7e93cb3	tcp: rate limit ACK sent by SYN_RECV request sockets Attackers like to use SYNFLOOD targeting one 5-tuple, as they hit a single RX queue (and cpu) on the victim. If they use random sequence numbers in their SYN, we detect they do not match the expected window and send back an ACK. This patch adds a rate limitation, so that the effect of such attacks is limited to ingress only. We roughly double our ability to absorb such attacks. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Maciej Żenczykowski <maze@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 22:11:20 -04:00
Eric Dumazet	a9d6532b56	ipv4: tcp: set SOCK_USE_WRITE_QUEUE for ip_send_unicast_reply() TCP uses per cpu 'sockets' to send some packets : - RST packets ( tcp_v4_send_reset()) ) - ACK packets for SYN_RECV and TIMEWAIT sockets By setting SOCK_USE_WRITE_QUEUE flag, we tell sock_wfree() to not call sk_write_space() since these internal sockets do not care. This gives a small performance improvement, merely by allowing cpu to properly predict the sock_wfree() conditional branch, and avoiding one atomic operation. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 22:11:20 -04:00
Eric Dumazet	9caad86415	tcp: increment sk_drops for listeners Goal: packets dropped by a listener are accounted for. This adds tcp_listendrop() helper, and clears sk_drops in sk_clone_lock() so that children do not inherit their parent drop count. Note that we no longer increment LINUX_MIB_LISTENDROPS counter when sending a SYNCOOKIE, since the SYN packet generated a SYNACK. We already have a separate LINUX_MIB_SYNCOOKIESSENT Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 22:11:20 -04:00
Eric Dumazet	532182cd61	tcp: increment sk_drops for dropped rx packets Now ss can report sk_drops, we can instruct TCP to increment this per socket counter when it drops an incoming frame, to refine monitoring and debugging. Following patch takes care of listeners drops. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 22:11:20 -04:00
Eric Dumazet	15239302ed	sock_diag: add SK_MEMINFO_DROPS Reporting sk_drops to user space was available for UDP sockets using /proc interface. Add this to sock_diag, so that we can have the same information available to ss users, and we'll be able to add sk_drops indications for TCP sockets as well. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 22:11:20 -04:00
Eric Dumazet	3b24d854cb	tcp/dccp: do not touch listener sk_refcnt under synflood When a SYNFLOOD targets a non SO_REUSEPORT listener, multiple cpus contend on sk->sk_refcnt and sk->sk_wmem_alloc changes. By letting listeners use SOCK_RCU_FREE infrastructure, we can relax TCP_LISTEN lookup rules and avoid touching sk_refcnt Note that we still use SLAB_DESTROY_BY_RCU rules for other sockets, only listeners are impacted by this change. Peak performance under SYNFLOOD is increased by ~33% : On my test machine, I could process 3.2 Mpps instead of 2.4 Mpps Most consuming functions are now skb_set_owner_w() and sock_wfree() contending on sk->sk_wmem_alloc when cooking SYNACK and freeing them. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 22:11:20 -04:00
Eric Dumazet	3a5d1c0e7c	inet: reqsk_alloc() needs to take care of dead listeners We'll soon no longer take a refcount on listeners, so reqsk_alloc() can not assume a listener refcount is not zero. We need to use atomic_inc_not_zero() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 22:11:19 -04:00
Eric Dumazet	2d331915a0	tcp/dccp: use rcu locking in inet_diag_find_one_icsk() RX packet processing holds rcu_read_lock(), so we can remove pairs of rcu_read_lock()/rcu_read_unlock() in lookup functions if inet_diag also holds rcu before calling them. This is needed anyway as __inet_lookup_listener() and inet6_lookup_listener() will soon no longer increment refcount on the found listener. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 22:11:19 -04:00
Eric Dumazet	ee3cf32a4a	tcp/dccp: remove BH disable/enable in lookup Since linux 2.6.29, lookups only use rcu locking. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 22:11:19 -04:00
Eric Dumazet	ca065d0cf8	udp: no longer use SLAB_DESTROY_BY_RCU Tom Herbert would like not touching UDP socket refcnt for encapsulated traffic. For this to happen, we need to use normal RCU rules, with a grace period before freeing a socket. UDP sockets are not short lived in the high usage case, so the added cost of call_rcu() should not be a concern. This actually removes a lot of complexity in UDP stack. Multicast receives no longer need to hold a bucket spinlock. Note that ip early demux still needs to take a reference on the socket. Same remark for functions used by xt_socket and xt_PROXY netfilter modules, but this might be changed later. Performance for a single UDP socket receiving flood traffic from many RX queues/cpus. Simple udp_rx using simple recvfrom() loop : 438 kpps instead of 374 kpps : 17 % increase of the peak rate. v2: Addressed Willem de Bruijn feedback in multicast handling - keep early demux break in __udp4_lib_demux_lookup() Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Cc: Willem de Bruijn <willemb@google.com> Tested-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 22:11:19 -04:00
Eric Dumazet	a4298e4522	net: add SOCK_RCU_FREE socket flag We want a generic way to insert an RCU grace period before socket freeing for cases where RCU_SLAB_DESTROY_BY_RCU is adding too much overhead. SLAB_DESTROY_BY_RCU strict rules force us to take a reference on the socket sk_refcnt, and it is a performance problem for UDP encapsulation, or TCP synflood behavior, as many CPUs might attempt the atomic operations on a shared sk_refcnt UDP sockets and TCP listeners can set SOCK_RCU_FREE so that their lookup can use traditional RCU rules, without refcount changes. They can set the flag only once hashed and visible by other cpus. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Tested-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 22:11:19 -04:00
David S. Miller	43e2dfb23e	Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2016-04-04 This series contains updates to ixgbe and ixgbevf. Pavel Tikhomirov fixes a typo where we were incrementing transmit stats instead of receive stats on the receive side. Emil updates the ixgbevf driver to use bit operations for setting and checking the adapter state. Chas Williams adds the new NDO trust feature check so that the VF guest has the ability to set the unicast address of the interface, if it is a trusted VF. Alex cleans up the driver to that the only time we add a PF entry to the VLVF is either for VLAN 0 or if the PF has requested a VLAN that a VF is already using. Also adds support for generic transmit checksums, giving the added advantage is that we can support inner checksum offloads for tunnels and MPLS while still being able to transparently insert VLAN tags. Lastly, changed ixgbe so that we can use the ethtool rx-vlan-filter flag to toggle receive VLAN filtering on and off. Mark cleans up the ixgbe driver by making all op structures that do not change constants. Also fixed flow control for Xeon D KR backplanes, since we cannot use auto-negotiation to determine the mode, we have to use whatever the user configured. Sowmini Varadhan updates ixgbe to use eth_platform_get_mac_address() instead of the arch specific solution that was added by a previous commit. Don fixed an issue where it was possible that a system reset could occur when we were holding the SWFW semaphore lock, which the next time the driver loaded would see it incorrectly as locked. v2: updated patch 8 of the series to include a minor flags issue where we had lost NETIF_F_HW_TC and we were setting NETIF_F_SCTP_CRC in two different areas, when we only needed/wanted it in one spot. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 22:01:44 -04:00
David S. Miller	6e3380488a	Merge branch 'mv88e6131-hw-bridging-6185' Vivien Didelot says: ==================== net: dsa: mv88e6131: HW bridging support for 6185 All packets passing through a switch of the 6185 family are currently all directed to the CPU port. This means that port bridging is software driven. To enable hardware bridging for this switch family, we need to implement the port mapping operations, the FDB operations, and optionally the VLAN operations (for 802.1Q and VLAN filtering aware systems). However this family only has 256 FDBs indexed by 8-bit identifiers, opposed to 4096 FDBs with 12-bit identifiers for other families such as 6352. It also doesn't have dedicated FID registers for ATU and VTU operations. This patchset fixes these differences, and enable hardware bridging for 6185. Changes v1 -> v2: - Describe the different numbers of databases and prefer a feature-based logic over the current ID/family-based logic. ==================== Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 21:31:35 -04:00
Vivien Didelot	26892ffc80	net: dsa: mv88e6131: enable hardware bridging By adding support for bridge operations, FDB operations, and optionally VLAN operations (for 802.1Q and VLAN filtering aware systems), the switch bridges ports correctly, the CPU is able to populate the hardware address databases, and thus hardware bridging becomes functional within the 88E6185 family of switches. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 21:31:35 -04:00
Vivien Didelot	f93dd042de	net: dsa: mv88e6xxx: map destination addresses for 6185 The 88E6185 switch also has a MapDA bit in its Port Control 2 register. When this bit is cleared, all frames are sent out to the CPU port. Set this bit to rely on address databases (ATU) hits and direct frames out of the correct ports, and thus allow hardware bridging. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 21:31:35 -04:00
Vivien Didelot	11ea809f1a	net: dsa: mv88e6xxx: support 256 databases The 6185 family of devices has only 256 address databases. Their 8-bit FID for ATU and VTU operations are split into ATU Control and ATU/VTU Operation registers. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 21:31:35 -04:00
Vivien Didelot	f74df0be82	net: dsa: mv88e6xxx: variable number of databases Marvell switch chips have different number of address databases. The code currently only supports models with 4096 databases. Such switch has dedicated FID registers for ATU and VTU operations. Models with fewer databases have their FID split in several registers. List them all but only support models with 4096 databases at the moment. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 21:31:35 -04:00
Vivien Didelot	b426e5f7fe	net: dsa: mv88e6xxx: protect FID registers access Only switch families with 4096 address databases have dedicated FID registers for ATU and VTU operations. Factorize the access to the GLOBAL_ATU_FID register and introduce a mv88e6xxx_has_fid_reg() helper function to protect the access to GLOBAL_ATU_FID and GLOBAL_VTU_FID. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 21:31:35 -04:00
Vivien Didelot	2e7bd5ef98	net: dsa: mv88e6xxx: protect SID register access Introduce a mv88e6xxx_has_stu() helper to protect the access to the GLOBAL_VTU_SID register, instead of checking switch families. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 21:31:34 -04:00
Alexander Duyck	0c5a616650	ixgbe: Add support for toggling VLAN filtering flag via ethtool This change makes it so that we can use the ethtool rx-vlan-filter flag to toggle Rx VLAN filtering on and off. This is basically just an extension of the existing VLAN promisc work in that it just adds support for the additional ethtool flag. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-04 17:45:49 -07:00
Amritha Nambiar	4ae7834221	ixgbe: Extend cls_u32 offload to support UDP headers Added support to match on UDP fields in the transport layer. Extended core logic to support multiple headers. Verified with the following filters : handle 1: u32 divisor 1 u32 ht 800: order 1 link 1: \ offset at 0 mask 0f00 shift 6 plus 0 eat match ip protocol 6 ff u32 ht 1: order 2 \ match tcp src 1024 ffff match tcp dst 23 ffff action drop handle 2: u32 divisor 1 u32 ht 800: order 3 link 2: \ offset at 0 mask 0f00 shift 6 plus 0 eat match ip protocol 17 ff u32 ht 2: order 4 \ match udp src 1025 ffff match udp dst 24 ffff action drop Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-04 17:44:56 -07:00
Don Skidmore	dbd15b8f9c	ixgbe: Place SWFW semaphore in known valid state at probe It is possible on some HW that a system reset could occur when we are holding the SWFW semaphore lock. So next time the driver was loaded we would see it incorrectly as locked. This patch will recover from that state by: Attempting to acquire the semaphore and then regardless of whether or not it was acquire we immediately release it. This will force us into a known good state. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-04 17:44:50 -07:00
Rostislav Pehlivanov	c04f90e592	ixgbe: add a callback to set the maximum transmit bitrate This commit adds a callback which allows to adjust the maximum transmit bitrate the card can output. This makes it possible to get a smooth traffic instead of the default burst-y behaviour when trying to output e.g. a video stream. Much of the logic needed to get a correct bcnrc_val was taken from the ixgbe_set_vf_rate_limit() function. Signed-off-by: Rostislav Pehlivanov <atomnuker@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-04 17:44:46 -07:00
Mark Rustad	afdc71e4d6	ixgbe: Fix flow control for Xeon D KR backplane Xeon D KR backplane is different from other backplanes, in that we can't use auto-negotiation to determine the mode. Instead, use whatever the user configured. Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-04 17:44:31 -07:00
Alexander Duyck	cb2b3edbec	ixgbevf: Add support for generic Tx checksums This patch adds support for generic Tx checksums to the ixgbevf driver. It turns out this is actually pretty easy after going over the datasheet as we were doing a number of steps we didn't need to. In order to perform a Tx checksum for an L4 header we need to fill in the following fields in the Tx descriptor: MACLEN (maximum of 127), retrieved from: skb_network_offset() IPLEN (maximum of 511), retrieved from: skb_checksum_start_offset() - skb_network_offset() TUCMD.L4T indicates offset and if checksum or crc32c, based on: skb->csum_offset The added advantage to doing this is that we can support inner checksum offloads for tunnels and MPLS while still being able to transparently insert VLAN tags. I also took the opportunity to clean-up many of the feature flag configuration bits to make them a bit more consistent between drivers. In the case of the VF drivers this meant adding support for SCTP CRCs, and inner checksum offloads for MPLS and various tunnel types. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-04 17:44:22 -07:00
Alexander Duyck	49763de042	ixgbe: Add support for generic Tx checksums This patch adds support for generic Tx checksums to the ixgbe driver. It turns out this is actually pretty easy after going over the datasheet as we were doing a number of steps we didn't need to. In order to perform a Tx checksum for an L4 header we need to fill in the following fields in the Tx descriptor: MACLEN (maximum of 127), retrieved from: skb_network_offset() IPLEN (maximum of 511), retrieved from: skb_checksum_start_offset() - skb_network_offset() TUCMD.L4T indicates offset and if checksum or crc32c, based on: skb->csum_offset The added advantage to doing this is that we can support inner checksum offloads for tunnels and MPLS while still being able to transparently insert VLAN tags. I also took the opportunity to clean-up many of the feature flag configuration bits to make them a bit more consistent between drivers. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-04 17:39:05 -07:00
Sowmini Varadhan	c7374b5a76	ixgbe: use eth_platform_get_mac_address() This commit converts commit `c762dff24c` ("ixgbe: Look up MAC address in Open Firmware or IDPROM") to use eth_platform_get_mac_address() added by commit `c7f5d10549` ("net: Add eth_platform_get_mac_address() helper.") Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-04 13:48:00 -07:00
Mark Rustad	37689010da	ixgbe: Make all unchanging ops structures const The source for the ops structure contents are const, so make them so. Copy them in place with structure assignments instead of memcpys. Make the mbx_ops accessed by reference instead of making a copy of the source structure. Update copyright date on the touched files. Reported-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-04 13:36:58 -07:00
Alexander Duyck	06bb1c39d8	ixgbe: Avoid adding VLAN 0 twice to VLVF and VFTA We were adding VLAN 0 twice each time we restored the VLAN configuration. Instead of doing it twice we can just start working through the active VLANs from ID 1 on and skip the double write. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2016-04-04 13:33:10 -07:00
Simon Horman	9ef280c6c2	irda: sh_irda: remove driver Remove the sh-irda driver as it appears to be unused since `c0bb9b3027` ("ARCH: ARM: shmobile: Remove ag5evm board support"). Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 16:24:13 -04:00
David S. Miller	acf195a9f5	Merge branch 'macb-coding-style' Moritz Fischer says: ==================== macb: Codingstyle cleanups resending almost unchanged v2 here: Changes from v2: * Rebased onto net-next * Changed 5th patches commit message * Added Nicholas' and Michal's Acked-Bys Changes from v1: * Backed out variable scope changes * Separated out ether_addr_copy into it's own commit * Fixed typo in comments as suggested by Joe ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-04-04 16:16:36 -04:00

1 2 3 4 5 ...

588753 Commits