The TIPC node hash table is protected by RCU on the read side.
tipc_node_find() looks up a node object by its node address by
iterating over this hash table. Since the traversal itself is guarded
by an RCU read lock, that part is safe. However, callers of
tipc_node_find() use the returned node object without holding the RCU
read lock, which is unsafe.
We now introduce a reference counter for the node structure. Before
tipc_node_find() returns a node object to its caller, it increments
the reference counter; the caller decrements it again once it is done
with the object. This prevents a node that is still in use by one
thread from being freed by another.
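As a rough illustration of the pattern (a sketch only; the helper
names and hash-head lookup are assumptions, not the exact TIPC code):

        /* Sketch: pin the node with a refcount before returning it. */
        struct tipc_node {
                u32 addr;
                struct kref kref;
                struct hlist_node hash;
                /* ... */
        };

        static void tipc_node_kref_release(struct kref *kref)
        {
                kfree(container_of(kref, struct tipc_node, kref));
        }

        struct tipc_node *tipc_node_find(struct net *net, u32 addr)
        {
                struct tipc_node *node;

                rcu_read_lock();
                /* node_head() is a hypothetical bucket-lookup helper */
                hlist_for_each_entry_rcu(node, node_head(net, addr), hash) {
                        if (node->addr == addr) {
                                kref_get(&node->kref); /* hold for caller */
                                rcu_read_unlock();
                                return node;
                        }
                }
                rcu_read_unlock();
                return NULL;
        }

        void tipc_node_put(struct tipc_node *node)
        {
                kref_put(&node->kref, tipc_node_kref_release);
        }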
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
'received' is 0 at this point, so there is no need to subtract it;
simply use "+=" to reassign it.
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sathya Perla says:
====================
be2net: patch set
Hi David, this patch set includes 2 feature additions to the be2net driver:
Patch 1 sets up cpu affinity hints for be2net irqs using the
cpumask_set_cpu_local_first() API that first picks the near numa cores
and when they are exhausted, selects the far numa cores.
Patch 2 sets up xps queue mapping for be2net's TXQs to avoid
TX lock contention by default.
Patch 3 just bumps up the driver version.
Please consider applying this patch set to the net-next queue. Thanks!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch sets up xps queue mapping on load, so that TX traffic is
steered to the queue whose irqs are being processed by the current cpu.
This helps in avoiding TX lock contention.
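For reference, a minimal sketch of how such a mapping can be set up
with the stack's XPS API (the adapter fields here are assumptions,
not the actual be2net structures):

        /* Illustrative: steer each TXQ to the CPUs servicing its IRQ. */
        static void be_set_xps_sketch(struct net_device *netdev,
                                      struct be_adapter *adapter)
        {
                int i;

                for (i = 0; i < adapter->num_tx_qs; i++)
                        /* affinity_mask: CPUs handling this queue's IRQ */
                        netif_set_xps_queue(netdev,
                                            &adapter->eq_obj[i].affinity_mask,
                                            i);
        }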
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch provides hints to irqbalance to map be2net IRQs to
specific CPU cores. cpumask_set_cpu_local_first() is used, which first
maps IRQs to near NUMA cores; when those cores are exhausted, IRQs are
mapped to far NUMA cores.
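A hedged sketch of that pattern (the adapter fields are assumptions):

        /* Illustrative: hint irqbalance with a NUMA-local-first CPU. */
        static int be_set_irq_affinity_sketch(struct be_adapter *adapter,
                                              int vec, unsigned int irq)
        {
                cpumask_var_t mask;

                if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
                        return -ENOMEM;

                /* Picks the vec-th CPU, preferring the device's node */
                cpumask_set_cpu_local_first(vec,
                                            dev_to_node(&adapter->pdev->dev),
                                            mask);
                irq_set_affinity_hint(irq, mask);
                /* Note: mask must stay allocated while the hint is set */
                return 0;
        }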
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 1fb6f159fd ("tcp: add tcp_conn_request"),
tcp_syn_flood_action() is no longer used from IPv6.
We can make it static by moving it above tcp_conn_request().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/chelsio/cxgb4/cxgb4_fcoe.c:49:9-10: WARNING: return of 0/1 in function 'cxgb_fcoe_sof_eof_supported' with return type bool
Return statements in functions returning bool should use
true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci
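The fix is mechanical; a minimal before/after illustration (the
predicate and values below are invented, not the cxgb4 code):

        /* Returning 0/1 from a bool function triggers the warning. */
        static bool sof_supported_sketch(u8 sof)
        {
                if (sof == 0x2e || sof == 0x36) /* illustrative values */
                        return true;            /* was: return 1; */
                return false;                   /* was: return 0; */
        }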
CC: Varun Prakash <varun@chelsio.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should not commit the new ops until we have finished all the
setup; otherwise we have to NULL it on failure.
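The general shape of the pattern (a generic sketch, not the patched
code):

        /* Commit the ops pointer only after all setup has succeeded. */
        static int setup_and_commit_sketch(struct foo *f,
                                           const struct foo_ops *new_ops)
        {
                int err = do_all_setup(f);  /* hypothetical setup step */

                if (err)
                        return err; /* f->ops untouched; no NULLing */
                f->ops = new_ops;   /* publish last */
                return 0;
        }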
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Petri Gynther says:
====================
net: bcmgenet: multiple Rx queues support
Final patch set to add support for multiple Rx queues:
1. remove priv->int0_mask and priv->int1_mask
2. modify Tx ring int_enable and int_disable vectors
3. simplify bcmgenet_init_dma()
4. tweak init_umac()
5. rework Tx NAPI code
6. rework Rx NAPI code
7. add support for multiple Rx queues
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for multiple Rx queues:
1. Add NAPI context per Rx queue
2. Modify Rx interrupt and Rx NAPI code to handle multiple Rx queues
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce new bcmgenet functions to handle the NAPI calls to:
netif_napi_add()
napi_enable()
napi_disable()
netif_napi_del()
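A hedged sketch of what such wrappers look like (the ring iteration
and names are approximations of the driver, not quotes from it):

        /* Illustrative per-ring NAPI init/enable wrappers. */
        static void bcmgenet_init_tx_napi_sketch(struct bcmgenet_priv *priv)
        {
                unsigned int i;

                for (i = 0; i <= priv->hw_params->tx_queues; i++)
                        netif_napi_add(priv->dev, &priv->tx_rings[i].napi,
                                       bcmgenet_tx_poll, NAPI_POLL_WEIGHT);
        }

        static void bcmgenet_enable_tx_napi_sketch(struct bcmgenet_priv *priv)
        {
                unsigned int i;

                for (i = 0; i <= priv->hw_params->tx_queues; i++)
                        napi_enable(&priv->tx_rings[i].napi);
        }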
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce new bcmgenet functions to handle the NAPI calls to:
netif_napi_add()
napi_enable()
napi_disable()
netif_napi_del()
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use more meaningful variable names int0_enable and int1_enable when
enabling bcmgenet interrupts.
For Rx default queue interrupts, use:
UMAC_IRQ_RXDMA_BDONE | UMAC_IRQ_RXDMA_PDONE
For Tx default queue interrupts, use:
UMAC_IRQ_TXDMA_BDONE | UMAC_IRQ_TXDMA_PDONE
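Hedged sketch of the resulting style (the register helper and mask
names follow the driver's conventions but are not quoted from it):

        /* Illustrative: enable default-queue interrupts by name. */
        u32 int0_enable = UMAC_IRQ_RXDMA_BDONE | UMAC_IRQ_RXDMA_PDONE |
                          UMAC_IRQ_TXDMA_BDONE | UMAC_IRQ_TXDMA_PDONE;

        bcmgenet_intrl2_0_writel(priv, int0_enable, INTRL2_CPU_MASK_CLEAR);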
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do the two kcalloc() calls first, before proceeding into Rx/Tx DMA init.
Makes the error case handling much simpler.
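The shape of the change, sketched with assumed field names:

        /* Illustrative: allocate both control-block arrays before any
         * DMA init, so a failure needs no DMA teardown. */
        priv->rx_cbs = kcalloc(priv->num_rx_bds, sizeof(struct enet_cb),
                               GFP_KERNEL);
        if (!priv->rx_cbs)
                return -ENOMEM;

        priv->tx_cbs = kcalloc(priv->num_tx_bds, sizeof(struct enet_cb),
                               GFP_KERNEL);
        if (!priv->tx_cbs) {
                kfree(priv->rx_cbs);
                return -ENOMEM;
        }
        /* ... only now initialize Rx and Tx DMA ... */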
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Jaedon Shin <jaedon.shin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove unnecessary function parameter priv. Use ring->priv instead.
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove unused priv->int0_mask and priv->int1_mask.
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Iyappan Subramanian says:
====================
drivers: net: xgene: Add separate tx completion ring
SGMII based 1GbE and 10GbE interfaces support multiple interrupts.
We add a separate tx completion descriptor ring and associate a dedicated irq with TX completion.
====================
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
- Added wrapper functions around napi_add, napi_del, napi_enable and napi_disable
- Moved the platform_get_irq() call after reading phy_mode
- Associated the new irq with tx completion for the supported ethernet interfaces
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bump.
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Customers reported that the firmware version information from i40evf
is unlike that of ixgbevf and was causing problems with their scripts.
To resolve this, populate the field to align with ixgbevf.
Change-ID: I9f4e24f6a76cd819bbe07087aab2b74f715ec103
Reported-by: Sourav Chatterjee <sourav.chatterjee@intel.com>
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In some circumstances the firmware can take longer to become ready after a
reset than we currently wait. This patch increases the polling loop limit
to well beyond the expected worst case.
Change-ID: I4b2c4c100dfa4abb77d02e5ecdec1cdd253b8c7e
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Call the netdev carrier off and TX disable functions first, before other
shutdown operations. This stops the stack from hitting us with
transmits while we're shutting down. Additionally, disable NAPI before
disabling interrupts, or the interrupt might get re-enabled
inappropriately. Finally, remove the call to netif_tx_stop_all_queues,
as it is redundant - the call to netif_tx_disable already did the same
thing.
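The resulting ordering, sketched (the helper names are approximations
of the driver's, not quotes from it):

        /* Illustrative shutdown ordering described above. */
        netif_carrier_off(netdev);  /* tell the stack the link is gone */
        netif_tx_disable(netdev);   /* block transmits; stops all queues */
        i40evf_napi_disable_all(adapter); /* NAPI off before IRQs, so */
        i40evf_irq_disable(adapter);      /* a poll can't re-enable them */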
Change-ID: I8b2dd25231b82817746cc256234a5eeeb4abaccc
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When the VF interface is closed, we cannot immediately free our rings
and RX buffers, because the hardware hasn't yet stopped accessing this
memory. This shows up as a panic or memory corruption when the device is
brought down while under heavy stress.
To fix this, delay releasing resources until we receive acknowledgment
from the PF driver that the rings have indeed been stopped. Because of
this delay, we also need to check to make sure that all of our admin
queue requests have been handled before allowing the device to be
opened.
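Conceptually (a heavily hedged sketch; the flag and helpers below are
invented for illustration):

        /* On close: ask the PF to stop the queues, but keep the memory. */
        set_bit(__RELEASE_PENDING, &adapter->flags); /* invented flag */
        request_pf_disable_queues(adapter);          /* invented helper */

        /* Later, in the admin-queue handler, once the PF has acked: */
        if (test_and_clear_bit(__RELEASE_PENDING, &adapter->flags)) {
                free_all_tx_resources(adapter);      /* invented helpers */
                free_all_rx_resources(adapter);
        }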
Change-ID: I44edd35529ce2fa2a9512437a3a8e6f14ed8ed63
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The new devices need a new device ID and some other defines to
handle the new 20G speed for KR2.
Change-ID: I03f717e364afe59657e8c9ce5ffaad856b4b21df
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Set elements are the last object type not supporting transaction support.
Implement similar to the existing rule transactions:
The global transaction counter keeps track of two generations, current
and next. Each element contains a bitmask specifying in which generations
it is inactive.
New elements start out as inactive in the current generation and active
in the next. On commit, the previous next generation becomes the current
generation and the element becomes active. The bitmask is then cleared
to indicate that the element is active in all future generations. If the
transaction is aborted, the element is removed from the set before it
becomes active.
When removing an element, it gets marked as inactive in the next
generation. On commit, the next generation becomes active and the
element therefore becomes inactive; it is then taken out of the set
and released. On abort, the element is marked as active for the next
generation again.
Lookups ignore elements not active in the current generation.
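A compact way to picture the two-generation bitmask (a sketch
following the description above, not the exact kernel helpers):

        /* Bit g set in elem->genmask means "inactive in generation g";
         * gencursor (0 or 1) selects the current generation. */
        #define GEN_CUR(gencursor)  (1 << (gencursor))

        /* New element: inactive in current, active in next generation. */
        elem->genmask = GEN_CUR(gencursor);

        /* Lookup: ignore elements inactive in the current generation. */
        active = !(elem->genmask & GEN_CUR(gencursor));

        /* Commit: clear the mask; active in all future generations. */
        elem->genmask = 0;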
The current set types (hash/rbtree) both use a field in the extension area
to store the generation mask. This (currently) does not require any
additional memory since we have some free space in there.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add some helper functions for building the genmask as preparation for
set transactions.
Also add a little documentation on how this actually works.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Return the extension area from the ->lookup() function to allow to
consolidate common actions.
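Roughly, the callback becomes (a hedged sketch modelled on the
description):

        /* ->lookup() now also hands back the element's extensions. */
        bool (*lookup)(const struct nft_set *set,
                       const struct nft_data *key,
                       const struct nft_set_ext **ext);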
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
With the conversion to set extensions, it is now possible to consolidate
the different set element destruction functions.
The set implementations' ->remove() functions are changed to only take
the element out of their internal data structures. Elements will be freed
in a batched fashion after the global transaction's completion RCU grace
period.
This reduces the number of additional grace periods required for
nft_hash from N to zero. Additionally, this guarantees that the set
elements' extensions of all implementations can be used under RCU
protection.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
As namespaces are sometimes used with overlapping ip address ranges,
we should also use the namespace as input to the hash to select the ip
fragmentation counter bucket.
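The shape of the change (a hedged sketch; the exact hash inputs may
differ):

        /* Mix a per-namespace value into the bucket selection. */
        hash = jhash_3words((__force u32)iph->daddr,
                            (__force u32)iph->saddr,
                            iph->protocol ^ net_hash_mix(net),
                            ip_idents_hashrnd);
        bucket = hash % IP_IDENTS_SZ;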
Cc: Eric Dumazet <edumazet@google.com>
Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As namespaces are sometimes used with overlapping ip address ranges,
we should also use the namespace as input to the hash to select the ip
fragmentation counter bucket.
Cc: Eric Dumazet <edumazet@google.com>
Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy says:
====================
tipc: some improvements and fixes
We introduce a better algorithm for selecting when and which
users should be subject to link congestion control, plus clean
up some code for that mechanism.
Commit #3 fixes another rare race condition during packet reception.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Despite recent improvements, the establishment of dual parallel
links still has a small glitch where messages can bypass each
other. When the second link in a dual-link configuration is
established, part of the first link's traffic will be steered over
to the new link. Although we do have a mechanism to ensure that
packets sent before and after the establishment of the new link
arrive in sequence to the destination node, this is not enough.
The arriving messages will still be delivered upwards in different
threads, something entailing a risk of message disordering during
the transition phase.
To fix this, we introduce a synchronization mechanism between the
two parallel links, so that traffic arriving on the new link cannot
be added to its input queue until we are guaranteed that all
pre-establishment messages have been delivered on the old, parallel
link.
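In pseudo-form, the gate looks something like this (a conceptual
sketch with invented names, not the actual implementation):

        /* Hold new-link input until the old link passes the synch
         * point negotiated at link establishment. */
        if (!new_l->in_sync) {
                if (less(old_l->rcv_nxt, new_l->synch_point)) {
                        skb_queue_tail(&new_l->deferred_queue, skb);
                        return;              /* deliver later */
                }
                new_l->in_sync = true;       /* release deferred traffic */
        }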
This problem seems to always have been around, but its occurrence is
so rare that it has not been noticed until recent intensive testing.
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the recent changes in message importance handling it becomes
possible to simplify handling of messages and sockets when we
encounter link congestion.
We merge the function tipc_link_cong() into link_schedule_user(),
and simplify the code of the latter. The code should now be
easier to follow, especially regarding return codes and handling
of the message that caused the situation.
In case the scheduling function is unable to pre-allocate a wakeup
message buffer, it now returns -ENOBUFS, which is a more correct
code than the previously used -EHOSTUNREACH.
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, we use only a single counter, the length of the backlog
queue, to determine whether a message should be accepted to the queue.
Each time a message is sent, the queue length is compared to a
threshold value for the message's importance priority. If the queue
length is beyond this threshold, the message is rejected. This algorithm
implies a risk of starvation of low importance senders during very high
load, because it may take a long time before the backlog queue has
decreased enough to accept a lower level message.
We now eliminate this risk by introducing a counter for each importance
priority. When a message is sent, we check only the queue level for that
particular message's priority. If that is ok, the message can be added
to the backlog, irrespective of the queue level for other priorities.
This way, each level is guaranteed a certain portion of the total
bandwidth, and any risk of starvation is eliminated.
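Sketched, the per-level accounting looks roughly like this (the field
names approximate the code, they are not quotes from it):

        /* One length/limit pair per importance level (illustrative). */
        struct {
                u16 len;
                u16 limit;
        } backlog[TIPC_SYSTEM_IMPORTANCE + 1];

        /* On send, check only the sender's own importance level: */
        imp = msg_importance(hdr);
        if (l->backlog[imp].len >= l->backlog[imp].limit)
                return link_schedule_user(l, list); /* congestion path */
        l->backlog[imp].len++;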
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Master change notifications may occur other than when joining or
leaving a bridge, for example when being added to or removed from
a bond or Open vSwitch. In that case, do nothing instead of asking
the switch driver to remove a port from a bridge that it didn't join.
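The guard amounts to something like this (a hedged sketch; the helper
names are invented):

        /* Ignore master changes that don't involve a bridge. */
        if (!(master->priv_flags & IFF_EBRIDGE))
                return 0;   /* e.g. bond or Open vSwitch master */
        if (joining)
                return port_bridge_join(dev, master);  /* invented */
        return port_bridge_leave(dev, master);         /* invented */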
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The set implementations' private struct will only contain the elements
needed to maintain the search structure, all other elements are moved
to the set extensions.
Element allocation and initialization is performed centrally by
nf_tables_api instead of by the different set implementations'
->insert() functions. A new "elemsize" member in the set ops specifies
the amount of memory to reserve for internal usage. Destruction
will also be moved out of the set implementations by a following patch.
Except for element allocation, the patch is a simple conversion to
using data from the extension area.
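That is, roughly (sketch):

        struct nft_set_ops {
                /* ... lookup/insert/remove callbacks ... */
                unsigned int elemsize; /* per-element internal reserve */
        };

        /* Central allocation in nf_tables_api (illustrative): */
        elem = kzalloc(set->ops->elemsize + ext_len, GFP_KERNEL);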
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add simple set extension infrastructure for maintaining variable sized
and optional per element data.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
A following patch will convert sets to use so-called set extensions,
where the key is no longer located at a fixed position. This will
require rhashtable hashing and comparison callbacks to be used.
As preparation, convert nft_hash to use these callbacks without any
functional changes.
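For reference, the callbacks in question (a hedged sketch of the
parameter block, not the exact nft_hash code):

        /* Illustrative rhashtable setup using explicit callbacks. */
        static const struct rhashtable_params nft_hash_params_sketch = {
                .head_offset = offsetof(struct nft_hash_elem, node),
                .obj_hashfn  = nft_hash_obj, /* hashes key in element */
                .obj_cmpfn   = nft_hash_cmp, /* compares key/element  */
        };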
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Improve readability by indenting the parameter initialization.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Following patches will add new private members, restore struct nft_hash
as preparation.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
nftables sets will be converted to use so-called set extensions,
moving the key to a non-fixed position. To hash it, obj_hashfn must
be used; however, it so far doesn't receive the length parameter.
Pass the key length to obj_hashfn() and convert existing users.
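The signature change, roughly (hedged):

        /* obj_hashfn now receives the key length as well. */
        typedef u32 (*rht_obj_hashfn_t)(const void *data, u32 len,
                                        u32 seed);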
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Change type from unsigned long to int to fix an issue reported by kbuild robot:
crypto/algif_skcipher.c:596 skcipher_recvmsg_async() warn: unsigned 'used' is
never less than zero.
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a node joins a cluster while we are transmitting a fragment
stream over the broadcast link, it is missing the preceding fragments
needed to build a meaningful message. As a result, the node has to
drop it. However, as the fragment message is not acknowledged to its
sender before it is dropped, it accidentally causes a link reset due
to retransmission failure on the node.
Reported-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The irqclass_sub_desc array and enum interruption_class are out of
sync; thus /proc/interrupts is broken. Remove IRQIO_CLW.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the declarations of external variables to the sctp.h file,
avoiding repeated declarations with the extern keyword.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using smp_processor_id() triggers warnings with PREEMPT_RCU. There is no
point in disabling preemption since we only collect the numeric value,
so use raw_smp_processor_id() instead.
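The change amounts to the following (sketch):

        /* A stale CPU number is fine here; no need to disable
         * preemption just to read it. */
        dest->data[0] = raw_smp_processor_id(); /* was: smp_processor_id() */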
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>