linux

Author	SHA1	Message	Date
David S. Miller	ae7633c841	Merge branch 'tipc-next' Ying Xue says: ==================== tipc: fix two corner issues The patch set aims at resolving the following two critical issues: Patch #1: Resolve a deadlock which happens while all links are reset Patch #2: Correct a mistake usage of RCU lock which is used to protect node list ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:40:37 -07:00
Ying Xue	8a0f6ebe84	tipc: involve reference counter for node structure TIPC node hash node table is protected with rcu lock on read side. tipc_node_find() is used to look for a node object with node address through iterating the hash node table. As the entire process of what tipc_node_find() traverses the table is guarded with rcu read lock, it's safe for us. However, when callers use the node object returned by tipc_node_find(), there is no rcu read lock applied. Therefore, this is absolutely unsafe for callers of tipc_node_find(). Now we introduce a reference counter for node structure. Before tipc_node_find() returns node object to its caller, it first increases the reference counter. Accordingly, after its caller used it up, it decreases the counter again. This can prevent a node being used by one thread from being freed by another thread. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:40:28 -07:00
Ying Xue	b952b2befb	tipc: fix potential deadlock when all links are reset [ 60.988363] ====================================================== [ 60.988754] [ INFO: possible circular locking dependency detected ] [ 60.989152] 3.19.0+ #194 Not tainted [ 60.989377] ------------------------------------------------------- [ 60.989781] swapper/3/0 is trying to acquire lock: [ 60.990079] (&(&n_ptr->lock)->rlock){+.-...}, at: [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc] [ 60.990743] [ 60.990743] but task is already holding lock: [ 60.991106] (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.991738] [ 60.991738] which lock already depends on the new lock. [ 60.991738] [ 60.992174] [ 60.992174] the existing dependency chain (in reverse order) is: [ 60.992174] -> #1 (&(&bclink->lock)->rlock){+.-...}: [ 60.992174] [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140 [ 60.992174] [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50 [ 60.992174] [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.992174] [<ffffffffa0000f57>] tipc_bclink_add_node+0x97/0xf0 [tipc] [ 60.992174] [<ffffffffa0011815>] tipc_node_link_up+0xf5/0x110 [tipc] [ 60.992174] [<ffffffffa0007783>] link_state_event+0x2b3/0x4f0 [tipc] [ 60.992174] [<ffffffffa00193c0>] tipc_link_proto_rcv+0x24c/0x418 [tipc] [ 60.992174] [<ffffffffa0008857>] tipc_rcv+0x827/0xac0 [tipc] [ 60.992174] [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc] [ 60.992174] [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980 [ 60.992174] [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70 [ 60.992174] [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130 [ 60.992174] [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0 [ 60.992174] [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490 [ 60.992174] [<ffffffff8155c1b7>] e1000_clean+0x267/0x990 [ 60.992174] [<ffffffff81647b60>] net_rx_action+0x150/0x360 [ 60.992174] [<ffffffff8105ec43>] __do_softirq+0x123/0x360 [ 60.992174] [<ffffffff8105f12e>] irq_exit+0x8e/0xb0 [ 60.992174] [<ffffffff8179f9f5>] do_IRQ+0x65/0x110 [ 60.992174] [<ffffffff8179da6f>] ret_from_intr+0x0/0x13 [ 60.992174] [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20 [ 60.992174] [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0 [ 60.992174] [<ffffffff81033cda>] start_secondary+0x13a/0x150 [ 60.992174] -> #0 (&(&n_ptr->lock)->rlock){+.-...}: [ 60.992174] [<ffffffff810a8f7d>] __lock_acquire+0x163d/0x1ca0 [ 60.992174] [<ffffffff810a9c0c>] lock_acquire+0x9c/0x140 [ 60.992174] [<ffffffff8179c41f>] _raw_spin_lock_bh+0x3f/0x50 [ 60.992174] [<ffffffffa0006dca>] tipc_link_retransmit+0x1aa/0x240 [tipc] [ 60.992174] [<ffffffffa0001e11>] tipc_bclink_rcv+0x611/0x640 [tipc] [ 60.992174] [<ffffffffa0008646>] tipc_rcv+0x616/0xac0 [tipc] [ 60.992174] [<ffffffffa0002ca3>] tipc_l2_rcv_msg+0x73/0xd0 [tipc] [ 60.992174] [<ffffffff81646e66>] __netif_receive_skb_core+0x746/0x980 [ 60.992174] [<ffffffff816470c1>] __netif_receive_skb+0x21/0x70 [ 60.992174] [<ffffffff81647295>] netif_receive_skb_internal+0x35/0x130 [ 60.992174] [<ffffffff81648218>] napi_gro_receive+0x158/0x1d0 [ 60.992174] [<ffffffff81559e05>] e1000_clean_rx_irq+0x155/0x490 [ 60.992174] [<ffffffff8155c1b7>] e1000_clean+0x267/0x990 [ 60.992174] [<ffffffff81647b60>] net_rx_action+0x150/0x360 [ 60.992174] [<ffffffff8105ec43>] __do_softirq+0x123/0x360 [ 60.992174] [<ffffffff8105f12e>] irq_exit+0x8e/0xb0 [ 60.992174] [<ffffffff8179f9f5>] do_IRQ+0x65/0x110 [ 60.992174] [<ffffffff8179da6f>] ret_from_intr+0x0/0x13 [ 60.992174] [<ffffffff8100de9f>] arch_cpu_idle+0xf/0x20 [ 60.992174] [<ffffffff8109dfa6>] cpu_startup_entry+0x2f6/0x3f0 [ 60.992174] [<ffffffff81033cda>] start_secondary+0x13a/0x150 [ 60.992174] [ 60.992174] other info that might help us debug this: [ 60.992174] [ 60.992174] Possible unsafe locking scenario: [ 60.992174] [ 60.992174] CPU0 CPU1 [ 60.992174] ---- ---- [ 60.992174] lock(&(&bclink->lock)->rlock); [ 60.992174] lock(&(&n_ptr->lock)->rlock); [ 60.992174] lock(&(&bclink->lock)->rlock); [ 60.992174] lock(&(&n_ptr->lock)->rlock); [ 60.992174] [ 60.992174] * DEADLOCK * [ 60.992174] [ 60.992174] 3 locks held by swapper/3/0: [ 60.992174] #0: (rcu_read_lock){......}, at: [<ffffffff81646791>] __netif_receive_skb_core+0x71/0x980 [ 60.992174] #1: (rcu_read_lock){......}, at: [<ffffffffa0002c35>] tipc_l2_rcv_msg+0x5/0xd0 [tipc] [ 60.992174] #2: (&(&bclink->lock)->rlock){+.-...}, at: [<ffffffffa00004be>] tipc_bclink_lock+0x8e/0xa0 [tipc] [ 60.992174] The correct the sequence of grabbing n_ptr->lock and bclink->lock should be that the former is first held and the latter is then taken, which exactly happened on CPU1. But especially when the retransmission of broadcast link is failed, bclink->lock is first held in tipc_bclink_rcv(), and n_ptr->lock is taken in link_retransmit_failure() called by tipc_link_retransmit() subsequently, which is demonstrated on CPU0. As a result, deadlock occurs. If the order of holding the two locks happening on CPU0 is reversed, the deadlock risk will be relieved. Therefore, the node lock taken in link_retransmit_failure() originally is moved to tipc_bclink_rcv() so that it's obtained before bclink lock. But the precondition of the adjustment of node lock is that responding to bclink reset event must be moved from tipc_bclink_unlock() to tipc_node_unlock(). Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:40:27 -07:00
Li RongQing	faadb05f4b	virtio: simplify the using of received in virtnet_poll received is 0, no need to minus it and use "+=" to reassign it Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:37:17 -07:00
David S. Miller	3556eaaab5	Merge branch 'be2net-next' Sathya Perla says: ==================== be2net: patch set Hi David, this patch set includes 2 feature additions to the be2net driver: Patch 1 sets up cpu affinity hints for be2net irqs using the cpumask_set_cpu_local_first() API that first picks the near numa cores and when they are exhausted, selects the far numa cores. Patch 2 setups up xps queue mapping for be2net's TXQs to avoid, by default, TX lock contention. Patch 3 just bumps up the driver version. Pls consider applying this patch set to the net-next queue. Thanks! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:34:05 -07:00
Sathya Perla	265ec92758	be2net: bump up the driver version to 10.6.0.1 Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:34:01 -07:00
Sathya Perla	73f394e629	be2net: setup xps queue mapping This patch sets up xps queue mapping on load, so that TX traffic is steered to the queue whose irqs are being processed by the current cpu. This helps in avoiding TX lock contention. Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:34:01 -07:00
Padmanabh Ratnakar	d658d98aa5	be2net: assign CPU affinity hints to be2net IRQs This patch provides hints to irqbalance to map be2net IRQs to specific CPU cores. cpumask_set_cpu_local_first() is used, which first maps IRQs to near NUMA cores; when those cores are exhausted, IRQs are mapped to far NUMA cores. Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:34:00 -07:00
Eric Dumazet	41d25fe092	tcp: tcp_syn_flood_action() can be static After commit `1fb6f159fd` ("tcp: add tcp_conn_request"), tcp_syn_flood_action() is no longer used from IPv6. We can make it static, by moving it above tcp_conn_request() Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Octavian Purdila <octavian.purdila@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:17:18 -07:00
Wu Fengguang	1fb7cd4ef0	cxgb4: fix boolreturn.cocci warnings drivers/net/ethernet/chelsio/cxgb4/cxgb4_fcoe.c:49:9-10: WARNING: return of 0/1 in function 'cxgb_fcoe_sof_eof_supported' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Generated by: scripts/coccinelle/misc/boolreturn.cocci CC: Varun Prakash <varun@chelsio.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:15:56 -07:00
WANG Cong	85b9909272	fib6: install fib6 ops in the last step We should not commit the new ops until we finish all the setup, otherwise we have to NULL it on failure. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-29 12:12:37 -07:00
David S. Miller	7145074b37	Merge branch 'bcmgenet-next' Petri Gynther says: ==================== net: bcmgenet: multiple Rx queues support Final patch set to add support for multiple Rx queues: 1. remove priv->int0_mask and priv->int1_mask 2. modify Tx ring int_enable and int_disable vectors 3. simplify bcmgenet_init_dma() 4. tweak init_umac() 5. rework Tx NAPI code 6. rework Rx NAPI code 7. add support for multiple Rx queues ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-27 14:26:21 -07:00
Petri Gynther	4055eaefb3	net: bcmgenet: add support for multiple Rx queues Add support for multiple Rx queues: 1. Add NAPI context per Rx queue 2. Modify Rx interrupt and Rx NAPI code to handle multiple Rx queues Signed-off-by: Petri Gynther <pgynther@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-27 14:26:16 -07:00
Petri Gynther	3ab113399b	net: bcmgenet: rework Rx NAPI code Introduce new bcmgenet functions to handle the NAPI calls to: netif_napi_add() napi_enable() napi_disable() netif_napi_del() Signed-off-by: Petri Gynther <pgynther@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-27 14:26:15 -07:00
Petri Gynther	e2aadb4aa9	net: bcmgenet: rework Tx NAPI code Introduce new bcmgenet functions to handle the NAPI calls to: netif_napi_add() napi_enable() napi_disable() netif_napi_del() Signed-off-by: Petri Gynther <pgynther@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-27 14:26:14 -07:00
Petri Gynther	b2e97eca88	net: bcmgenet: tweak init_umac() Use more meaningful variable names int0_enable and int1_enable when enabling bcmgenet interrupts. For Rx default queue interrupts, use: UMAC_IRQ_RXDMA_BDONE \| UMAC_IRQ_RXDMA_PDONE For Tx default queue interrupts, use: UMAC_IRQ_TXDMA_BDONE \| UMAC_IRQ_TXDMA_PDONE Signed-off-by: Petri Gynther <pgynther@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-27 14:26:14 -07:00
Petri Gynther	ebbd96fb28	net: bcmgenet: simplify bcmgenet_init_dma() Do the two kcalloc() calls first, before proceeding into Rx/Tx DMA init. Makes the error case handling much simpler. Signed-off-by: Petri Gynther <pgynther@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Jaedon Shin <jaedon.shin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-27 14:26:13 -07:00
Petri Gynther	9dbac28fc1	net: bcmgenet: modify Tx ring int_enable and int_disable vectors Remove unnecessary function parameter priv. Use ring->priv instead. Signed-off-by: Petri Gynther <pgynther@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-27 14:26:13 -07:00
Petri Gynther	e412b1045c	net: bcmgenet: remove priv->int0_mask and priv->int1_mask Remove unused priv->int0_mask and priv->int1_mask. Signed-off-by: Petri Gynther <pgynther@google.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-27 14:26:12 -07:00
David S. Miller	19cc2dec8a	Merge branch 'xgene-next' Iyappan Subramanian says: ==================== drivers: net: xgene: Add separate tx completion ring SGMII based 1GbE and 10GbE interfaces support multiple interrupts. Adding separate tx completion descriptor ring and associating a dedicated irq for the TX completion. ==================== Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: Keyur Chudgar <kchudgar@apm.com>	2015-03-27 14:18:52 -07:00
Iyappan Subramanian	6772b653f8	drivers: net: xgene: Add separate tx completion ring - Added wrapper functions around napi_add, napi_del, napi_enable and napi_disable - Moved platform_get_irq function call after reading phy_mode - Associating the new irq to tx completion for the supported ethernet interfaces Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: Keyur Chudgar <kchudgar@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-27 14:18:48 -07:00
Iyappan Subramanian	d31346494b	dtb: xgene: Add interrupt for Tx completion Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: Keyur Chudgar <kchudgar@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-27 14:18:48 -07:00
Iyappan Subramanian	7e7d638ab1	Documentation: dts: xgene: Update interrupt field description Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: Keyur Chudgar <kchudgar@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-27 14:18:48 -07:00
Hannes Frederic Sowa	5a352dd0a3	ipv6: hash net ptr into fragmentation bucket selection As namespaces are sometimes used with overlapping ip address ranges, we should also use the namespace as input to the hash to select the ip fragmentation counter bucket. Cc: Eric Dumazet <edumazet@google.com> Cc: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:07:04 -04:00
Hannes Frederic Sowa	b6a7719aed	ipv4: hash net ptr into fragmentation bucket selection As namespaces are sometimes used with overlapping ip address ranges, we should also use the namespace as input to the hash to select the ip fragmentation counter bucket. Cc: Eric Dumazet <edumazet@google.com> Cc: Flavio Leitner <fbl@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:07:04 -04:00
David S. Miller	8fa38a38ac	Merge branch 'tipc-next' Jon Maloy says: ==================== tipc: some improvements and fixes We introduce a better algorithm for selecting when and which users should be subject to link congestion control, plus clean up some code for that mechanism. Commit #3 fixes another rare race condition during packet reception. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:05:56 -04:00
Jon Paul Maloy	8b4ed8634f	tipc: eliminate race condition at dual link establishment Despite recent improvements, the establishment of dual parallel links still has a small glitch where messages can bypass each other. When the second link in a dual-link configuration is established, part of the first link's traffic will be steered over to the new link. Although we do have a mechanism to ensure that packets sent before and after the establishment of the new link arrive in sequence to the destination node, this is not enough. The arriving messages will still be delivered upwards in different threads, something entailing a risk of message disordering during the transition phase. To fix this, we introduce a synchronization mechanism between the two parallel links, so that traffic arriving on the new link cannot be added to its input queue until we are guaranteed that all pre-establishment messages have been delivered on the old, parallel link. This problem seems to always have been around, but its occurrence is so rare that it has not been noticed until recent intensive testing. Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:05:56 -04:00
Jon Paul Maloy	3127a0200d	tipc: clean up handling of link congestion After the recent changes in message importance handling it becomes possible to simplify handling of messages and sockets when we encounter link congestion. We merge the function tipc_link_cong() into link_schedule_user(), and simplify the code of the latter. The code should now be easier to follow, especially regarding return codes and handling of the message that caused the situation. In case the scheduling function is unable to pre-allocate a wakeup message buffer, it now returns -ENOBUFS, which is a more correct code than the previously used -EHOSTUNREACH. Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:05:56 -04:00
Jon Paul Maloy	1f66d161ab	tipc: introduce starvation free send algorithm Currently, we only use a single counter; the length of the backlog queue, to determine whether a message should be accepted to the queue or not. Each time a message is being sent, the queue length is compared to a threshold value for the message's importance priority. If the queue length is beyond this threshold, the message is rejected. This algorithm implies a risk of starvation of low importance senders during very high load, because it may take a long time before the backlog queue has decreased enough to accept a lower level message. We now eliminate this risk by introducing a counter for each importance priority. When a message is sent, we check only the queue level for that particular message's priority. If that is ok, the message can be added to the backlog, irrespective of the queue level for other priorities. This way, each level is guaranteed a certain portion of the total bandwidth, and any risk of starvation is eliminated. Reviewed-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:05:56 -04:00
Guenter Roeck	b06b107a4c	net: dsa: Handle non-bridge master change Master change notifications may occur other than when joining or leaving a bridge, for example when being added to or removed from a bond or Open vSwitch. In that case, do nothing instead of asking the switch driver to remove a port from a bridge that it didn't join. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 14:04:57 -04:00
tadeusz.struk@intel.com	ac110f4954	crypto: algif - fix warn: unsigned 'used' is never less than zero Change type from unsigned long to int to fix an issue reported by kbuild robot: crypto/algif_skcipher.c:596 skcipher_recvmsg_async() warn: unsigned 'used' is never less than zero. Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 11:44:18 -04:00
Ying Xue	bc14b8d6a9	tipc: fix a link reset issue due to retransmission failures When a node joins a cluster while we are transmitting a fragment stream over the broadcast link, it's missing the preceding fragments needed to build a meaningful message. As a result, the node has to drop it. However, as the fragment message is not acknowledged to its sender before it's dropped, it accidentally causes link reset of retransmission failure on the node. Reported-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Tested-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 11:43:32 -04:00
Sebastian Ott	358e048d67	s390: fix /proc/interrupts output The irqclass_sub_desc array and enum interruption_class are out of sync thus /proc/interrupts is broken. Remove IRQIO_CLW. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 11:41:41 -04:00
Ying Xue	7e3ea6d5c4	sctp: avoid to repeatedly declare external variables Move the declaration for external variables to sctp.h file avoiding to repeatedly declare them with extern keyword. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 11:40:16 -04:00
Eric Dumazet	0144a81ccc	tcp: fix ipv4 mapped request socks ss should display ipv4 mapped request sockets like this : tcp SYN-RECV 0 0 ::ffff:192.168.0.1:8080 ::ffff:192.0.2.1:35261 and not like this : tcp SYN-RECV 0 0 192.168.0.1:8080 192.0.2.1:35261 We should init ireq->ireq_family based on listener sk_family, not the actual protocol carried by SYN packet. This means we can set ireq_family in inet_reqsk_alloc() Fixes: `3f66b083a5` ("inet: introduce ireq_family") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-25 00:57:48 -04:00
stephen hemminger	d631b94e7a	virtio: change comment in transmit The original comment was not really informative or funny as well as sexist. Replace it with a better explanation of why the driver does stop and what the impacts are. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:22:50 -04:00
Daniel Borkmann	835c3d9b7a	tools: bpf_asm: cleanup vlan extension related token We now have K_VLANT, K_VLANP and K_VLANTPID. Clean them up into more descriptive token, namely K_VLAN_TCI, K_VLAN_AVAIL and K_VLAN_TPID. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:21:41 -04:00
David S. Miller	b1275eb32e	Merge branch 'listener_refactor_16' Eric Dumazet says: ==================== tcp: listener refactor part 16 A CONFIG_PROVE_RCU=y build revealed an RCU splat I had to fix. I added const qualifiers to various md5 methods, as I expect to call them on behalf of request sock traffic even if the listener socket is not locked. This seems ok, but adding const makes the contract clearer. Note a good reduction of code size thanks to request/establish sockets convergence. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:30 -04:00
Eric Dumazet	fd3a154a00	tcp: md5: get rid of tcp_v[46]_reqsk_md5_lookup() With request socks convergence, we no longer need different lookup methods. A request socket can use generic lookup function. Add const qualifier to 2nd tcp_v[46]_md5_lookup() parameter. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:30 -04:00
Eric Dumazet	39f8e58e53	tcp: md5: remove request sock argument of calc_md5_hash() Since request and established sockets now have same base, there is no need to pass two pointers to tcp_v4_md5_hash_skb() or tcp_v6_md5_hash_skb() Also add a const qualifier to their struct tcp_md5sig_key argument. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:30 -04:00
Eric Dumazet	ff74e23f7e	tcp: md5: input path is run under rcu protected sections It is guaranteed that both tcp_v4_rcv() and tcp_v6_rcv() run from rcu read locked sections : ip_local_deliver_finish() and ip6_input_finish() both use rcu_read_lock() Also align tcp_v6_inbound_md5_hash() on tcp_v4_inbound_md5_hash() by returning a boolean. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:29 -04:00
Eric Dumazet	0980c1e308	tcp: use C99 initializers in new_state[] Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:29 -04:00
Eric Dumazet	80f03e27a3	tcp: md5: fix rcu lockdep splat While timer handler effectively runs a rcu read locked section, there is no explicit rcu_read_lock()/rcu_read_unlock() annotations and lockdep can be confused here : net/ipv4/tcp_ipv4.c-906- /* caller either holds rcu_read_lock() or socket lock */ net/ipv4/tcp_ipv4.c:907: md5sig = rcu_dereference_check(tp->md5sig_info, net/ipv4/tcp_ipv4.c-908- sock_owned_by_user(sk) \|\| net/ipv4/tcp_ipv4.c-909- lockdep_is_held(&sk->sk_lock.slock)); Let's explicitely acquire rcu_read_lock() in tcp_make_synack() Before commit `fa76ce7328` ("inet: get rid of central tcp/dccp listener timer"), we were holding listener lock so lockdep was happy. Fixes: `fa76ce7328` ("inet: get rid of central tcp/dccp listener timer") Signed-off-by: Eric DUmazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 21:16:29 -04:00
David S. Miller	9ead3527f5	Merge branch 'rhashtable-next' Thomas Graf says: ==================== rhashtable updates on top of Herbert's work Patch 1 is a bugfix for an RCU splash I encountered while testing. Patch 2 & 3 are pure cleanups. Patch 4 disables automatic shrinking by default as discussed in previous thread. Patch 5 removes some rhashtable internal knowledge from nft_hash and fixes another RCU splash. I've pushed various rhashtable tests (Netlink, nft) together with a Makefile to a git tree [0] for easier stress testing. [0] https://github.com/tgraf/rhashtable ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 17:53:06 -04:00
Thomas Graf	6b6f302ced	rhashtable: Add rhashtable_free_and_destroy() rhashtable_destroy() variant which stops rehashes, iterates over the table and calls a callback to release resources. Avoids need for nft_hash to embed rhashtable internals and allows to get rid of the being_destroyed flag. It also saves a 2nd mutex lock upon destruction. Also fixes an RCU lockdep splash on nft set destruction due to calling rht_for_each_entry_safe() without holding bucket locks. Open code this loop as we need know that no mutations may occur in parallel. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 17:48:40 -04:00
Thomas Graf	b5e2c150ac	rhashtable: Disable automatic shrinking by default Introduce a new bool automatic_shrinking to require the user to explicitly opt-in to automatic shrinking of tables. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 17:48:40 -04:00
Thomas Graf	ac833bddb5	rhashtable: Mark internal/private inline functions as such Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 17:48:39 -04:00
Thomas Graf	299e5c32a3	rhashtable: Use 'unsigned int' consistently Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 17:48:39 -04:00
Thomas Graf	58be8a583d	rhashtable: Extend RCU read lock into rhashtable_insert_rehash() rhashtable_insert_rehash() requires RCU locks to be held in order to access ht->tbl and traverse to the last table. Fixes: `ccd57b1bd3` ("rhashtable: Add immediate rehash during insertion") Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 17:48:39 -04:00
Michal Sekletar	27cd545247	filter: introduce SKF_AD_VLAN_TPID BPF extension If vlan offloading takes place then vlan header is removed from frame and its contents, both vlan_tci and vlan_proto, is available to user space via TPACKET interface. However, only vlan_tci can be used in BPF filters. This commit introduces a new BPF extension. It makes possible to load the value of vlan_proto (vlan TPID) to register A. Support for classic BPF and eBPF is being added, analogous to skb->protocol. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Michal Sekletar <msekleta@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-03-24 15:25:15 -04:00

1 2 3 4 5 ...

507980 Commits