linux

Author	SHA1	Message	Date
Patrick McHardy	3a7b21eaf4	netfilter: nf_ct_sip: don't drop packets with offsets pointing outside the packet Some Cisco phones create huge messages that are spread over multiple packets. After calculating the offset of the SIP body, it is validated to be within the packet and the packet is dropped otherwise. This breaks operation of these phones. Since connection tracking is supposed to be passive, just let those packets pass unmodified and untracked. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-04-06 14:03:18 +02:00
Trond Myklebust	f05c124a70	SUNRPC: Fix a potential memory leak in rpc_new_client If the call to rpciod_up() fails, we currently leak a reference to the struct rpc_xprt. As part of the fix, we also remove the redundant check for xprt!=NULL. This is already taken care of by the callers. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-04-05 17:02:14 -04:00
Chuck Lever	a58e0be6f6	SUNRPC: Remove extra xprt_put() While testing error cases where rpc_new_client() fails, I saw some oopses. If rpc_new_client() fails, it already invokes xprt_put(). Thus __rpc_clone_client() does not need to invoke it again. Introduced by commit `1b63a751` "SUNRPC: Refactor rpc_clone_client()" Fri Sep 14, 2012. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org [>=3.7] Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2013-04-05 16:58:14 -04:00
Patrick McHardy	124dff01af	netfilter: don't reset nf_trace in nf_reset() Commit `130549fe` ("netfilter: reset nf_trace in nf_reset") added code to reset nf_trace in nf_reset(). This is wrong and unnecessary. nf_reset() is used in the following cases: - when passing packets up the the socket layer, at which point we want to release all netfilter references that might keep modules pinned while the packet is queued. nf_trace doesn't matter anymore at this point. - when encapsulating or decapsulating IPsec packets. We want to continue tracing these packets after IPsec processing. - when passing packets through virtual network devices. Only devices on that encapsulate in IPv4/v6 matter since otherwise nf_trace is not used anymore. Its not entirely clear whether those packets should be traced after that, however we've always done that. - when passing packets through virtual network devices that make the packet cross network namespace boundaries. This is the only cases where we clearly want to reset nf_trace and is also what the original patch intended to fix. Add a new function nf_reset_trace() and use it in dev_forward_skb() to fix this properly. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-05 15:38:10 -04:00
Jiri Pirko	34e2ed34a0	net: ipv4: notify when address lifetime changes if userspace changes lifetime of address, send netlink notification and call notifier. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-05 00:51:12 -04:00
Eric W. Biederman	0e82e7f6df	af_unix: If we don't care about credentials coallesce all messages It was reported that the following LSB test case failed https://lsbbugs.linuxfoundation.org/attachment.cgi?id=2144 because we were not coallescing unix stream messages when the application was expecting us to. The problem was that the first send was before the socket was accepted and thus sock->sk_socket was NULL in maybe_add_creds, and the second send after the socket was accepted had a non-NULL value for sk->socket and thus we could tell the credentials were not needed so we did not bother. The unnecessary credentials on the first message cause unix_stream_recvmsg to start verifying that all messages had the same credentials before coallescing and then the coallescing failed because the second message had no credentials. Ignoring credentials when we don't care in unix_stream_recvmsg fixes a long standing pessimization which would fail to coallesce messages when reading from a unix stream socket if the senders were different even if we did not care about their credentials. I have tested this and verified that the in the LSB test case mentioned above that the messages do coallesce now, while the were failing to coallesce without this change. Reported-by: Karel Srot <ksrot@redhat.com> Reported-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-05 00:49:13 -04:00
Eric W. Biederman	25da0e3e9d	Revert "af_unix: dont send SCM_CREDENTIAL when dest socket is NULL" This reverts commit `14134f6584`. The problem that the above patch was meant to address is that af_unix messages are not being coallesced because we are sending unnecesarry credentials. Not sending credentials in maybe_add_creds totally breaks unconnected unix domain sockets that wish to send credentails to other sockets. In practice this break some versions of udev because they receive a message and the sending uid is bogus so they drop the message. Reported-by: Sven Joachim <svenjoac@gmx.de> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-05 00:49:03 -04:00
Vlad Yasevich	4543fbefe6	net: count hw_addr syncs so that unsync works properly. A few drivers use dev_uc_sync/unsync to synchronize the address lists from master down to slave/lower devices. In some cases (bond/team) a single address list is synched down to multiple devices. At the time of unsync, we have a leak in these lower devices, because "synced" is treated as a boolean and the address will not be unsynced for anything after the first device/call. Treat "synced" as a count (same as refcount) and allow all unsync calls to work. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-05 00:18:46 -04:00
David S. Miller	4f4ecd5f2a	Merge branch 'master' of git://1984.lsi.us.es/nf Pablo Neira Ayuso says: ==================== The following patchset contains netfilter updates for your net tree, they are: * Fix missing the skb->trace reset in nf_reset, noticed by Gao Feng while using the TRACE target with several net namespaces. * Fix prefix translation in IPv6 NPT if non-multiple of 32 prefixes are used, from Matthias Schiffer. * Fix invalid nfacct objects with empty name, they are now rejected with -EINVAL, spotted by Michael Zintakis, patch from myself. * A couple of fixes for wrong return values in the error path of nfnetlink_queue and nf_conntrack, from Wei Yongjun. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-04 17:41:53 -04:00
David S. Miller	518314ffe4	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into wireless John W. Linville says: ==================== Here are some more fixes intended for the 3.9 stream... Regarding the mac80211 bits, Johannes says: "I had changed the idle handling to simplify it, but broken the sequencing of commands, at least for ath9k-htc, one patch restores the sequence. The other patch fixes a crash Jouni found while stress-testing the remain-on-channel code, when an item is deleted the work struct can run twice and crash the second time." As for the iwlwifi bits, Johannes says: "The only fix here is to the passive-no-RX firmware regulatory enforcement driver support code to not drop auth frames in quick succession, leading to not being able to connect to APs on passive channels in certain circumstances." Don't forget the NFC bits, about which Samuel says: "This time we have: - A crash fix for when a DGRAM LLCP socket is listening while the NFC adapter is physically removed. - A potential double skb free when the LLCP socket receive queue is full. - A fix for properly handling multiple and consecutive LLCP connections, and not trash the socket ack log. - A build failure for the MEI microread physical layer, now that the MEI bus APIs have been merged into char-misc-next." On top of that, Stone Piao provides an mwifiex fix to avoid accessing beyond the end of a buffer. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-04 17:39:06 -04:00
John W. Linville	9306a398e7	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2013-04-03 14:19:48 -04:00
John W. Linville	407ad2b7ef	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-04-03 13:50:34 -04:00
Matthias Schiffer	906b1c394d	netfilter: ip6t_NPT: Fix translation for non-multiple of 32 prefix lengths The bitmask used for the prefix mangling was being calculated incorrectly, leading to the wrong part of the address being replaced when the prefix length wasn't a multiple of 32. Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-04-03 12:24:56 +02:00
Reilly Grant	990454b5a4	VSOCK: Handle changes to the VMCI context ID. The VMCI context ID of a virtual machine may change at any time. There is a VMCI event which signals this but datagrams may be processed before this is handled. It is therefore necessary to be flexible about the destination context ID of any datagrams received. (It can be assumed to be correct because it is provided by the hypervisor.) The context ID on existing sockets should be updated to reflect how the hypervisor is currently referring to the system. Signed-off-by: Reilly Grant <grantr@vmware.com> Acked-by: Andy King <acking@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-02 14:39:17 -04:00
Balakumaran Kannan	25fb6ca4ed	net IPv6 : Fix broken IPv6 routing table after loopback down-up IPv6 Routing table becomes broken once we do ifdown, ifup of the loopback(lo) interface. After down-up, routes of other interface's IPv6 addresses through 'lo' are lost. IPv6 addresses assigned to all interfaces are routed through 'lo' for internal communication. Once 'lo' is down, those routing entries are removed from routing table. But those removed entries are not being re-created properly when 'lo' is brought up. So IPv6 addresses of other interfaces becomes unreachable from the same machine. Also this breaks communication with other machines because of NDISC packet processing failure. This patch fixes this issue by reading all interface's IPv6 addresses and adding them to IPv6 routing table while bringing up 'lo'. ==Testing== Before applying the patch: $ route -A inet6 Kernel IPv6 routing table Destination Next Hop Flag Met Ref Use If 2000::20/128 :: U 256 0 0 eth0 fe80::/64 :: U 256 0 0 eth0 ::/0 :: !n -1 1 1 lo ::1/128 :: Un 0 1 0 lo 2000::20/128 :: Un 0 1 0 lo fe80::xxxx:xxxx:xxxx:xxxx/128 :: Un 0 1 0 lo ff00::/8 :: U 256 0 0 eth0 ::/0 :: !n -1 1 1 lo $ sudo ifdown lo $ sudo ifup lo $ route -A inet6 Kernel IPv6 routing table Destination Next Hop Flag Met Ref Use If 2000::20/128 :: U 256 0 0 eth0 fe80::/64 :: U 256 0 0 eth0 ::/0 :: !n -1 1 1 lo ::1/128 :: Un 0 1 0 lo ff00::/8 :: U 256 0 0 eth0 ::/0 :: !n -1 1 1 lo $ After applying the patch: $ route -A inet6 Kernel IPv6 routing table Destination Next Hop Flag Met Ref Use If 2000::20/128 :: U 256 0 0 eth0 fe80::/64 :: U 256 0 0 eth0 ::/0 :: !n -1 1 1 lo ::1/128 :: Un 0 1 0 lo 2000::20/128 :: Un 0 1 0 lo fe80::xxxx:xxxx:xxxx:xxxx/128 :: Un 0 1 0 lo ff00::/8 :: U 256 0 0 eth0 ::/0 :: !n -1 1 1 lo $ sudo ifdown lo $ sudo ifup lo $ route -A inet6 Kernel IPv6 routing table Destination Next Hop Flag Met Ref Use If 2000::20/128 :: U 256 0 0 eth0 fe80::/64 :: U 256 0 0 eth0 ::/0 :: !n -1 1 1 lo ::1/128 :: Un 0 1 0 lo 2000::20/128 :: Un 0 1 0 lo fe80::xxxx:xxxx:xxxx:xxxx/128 :: Un 0 1 0 lo ff00::/8 :: U 256 0 0 eth0 ::/0 :: !n -1 1 1 lo $ Signed-off-by: Balakumaran Kannan <Balakumaran.Kannan@ap.sony.com> Signed-off-by: Maruthi Thotad <Maruthi.Thotad@ap.sony.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-02 14:37:19 -04:00
Vasily Averin	f0f6ee1f70	cbq: incorrect processing of high limits currently cbq works incorrectly for limits > 10% real link bandwidth, and practically does not work for limits > 50% real link bandwidth. Below are results of experiments taken on 1 Gbit link In shaper \| Actual Result -----------+--------------- 100M \| 108 Mbps 200M \| 244 Mbps 300M \| 412 Mbps 500M \| 893 Mbps This happen because of q->now changes incorrectly in cbq_dequeue(): when it is called before real end of packet transmitting, L2T is greater than real time delay, q_now gets an extra boost but never compensate it. To fix this problem we prevent change of q->now until its synchronization with real time. Signed-off-by: Vasily Averin <vvs@openvz.org> Reviewed-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-04-02 14:29:20 -04:00
John W. Linville	b0bb9b392d	This is the 2nd batch of NFC fixes for 3.9. This time we have: - A crash fix for when a DGRAM LLCP socket is listening while the NFC adapter is physically removed. - A potential double skb free when the LLCP socket receive queue is full. - A fix for properly handling multiple and consecutive LLCP connections, and not trash the socket ack log. - A build failure for the MEI microread physical layer, now that the MEI bus APIs have been merged into char-misc-next. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJRWMDxAAoJEIqAPN1PVmxKnaIP/iWPgCdp7xZGtiT8R/bf0f2I hpWtCnYgwyvqEyUqq6+hJ+A4J6DJH/vXLqO0hJIZ6lPuATdxpba9u4mWaoIuU+V5 4khspaWnTE7we72fVNhpT6Ms2ZZubtFynqbmPuxvu9vmk8JCMkJLfH8InQ4OGI0v mScDCByIq9xZiv+q/R2QO9vh6k3t5sJyAZONf7uuQhoJ+AdlrXNsvHPki+gfo+6T dLhgVGqytEDGKkh0AZLGc5Ss1MJFQsmTh76A4wstcJfLCisBtbhM0sZ8zWn6wmzQ KCq5YawbowuA12RYgb+cwxXV7H167lDYJxILZuK3WqL0nm0H3Z0jkakcV7F+IVNu iyh1hRqjdsgThl7aBF1VEER6Scum4WbEDXY5abCpX9l8Jbtk3+iQJWYpy8ldKEnL KqDiZnm6WC5CrcM1GQo4Uy46UPE0IlHPxupfQ3riuFjaFKyPd1x1ok1xUngdOXjn JYpxfMLVSsRBZ2y6tpDhm0r5oRHGmmJmKTzhMPKP+4+CVG8z0vwUQfydgaDlaFRa F3U7Av70vyCGt+Cr0tAXppvoXng1TSUKPHPTHjH+pEHjfVA9l1mAU5AYQ7JDB9aZ fccfkd5UG0f6ApFMhknI8kkISBN5KseNuuSAoQc4WV9jUVKk9mMifAz/Aj9MVYWx m6M8KzocEudWwM8y/BjE =ABey -----END PGP SIGNATURE----- Merge tag 'nfc-fixes-3.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-fixes Samuel Ortiz <sameo@linux.intel.com> says: "This is the 2nd batch of NFC fixes for 3.9. This time we have: - A crash fix for when a DGRAM LLCP socket is listening while the NFC adapter is physically removed. - A potential double skb free when the LLCP socket receive queue is full. - A fix for properly handling multiple and consecutive LLCP connections, and not trash the socket ack log. - A build failure for the MEI microread physical layer, now that the MEI bus APIs have been merged into char-misc-next." Signed-off-by: John W. Linville <linville@tuxdriver.com>	2013-04-01 15:14:22 -04:00
John W. Linville	95bc6b8461	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2013-04-01 15:09:28 -04:00
Linus Torvalds	ff3421dee6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) sadb_msg prepared for IPSEC userspace forgets to initialize the satype field, fix from Nicolas Dichtel. 2) Fix mac80211 synchronization during station removal, from Johannes Berg. 3) Fix IPSEC sequence number notifications when they wrap, from Steffen Klassert. 4) Fix cfg80211 wdev tracing crashes when add_virtual_intf() returns an error pointer, from Johannes Berg. 5) In mac80211, don't call into the channel context code with the interface list mutex held. From Johannes Berg. 6) In mac80211, if we don't actually associate, do not restart the STA timer, otherwise we can crash. From Ben Greear. 7) Missing dma_mapping_error() check in e1000, ixgb, and e1000e. From Christoph Paasch. 8) Fix sja1000 driver defines to not conflict with SH port, from Marc Kleine-Budde. 9) Don't call il4965_rs_use_green with a NULL station, from Colin Ian King. 10) Suspend/Resume in the FEC driver fail because the buffer descriptors are not initialized at all the moments in which they should. Fix from Frank Li. 11) cpsw and davinci_emac drivers both use the wrong interface to restart a stopped TX queue. Use netif_wake_queue not netif_start_queue, the latter is for initialization/bringup not active management of the queue. From Mugunthan V N. 12) Fix regression in rate calculations done by psched_ratecfg_precompute(), missing u64 type promotion. From Sergey Popovich. 13) Fix length overflow in tg3 VPD parsing, from Kees Cook. 14) AOE driver fails to allocate enough headroom, resulting in crashes. Fix from Eric Dumazet. 15) RX overflow happens too quickly in sky2 driver because pause packet thresholds are not programmed correctly. From Mirko Lindner. 16) Bonding driver manages arp_interval and miimon settings incorrectly, disabling one unintentionally disables both. Fix from Nikolay Aleksandrov. 17) smsc75xx drivers don't program the RX mac properly for jumbo frames. Fix from Steve Glendinning. 18) Fix off-by-one in Codel packet scheduler. From Vijay Subramanian. 19) Fix packet corruption in atl1c by disabling MSI support, from Hannes Frederic Sowa. 20) netdev_rx_handler_unregister() needs a synchronize_net() to fix crashes in bonding driver unload stress tests. From Eric Dumazet. 21) rxlen field of ks8851 RX packet descriptors not interpreted correctly (it is 12 bits not 16 bits, so needs to be masked after shifting the 32-bit value down 16 bits). Fix from Max Nekludov. 22) Fix missed RX/TX enable in sh_eth driver due to mishandling of link change indications. From Sergei Shtylyov. 23) Fix crashes during spurious ECI interrupts in sh_eth driver, also from Sergei Shtylyov. 24) dm9000 driver initialization is done wrong for revision B devices with DSP PHY, from Joseph CHANG. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits) DM9000B: driver initialization upgrade sh_eth: make 'link' field of 'struct sh_eth_private' int sh_eth: workaround for spurious ECI interrupt sh_eth: fix handling of no LINK signal ks8851: Fix interpretation of rxlen field. net: add a synchronize_net() in netdev_rx_handler_unregister() MAINTAINERS: Update netxen_nic maintainers list atl1e: drop pci-msi support because of packet corruption net: fq_codel: Fix off-by-one error net: calxedaxgmac: Wake-on-LAN fixes net: calxedaxgmac: fix rx ring handling when OOM net: core: Remove redundant call to 'nf_reset' in 'dev_forward_skb' smsc75xx: fix jumbo frame support net: fix the use of this_cpu_ptr bonding: fix disabling of arp_interval and miimon ipv6: don't accept node local multicast traffic from the wire sky2: Threshold for Pause Packet is set wrong sky2: Receive Overflows not counted aoe: reserve enough headroom on skbs line up comment for ndo_bridge_getlink ...	2013-04-01 08:06:30 -07:00
Artem Savkov	90e0970f87	cfg80211: sched_scan_mtx lock in cfg80211_conn_work() Introduced in `f9f475292d` ("cfg80211: always check for scan end on P2P device") cfg80211_conn_scan() which requires sched_scan_mtx to be held can be called from cfg80211_conn_work(). Without this we are hitting multiple warnings like the following: WARNING: at net/wireless/sme.c:88 cfg80211_conn_scan+0x1dc/0x3a0 [cfg80211]() Hardware name: 0578A21 Modules linked in: ... Pid: 620, comm: kworker/3:1 Not tainted 3.9.0-rc4-next-20130328+ #326 Call Trace: [<c1036992>] warn_slowpath_common+0x72/0xa0 [<c10369e2>] warn_slowpath_null+0x22/0x30 [<faa4b0ec>] cfg80211_conn_scan+0x1dc/0x3a0 [cfg80211] [<faa4b344>] cfg80211_conn_do_work+0x94/0x380 [cfg80211] [<faa4c3b2>] cfg80211_conn_work+0xa2/0x130 [cfg80211] [<c1051858>] process_one_work+0x198/0x450 Signed-off-by: Artem Savkov <artem.savkov@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-03-30 22:33:19 +01:00
Eric Dumazet	00cfec3748	net: add a synchronize_net() in netdev_rx_handler_unregister() commit `35d48903e9` (bonding: fix rx_handler locking) added a race in bonding driver, reported by Steven Rostedt who did a very good diagnosis : <quoting Steven> I'm currently debugging a crash in an old 3.0-rt kernel that one of our customers is seeing. The bug happens with a stress test that loads and unloads the bonding module in a loop (I don't know all the details as I'm not the one that is directly interacting with the customer). But the bug looks to be something that may still be present and possibly present in mainline too. It will just be much harder to trigger it in mainline. In -rt, interrupts are threads, and can schedule in and out just like any other thread. Note, mainline now supports interrupt threads so this may be easily reproducible in mainline as well. I don't have the ability to tell the customer to try mainline or other kernels, so my hands are somewhat tied to what I can do. But according to a core dump, I tracked down that the eth irq thread crashed in bond_handle_frame() here: slave = bond_slave_get_rcu(skb->dev); bond = slave->bond; <--- BUG the slave returned was NULL and accessing slave->bond caused a NULL pointer dereference. Looking at the code that unregisters the handler: void netdev_rx_handler_unregister(struct net_device *dev) { ASSERT_RTNL(); RCU_INIT_POINTER(dev->rx_handler, NULL); RCU_INIT_POINTER(dev->rx_handler_data, NULL); } Which is basically: dev->rx_handler = NULL; dev->rx_handler_data = NULL; And looking at __netif_receive_skb() we have: rx_handler = rcu_dereference(skb->dev->rx_handler); if (rx_handler) { if (pt_prev) { ret = deliver_skb(skb, pt_prev, orig_dev); pt_prev = NULL; } switch (rx_handler(&skb)) { My question to all of you is, what stops this interrupt from happening while the bonding module is unloading? What happens if the interrupt triggers and we have this: CPU0 CPU1 ---- ---- rx_handler = skb->dev->rx_handler netdev_rx_handler_unregister() { dev->rx_handler = NULL; dev->rx_handler_data = NULL; rx_handler() bond_handle_frame() { slave = skb->dev->rx_handler; bond = slave->bond; <-- NULL pointer dereference!!! What protection am I missing in the bond release handler that would prevent the above from happening? </quoting Steven> We can fix bug this in two ways. First is adding a test in bond_handle_frame() and others to check if rx_handler_data is NULL. A second way is adding a synchronize_net() in netdev_rx_handler_unregister() to make sure that a rcu protected reader has the guarantee to see a non NULL rx_handler_data. The second way is better as it avoids an extra test in fast path. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jiri Pirko <jpirko@redhat.com> Cc: Paul E. McKenney <paulmck@us.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-29 15:38:15 -04:00
Vijay Subramanian	cd68ddd4c2	net: fq_codel: Fix off-by-one error Currently, we hold a max of sch->limit -1 number of packets instead of sch->limit packets. Fix this off-by-one error. Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-29 15:32:23 -04:00
Shmulik Ladkani	a561cf7edf	net: core: Remove redundant call to 'nf_reset' in 'dev_forward_skb' 'nf_reset' is called just prior calling 'netif_rx'. No need to call it twice. Reported-by: Igor Michailov <rgohita@gmail.com> Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-29 15:25:28 -04:00
Li RongQing	50eab0503a	net: fix the use of this_cpu_ptr flush_tasklet is not percpu var, and percpu is percpu var, and this_cpu_ptr(&info->cache->percpu->flush_tasklet) is not equal to &this_cpu_ptr(info->cache->percpu)->flush_tasklet 1f743b076(use this_cpu_ptr per-cpu helper) introduced this bug. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-29 15:13:27 -04:00
Hannes Frederic Sowa	1c4a154e52	ipv6: don't accept node local multicast traffic from the wire Erik Hugne's errata proposal (Errata ID: 3480) to RFC4291 has been verified: http://www.rfc-editor.org/errata_search.php?eid=3480 We have to check for pkt_type and loopback flag because either the packets are allowed to travel over the loopback interface (in which case pkt_type is PACKET_HOST and IFF_LOOPBACK flag is set) or they travel over a non-loopback interface back to us (in which case PACKET_TYPE is PACKET_LOOPBACK and IFF_LOOPBACK flag is not set). Cc: Erik Hugne <erik.hugne@ericsson.com> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-29 14:57:33 -04:00
Linus Torvalds	2c3de1c2d7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull userns fixes from Eric W Biederman: "The bulk of the changes are fixing the worst consequences of the user namespace design oversight in not considering what happens when one namespace starts off as a clone of another namespace, as happens with the mount namespace. The rest of the changes are just plain bug fixes. Many thanks to Andy Lutomirski for pointing out many of these issues." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: userns: Restrict when proc and sysfs can be mounted ipc: Restrict mounting the mqueue filesystem vfs: Carefully propogate mounts across user namespaces vfs: Add a mount flag to lock read only bind mounts userns: Don't allow creation if the user is chrooted yama: Better permission check for ptraceme pid: Handle the exit of a multi-threaded init. scm: Require CAP_SYS_ADMIN over the current pidns to spoof pids.	2013-03-28 13:43:46 -07:00
John W. Linville	630a216da6	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-03-28 14:40:06 -04:00
David S. Miller	0fb031f036	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== 1) Initialize the satype field in key_notify_policy_flush(), this was left uninitialized. From Nicolas Dichtel. 2) The sequence number difference for replay notifications was misscalculated on ESN sequence number wrap. We need a separate replay notify function for esn. 3) Fix an off by one in the esn replay notify function. From Mathias Krause. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-27 14:07:04 -04:00
Sergey Popovich	ea872d7712	sch: add missing u64 in psched_ratecfg_precompute() It seems that commit commit `292f1c7ff6` Author: Jiri Pirko <jiri@resnulli.us> Date: Tue Feb 12 00:12:03 2013 +0000 sch: make htb_rate_cfg and functions around that generic adds little regression. Before: # tc qdisc add dev eth0 root handle 1: htb default ffff # tc class add dev eth0 classid 1:ffff htb rate 5Gbit # tc -s class show dev eth0 class htb 1:ffff root prio 0 rate 5000Mbit ceil 5000Mbit burst 625b cburst 625b Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) rate 0bit 0pps backlog 0b 0p requeues 0 lended: 0 borrowed: 0 giants: 0 tokens: 31 ctokens: 31 After: # tc qdisc add dev eth0 root handle 1: htb default ffff # tc class add dev eth0 classid 1:ffff htb rate 5Gbit # tc -s class show dev eth0 class htb 1:ffff root prio 0 rate 1544Mbit ceil 1544Mbit burst 625b cburst 625b Sent 5073 bytes 41 pkt (dropped 0, overlimits 0 requeues 0) rate 1976bit 2pps backlog 0b 0p requeues 0 lended: 41 borrowed: 0 giants: 0 tokens: 1802 ctokens: 1802 This probably due to lost u64 cast of rate parameter in psched_ratecfg_precompute() (net/sched/sch_generic.c). Signed-off-by: Sergey Popovich <popovich_sergei@mail.ru> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-27 14:06:41 -04:00
Wei Yongjun	fcca143d69	rtnetlink: fix error return code in rtnl_link_fill() Fix to return a negative error code from the error handling case instead of 0(possible overwrite to 0 by ops->fill_xstats call), as returned elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-27 14:06:40 -04:00
Wei Yongjun	5389090b59	netfilter: nf_conntrack: fix error return code Fix to return a negative error code from the error handling case instead of 0, as returned elsewhere in function nf_conntrack_standalone_init(). Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-03-27 18:31:17 +01:00
Hong Zhiguo	d3e1101c9b	openvswitch: correct an invalid BUG_ON table->count is uint32_t Signed-off-by: Hong Zhiguo <honkiko@gmail.com> Signed-off-by: Jesse Gross <jesse@nicira.com>	2013-03-27 09:07:41 -07:00
Jesse Gross	a9341512c3	openvswitch: Preallocate reply skb in ovs_vport_cmd_set(). Allocation of the Netlink notification skb can potentially fail after changing vport configuration. In general, we try to avoid this by undoing any change we made but that is difficult for existing objects. This avoids the problem by preallocating the buffer (which is fixed size). Signed-off-by: Jesse Gross <jesse@nicira.com>	2013-03-27 09:07:40 -07:00
Linus Torvalds	b175293ccc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Always increment IPV4 ID field in encapsulated GSO packets, even when DF is set. Regression fix from Pravin B Shelar. 2) Fix per-net subsystem initialization in netfilter conntrack, otherwise we may access dynamically allocated memory before it is actually allocated. From Gao Feng. 3) Fix DMA buffer lengths in iwl3945 driver, from Stanislaw Gruszka. 4) Fix race between submission of sync vs async commands in mwifiex driver, from Amitkumar Karwar. 5) Add missing cancel of command timer in mwifiex driver, from Bing Zhao. 6) Missing SKB free in rtlwifi USB driver, from Jussi Kivilinna. 7) Thermal layer tries to use a genetlink multicast string that is longer than the 16 character limit. Fix it and add a BUG check to prevent this kind of thing from happening in the future. From Masatake YAMATO. 8) Fix many bugs in the handling of the teardown of L2TP connections, UDP encapsulation instances, and sockets. From Tom Parkin. 9) Missing socket release in IRDA, from Kees Cook. 10) Fix fec driver modular build, from Fabio Estevam. 11) Erroneous use of kfree() instead of free_netdev() in lantiq_etop, from Wei Yongjun. 12) Fix bugs in handling of queue numbers and steering rules in mlx4 driver, from Moshe Lazer, Hadar Hen Zion, and Or Gerlitz. 13) Some FOO_DIAG_MAX constants were defined off by one, fix from Andrey Vagin. 14) TCP segmentation deferral is unintentionally done too strongly, breaking ACK clocking. Fix from Eric Dumazet. 15) net_enable_timestamp() can legitimately be invoked from software interrupts, and in a way that is safe, so remove the WARN_ON(). Also from Eric Dumazet. 16) Fix use after free in VLANs, from Cong Wang. 17) Fix TCP slow start retransmit storms after SACK reneging, from Yuchung Cheng. 18) Unix socket release should mark a socket dead before NULL'ing out sock->sk, otherwise we can race. Fix from Paul Moore. 19) IPV6 addrconf code can try to free static memory, from Hong Zhiguo. 20) Fix register mis-programming, NULL pointer derefs, and wrong PHC clock frequency in IGB driver. From Lior LevyAlex Williamson, Jiri Benc, and Jeff Kirsher. 21) skb->ip_summed logic in pch_gbe driver is reversed, breaking packet forwarding. Fix from Veaceslav Falico. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits) ipv4: Fix ip-header identification for gso packets. bonding: remove already created master sysfs link on failure af_unix: dont send SCM_CREDENTIAL when dest socket is NULL pch_gbe: fix ip_summed checksum reporting on rx igb: fix PHC stopping on max freq igb: make sensor info static igb: SR-IOV init reordering igb: Fix null pointer dereference igb: fix i350 anti spoofing config ixgbevf: don't release the soft entries ipv6: fix bad free of addrconf_init_net unix: fix a race condition in unix_release() tcp: undo spurious timeout after SACK reneging bnx2x: fix assignment of signed expression to unsigned variable bridge: fix crash when set mac address of br interface 8021q: fix a potential use-after-free net: remove a WARN_ON() in net_enable_timestamp() tcp: preserve ACK clocking in TSO net: fix *_DIAG_MAX constants net/mlx4_core: Disallow releasing VF QPs which have steering rules ...	2013-03-26 14:24:29 -07:00
Linus Torvalds	5d538483ea	NFS client bugfixes for Linux 3.9 - Fix an NFSv4 idmapper regression - Fix an Oops in the pNFS blocks client - Fix up various issues with pNFS layoutcommit - Ensure correct read ordering of variables in rpc_wake_up_task_queue_locked -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQIcBAABAgAGBQJRUedyAAoJEGcL54qWCgDyar0P/2pTT/yxX8ejTu5DmY7e4PYJ jhPG2AEqY/yMLn9GvB375VIs1L8tuY50+3NFhWZFjyNbEU3GV+5Y+kPpBtAgYiSI VyIXiJ/xMtXdYJMYuE/nh5jbcqJsHwGjpcIaSd5BuWzQUaoUYvLulxWd4QN8mmaT 5SuzmgV+7WIqV6RjlaYF82srcOKAjwemcrfRkCNzzJr6aT39gH2YdYFbDaTr7qhU fw0x3QlI7887vSNQcfaGbC1+jr6oe8wRCneOR0tceU/8bcj6zlUDk5HxqSOc28mA jUQieoVRggcM4s5DFpNcuwW6qCPZOmzv/OFD6oqnhyyonPOrue+7zaoujZmGNmjx dT2V/jQehanYD25WpDO8OyFXUeYE4x9bgHKsszhBTwr4x5D8ceEJ1sugcOPiTTxu tflbbuWbt+BguvXp4p8QayUj0V2cplM/nOovWyUG+BH46sz3Dtv46NOgJeO2a29g T6jayxmKCxvtPKtG0j34BzLngiKabZTSEhFms6Qarp9lwWvHWrR9KWGuDBNvy1Ts GMBN8P6Ib40yVi6Pwlj5Jpy6yLKVklHtJQpactr63AZmYrF4bBBSom+MWAh3X1iO QtF0x9Z1bBkXY2Q/u+3vWMxQtEPeW+pSiloj8aiceFAt33zKM+1bLofDhEw0s2fI wJEHYsGyGtDQINgP0v1e =OPbZ -----END PGP SIGNATURE----- Merge tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client bugfixes from Trond Myklebust: - Fix an NFSv4 idmapper regression - Fix an Oops in the pNFS blocks client - Fix up various issues with pNFS layoutcommit - Ensure correct read ordering of variables in rpc_wake_up_task_queue_locked * tag 'nfs-for-3.9-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked NFSv4.1: Add a helper pnfs_commit_and_return_layout NFSv4.1: Always clear the NFS_INO_LAYOUTCOMMIT in layoutreturn NFSv4.1: Fix a race in pNFS layoutcommit pnfs-block: removing DM device maybe cause oops when call dev_remove NFSv4: Fix the string length returned by the idmapper	2013-03-26 14:23:45 -07:00
Pravin B Shelar	330305cc4a	ipv4: Fix ip-header identification for gso packets. ip-header id needs to be incremented even if IP_DF flag is set. This behaviour was changed in commit `490ab08127` (IP_GRE: Fix IP-Identification). Following patch fixes it so that identification is always incremented. Reported-by: Cong Wang <amwang@redhat.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-26 13:50:05 -04:00
dingtianhong	14134f6584	af_unix: dont send SCM_CREDENTIAL when dest socket is NULL SCM_SCREDENTIALS should apply to write() syscalls only either source or destination socket asserted SOCK_PASSCRED. The original implememtation in maybe_add_creds is wrong, and breaks several LSB testcases ( i.e. /tset/LSB.os/netowkr/recvfrom/T.recvfrom). Origionally-authored-by: Karel Srot <ksrot@redhat.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-26 12:33:55 -04:00
Samuel Ortiz	39a352a5b5	NFC: llcp: Keep the connected socket parent pointer alive And avoid decreasing the ack log twice when dequeueing connected LLCP sockets. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-03-26 14:35:57 +01:00
John W. Linville	fae172136c	Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211	2013-03-25 14:50:17 -04:00
Hong Zhiguo	a79ca223e0	ipv6: fix bad free of addrconf_init_net Signed-off-by: Hong Zhiguo <honkiko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-25 13:55:38 -04:00
Paul Moore	ded34e0fe8	unix: fix a race condition in unix_release() As reported by Jan, and others over the past few years, there is a race condition caused by unix_release setting the sock->sk pointer to NULL before properly marking the socket as dead/orphaned. This can cause a problem with the LSM hook security_unix_may_send() if there is another socket attempting to write to this partially released socket in between when sock->sk is set to NULL and it is marked as dead/orphaned. This patch fixes this by only setting sock->sk to NULL after the socket has been marked as dead; I also take the opportunity to make unix_release_sock() a void function as it only ever returned 0/success. Dave, I think this one should go on the -stable pile. Special thanks to Jan for coming up with a reproducer for this problem. Reported-by: Jan Stancek <jan.stancek@gmail.com> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-25 13:11:48 -04:00
Johannes Berg	382a103b2b	mac80211: fix idle handling sequence Corey Richardson reported that my idle handling cleanup (commit `fd0f979a1b`, "mac80211: simplify idle handling") broke ath9k_htc. The reason appears to be that it wants to go out of idle before switching channels. To fix it, reimplement that sequence. Reported-by: Corey Richardson <corey@octayn.net> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-03-25 16:50:18 +01:00
Trond Myklebust	1166fde6a9	SUNRPC: Add barriers to ensure read ordering in rpc_wake_up_task_queue_locked We need to be careful when testing task->tk_waitqueue in rpc_wake_up_task_queue_locked, because it can be changed while we are holding the queue->lock. By adding appropriate memory barriers, we can ensure that it is safe to test task->tk_waitqueue for equality if the RPC_TASK_QUEUED bit is set. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org	2013-03-25 11:23:40 -04:00
Pablo Neira Ayuso	deadcfc332	netfilter: nfnetlink_acct: return -EINVAL if object name is empty If user-space tries to create accounting object with an empty name, then return -EINVAL. Reported-by: Michael Zintakis <michael.zintakis@googlemail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-03-25 14:21:30 +01:00
Wei Yongjun	558724a5b2	netfilter: nfnetlink_queue: fix error return code in nfnetlink_queue_init() Fix to return a negative error code from the error handling case instead of 0, as returned elsewhere in this function. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-03-25 14:21:27 +01:00
Johannes Berg	3fbd45ca8d	mac80211: fix remain-on-channel cancel crash If a ROC item is canceled just as it expires, the work struct may be scheduled while it is running (and waiting for the mutex). This results in it being run after being freed, which obviously crashes. To fix this don't free it when aborting is requested but instead mark it as "to be freed", which makes the work a no-op and allows freeing it outside. Cc: stable@vger.kernel.org [3.6+] Reported-by: Jouni Malinen <j@w1.fi> Tested-by: Jouni Malinen <j@w1.fi> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-03-25 13:50:33 +01:00
Mathias Krause	799ef90c55	xfrm: Fix esn sequence number diff calculation in xfrm_replay_notify_esn() Commit `0017c0b` "xfrm: Fix replay notification for esn." is off by one for the sequence number wrapped case as UINT_MAX is 0xffffffff, not 0x100000000. ;) Just calculate the diff like done everywhere else in the file. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-03-25 07:25:50 +01:00
Yuchung Cheng	7ebe183c6d	tcp: undo spurious timeout after SACK reneging On SACK reneging the sender immediately retransmits and forces a timeout but disables Eifel (undo). If the (buggy) receiver does not drop any packet this can trigger a false slow-start retransmit storm driven by the ACKs of the original packets. This can be detected with undo and TCP timestamps. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-24 17:27:28 -04:00
Hong zhi guo	9b46922e15	bridge: fix crash when set mac address of br interface When I tried to set mac address of a bridge interface to a mac address which already learned on this bridge, I got system hang. The cause is straight forward: function br_fdb_change_mac_address calls fdb_insert with NULL source nbp. Then an fdb lookup is performed. If an fdb entry is found and it's local, it's OK. But if it's not local, source is dereferenced for printk without NULL check. Signed-off-by: Hong Zhiguo <honkiko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-24 17:27:28 -04:00
Cong Wang	4a7df340ed	8021q: fix a potential use-after-free vlan_vid_del() could possibly free ->vlan_info after a RCU grace period, however, we may still refer to the freed memory area by 'grp' pointer. Found by code inspection. This patch moves vlan_vid_del() as behind as possible. Cc: Patrick McHardy <kaber@trash.net> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-24 17:27:28 -04:00
Eric Dumazet	9979a55a83	net: remove a WARN_ON() in net_enable_timestamp() The WARN_ON(in_interrupt()) in net_enable_timestamp() can get false positive, in socket clone path, run from softirq context : [ 3641.624425] WARNING: at net/core/dev.c:1532 net_enable_timestamp+0x7b/0x80() [ 3641.668811] Call Trace: [ 3641.671254] <IRQ> [<ffffffff80286817>] warn_slowpath_common+0x87/0xc0 [ 3641.677871] [<ffffffff8028686a>] warn_slowpath_null+0x1a/0x20 [ 3641.683683] [<ffffffff80742f8b>] net_enable_timestamp+0x7b/0x80 [ 3641.689668] [<ffffffff80732ce5>] sk_clone_lock+0x425/0x450 [ 3641.695222] [<ffffffff8078db36>] inet_csk_clone_lock+0x16/0x170 [ 3641.701213] [<ffffffff807ae449>] tcp_create_openreq_child+0x29/0x820 [ 3641.707663] [<ffffffff807d62e2>] ? ipt_do_table+0x222/0x670 [ 3641.713354] [<ffffffff807aaf5b>] tcp_v4_syn_recv_sock+0xab/0x3d0 [ 3641.719425] [<ffffffff807af63a>] tcp_check_req+0x3da/0x530 [ 3641.724979] [<ffffffff8078b400>] ? inet_hashinfo_init+0x60/0x80 [ 3641.730964] [<ffffffff807ade6f>] ? tcp_v4_rcv+0x79f/0xbe0 [ 3641.736430] [<ffffffff807ab9bd>] tcp_v4_do_rcv+0x38d/0x4f0 [ 3641.741985] [<ffffffff807ae14a>] tcp_v4_rcv+0xa7a/0xbe0 Its safe at this point because the parent socket owns a reference on the netstamp_needed, so we cant have a 0 -> 1 transition, which requires to lock a mutex. Instead of refining the check, lets remove it, as all known callers are safe. If it ever changes in the future, static_key_slow_inc() will complain anyway. Reported-by: Laurent Chavey <chavey@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-24 17:27:27 -04:00
Ben Greear	370bd00593	mac80211: Don't restart sta-timer if not associated. I found another crash when deleting lots of virtual stations in a congested environment. I think the problem is that the ieee80211_mlme_notify_scan_completed could call ieee80211_restart_sta_timer for a stopped interface that was about to be deleted. With the following patch I am unable to reproduce the crash. Signed-off-by: Ben Greear <greearb@candelatech.com> [move check, also make the same change in mesh] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-03-24 11:15:59 +01:00
Johannes Berg	f9f475292d	cfg80211: always check for scan end on P2P device If a P2P device wdev is removed while it has a scan, then the scan completion might crash later as it is already freed by that time. To avoid the crash always check the scan completion when the P2P device is being removed for some reason. If the driver already canceled it, don't want and free it, otherwise warn and leak it to avoid later crashes. In order to do this, locking needs to be changed away from the rdev mutex (which can't always be guaranteed). For now, use the sched_scan_mtx instead, I'll rename it to just scan_mtx in a later patch. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-03-24 11:15:58 +01:00
Eric Dumazet	f4541d60a4	tcp: preserve ACK clocking in TSO A long standing problem with TSO is the fact that tcp_tso_should_defer() rearms the deferred timer, while it should not. Current code leads to following bad bursty behavior : 20:11:24.484333 IP A > B: . 297161:316921(19760) ack 1 win 119 20:11:24.484337 IP B > A: . ack 263721 win 1117 20:11:24.485086 IP B > A: . ack 265241 win 1117 20:11:24.485925 IP B > A: . ack 266761 win 1117 20:11:24.486759 IP B > A: . ack 268281 win 1117 20:11:24.487594 IP B > A: . ack 269801 win 1117 20:11:24.488430 IP B > A: . ack 271321 win 1117 20:11:24.489267 IP B > A: . ack 272841 win 1117 20:11:24.490104 IP B > A: . ack 274361 win 1117 20:11:24.490939 IP B > A: . ack 275881 win 1117 20:11:24.491775 IP B > A: . ack 277401 win 1117 20:11:24.491784 IP A > B: . 316921:332881(15960) ack 1 win 119 20:11:24.492620 IP B > A: . ack 278921 win 1117 20:11:24.493448 IP B > A: . ack 280441 win 1117 20:11:24.494286 IP B > A: . ack 281961 win 1117 20:11:24.495122 IP B > A: . ack 283481 win 1117 20:11:24.495958 IP B > A: . ack 285001 win 1117 20:11:24.496791 IP B > A: . ack 286521 win 1117 20:11:24.497628 IP B > A: . ack 288041 win 1117 20:11:24.498459 IP B > A: . ack 289561 win 1117 20:11:24.499296 IP B > A: . ack 291081 win 1117 20:11:24.500133 IP B > A: . ack 292601 win 1117 20:11:24.500970 IP B > A: . ack 294121 win 1117 20:11:24.501388 IP B > A: . ack 295641 win 1117 20:11:24.501398 IP A > B: . 332881:351881(19000) ack 1 win 119 While the expected behavior is more like : 20:19:49.259620 IP A > B: . 197601:202161(4560) ack 1 win 119 20:19:49.260446 IP B > A: . ack 154281 win 1212 20:19:49.261282 IP B > A: . ack 155801 win 1212 20:19:49.262125 IP B > A: . ack 157321 win 1212 20:19:49.262136 IP A > B: . 202161:206721(4560) ack 1 win 119 20:19:49.262958 IP B > A: . ack 158841 win 1212 20:19:49.263795 IP B > A: . ack 160361 win 1212 20:19:49.264628 IP B > A: . ack 161881 win 1212 20:19:49.264637 IP A > B: . 206721:211281(4560) ack 1 win 119 20:19:49.265465 IP B > A: . ack 163401 win 1212 20:19:49.265886 IP B > A: . ack 164921 win 1212 20:19:49.266722 IP B > A: . ack 166441 win 1212 20:19:49.266732 IP A > B: . 211281:215841(4560) ack 1 win 119 20:19:49.267559 IP B > A: . ack 167961 win 1212 20:19:49.268394 IP B > A: . ack 169481 win 1212 20:19:49.269232 IP B > A: . ack 171001 win 1212 20:19:49.269241 IP A > B: . 215841:221161(5320) ack 1 win 119 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Van Jacobson <vanj@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-22 10:34:03 -04:00
Johannes Berg	8b305780ed	mac80211: fix virtual monitor interface locking The virtual monitor interface has a locking issue, it calls into the channel context code with the iflist mutex held which isn't allowed since it is usually acquired the other way around. The mutex is still required for the interface iteration, but need not be held across the channel calls. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-03-20 22:26:35 +01:00
Johannes Berg	ce1eadda6b	cfg80211: fix wdev tracing crash Arend reported a crash in tracing if the driver returns an ERR_PTR() value from the add_virtual_intf() callback. This is due to the tracing then still attempting to dereference the "pointer", fix this by using IS_ERR_OR_NULL(). Reported-by: Arend van Spriel <arend@broadcom.com> Tested-by: Arend van Spriel <arend@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-03-20 22:21:31 +01:00
John W. Linville	b9d5319041	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-03-20 14:26:37 -04:00
Kees Cook	896ee0eee6	net/irda: add missing error path release_sock call This makes sure that release_sock is called for all error conditions in irda_getsockopt. Signed-off-by: Kees Cook <keescook@chromium.org> Reported-by: Brad Spengler <spender@grsecurity.net> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:23:13 -04:00
Martin Fuzzey	283951f95b	ipconfig: Fix newline handling in log message. When using ipconfig the logs currently look like: Single name server: [ 3.467270] IP-Config: Complete: [ 3.470613] device=eth0, hwaddr=ac🇩🇪48:00:00:01, ipaddr=172.16.42.2, mask=255.255.255.0, gw=172.16.42.1 [ 3.480670] host=infigo-1, domain=, nis-domain=(none) [ 3.486166] bootserver=172.16.42.1, rootserver=172.16.42.1, rootpath= [ 3.492910] nameserver0=172.16.42.1[ 3.496853] ALSA device list: Three name servers: [ 3.496949] IP-Config: Complete: [ 3.500293] device=eth0, hwaddr=ac🇩🇪48:00:00:01, ipaddr=172.16.42.2, mask=255.255.255.0, gw=172.16.42.1 [ 3.510367] host=infigo-1, domain=, nis-domain=(none) [ 3.515864] bootserver=172.16.42.1, rootserver=172.16.42.1, rootpath= [ 3.522635] nameserver0=172.16.42.1, nameserver1=172.16.42.100 [ 3.529149] , nameserver2=172.16.42.200 Fix newline handling for these cases Signed-off-by: Martin Fuzzey <mfuzzey@parkeon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:15:58 -04:00
Daniel Borkmann	8ed781668d	flow_keys: include thoff into flow_keys for later usage In skb_flow_dissect(), we perform a dissection of a skbuff. Since we're doing the work here anyway, also store thoff for a later usage, e.g. in the BPF filter. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:14:36 -04:00
Tom Parkin	f6e16b299b	l2tp: unhash l2tp sessions on delete, not on free If we postpone unhashing of l2tp sessions until the structure is freed, we risk: 1. further packets arriving and getting queued while the pseudowire is being closed down 2. the recv path hitting "scheduling while atomic" errors in the case that recv drops the last reference to a session and calls l2tp_session_free while in atomic context As such, l2tp sessions should be unhashed from l2tp_core data structures early in the teardown process prior to calling pseudowire close. For pseudowires like l2tp_ppp which have multiple shutdown codepaths, provide an unhash hook. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:10:39 -04:00
Tom Parkin	7b7c0719cd	l2tp: avoid deadlock in l2tp stats update l2tp's u64_stats writers were incorrectly synchronised, making it possible to deadlock a 64bit machine running a 32bit kernel simply by sending the l2tp code netlink commands while passing data through l2tp sessions. Previous discussion on netdev determined that alternative solutions such as spinlock writer synchronisation or per-cpu data would bring unjustified overhead, given that most users interested in high volume traffic will likely be running 64bit kernels on 64bit hardware. As such, this patch replaces l2tp's use of u64_stats with atomic_long_t, thereby avoiding the deadlock. Ref: http://marc.info/?l=linux-netdev&m=134029167910731&w=2 http://marc.info/?l=linux-netdev&m=134079868111131&w=2 Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:10:39 -04:00
Tom Parkin	cf2f5c886a	l2tp: push all ppp pseudowire shutdown through .release handler If userspace deletes a ppp pseudowire using the netlink API, either by directly deleting the session or by deleting the tunnel that contains the session, we need to tear down the corresponding pppox channel. Rather than trying to manage two pppox unbind codepaths, switch the netlink and l2tp_core session_close handlers to close via. the l2tp_ppp socket .release handler. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:10:39 -04:00
Tom Parkin	4c6e2fd354	l2tp: purge session reorder queue on delete Add calls to l2tp_session_queue_purge as a part of l2tp_tunnel_closeall and l2tp_session_delete. Pseudowire implementations which are deleted only via. l2tp_core l2tp_session_delete calls can dispense with their own code for flushing the reorder queue. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:10:39 -04:00
Tom Parkin	48f72f92b3	l2tp: add session reorder queue purge function to core If an l2tp session is deleted, it is necessary to delete skbs in-flight on the session's reorder queue before taking it down. Rather than having each pseudowire implementation reaching into the l2tp_session struct to handle this itself, provide a function in l2tp_core to purge the session queue. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:10:39 -04:00
Tom Parkin	02d13ed5f9	l2tp: don't BUG_ON sk_socket being NULL It is valid for an existing struct sock object to have a NULL sk_socket pointer, so don't BUG_ON in l2tp_tunnel_del_work if that should occur. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:10:39 -04:00
Tom Parkin	8abbbe8ff5	l2tp: take a reference for kernel sockets in l2tp_tunnel_sock_lookup When looking up the tunnel socket in struct l2tp_tunnel, hold a reference whether the socket was created by the kernel or by userspace. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:10:38 -04:00
Tom Parkin	2b551c6e7d	l2tp: close sessions before initiating tunnel delete When a user deletes a tunnel using netlink, all the sessions in the tunnel should also be deleted. Since running sessions will pin the tunnel socket with the references they hold, have the l2tp_tunnel_delete close all sessions in a tunnel before finally closing the tunnel socket. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:10:38 -04:00
Tom Parkin	936063175a	l2tp: close sessions in ip socket destroy callback l2tp_core hooks UDP's .destroy handler to gain advance warning of a tunnel socket being closed from userspace. We need to do the same thing for IP-encapsulation sockets. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:10:38 -04:00
Tom Parkin	e34f4c7050	l2tp: export l2tp_tunnel_closeall l2tp_core internally uses l2tp_tunnel_closeall to close all sessions in a tunnel when a UDP-encapsulation socket is destroyed. We need to do something similar for IP-encapsulation sockets. Export l2tp_tunnel_closeall as a GPL symbol to enable l2tp_ip and l2tp_ip6 to call it from their .destroy handlers. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:10:38 -04:00
Tom Parkin	9980d001ce	l2tp: add udp encap socket destroy handler L2TP sessions hold a reference to the tunnel socket to prevent it going away while sessions are still active. However, since tunnel destruction is handled by the sock sk_destruct callback there is a catch-22: a tunnel with sessions cannot be deleted since each session holds a reference to the tunnel socket. If userspace closes a managed tunnel socket, or dies, the tunnel will persist and it will be neccessary to individually delete the sessions using netlink commands. This is ugly. To prevent this occuring, this patch leverages the udp encapsulation socket destroy callback to gain early notification when the tunnel socket is closed. This allows us to safely close the sessions running in the tunnel, dropping the tunnel socket references in the process. The tunnel socket is then destroyed as normal, and the tunnel resources deallocated in sk_destruct. While we're at it, ensure that l2tp_tunnel_closeall correctly drops session references to allow the sessions to be deleted rather than leaking. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:10:38 -04:00
Tom Parkin	44046a593e	udp: add encap_destroy callback Users of udp encapsulation currently have an encap_rcv callback which they can use to hook into the udp receive path. In situations where a encapsulation user allocates resources associated with a udp encap socket, it may be convenient to be able to also hook the proto .destroy operation. For example, if an encap user holds a reference to the udp socket, the destroy hook might be used to relinquish this reference. This patch adds a socket destroy hook into udp, which is set and enabled in the same way as the existing encap_rcv hook. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:10:38 -04:00
Masatake YAMATO	f1e79e2080	genetlink: trigger BUG_ON if a group name is too long Trigger BUG_ON if a group name is longer than GENL_NAMSIZ. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 12:05:51 -04:00
Thierry Escande	b315515544	NFC: llcp: Remove possible double call to kfree_skb kfree_skb was called twice when the socket receive queue is full Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-03-20 16:46:40 +01:00
David S. Miller	90b2621fd4	Merge branch 'master' of git://1984.lsi.us.es/nf Pablo Neira Ayuso says: ==================== The following patchset contains 7 Netfilter/IPVS fixes for 3.9-rc, they are: * Restrict IPv6 stateless NPT targets to the mangle table. Many users are complaining that this target does not work in the nat table, which is the wrong table for it, from Florian Westphal. * Fix possible use before initialization in the netns init path of several conntrack protocol trackers (introduced recently while improving conntrack netns support), from Gao Feng. * Fix incorrect initialization of copy_range in nfnetlink_queue, spotted by Eric Dumazet during the NFWS2013, patch from myself. * Fix wrong calculation of next SCTP chunk in IPVS, from Julian Anastasov. * Remove rcu_read_lock section in IPVS while calling ipv4_update_pmtu not required anymore after change introduced in 3.7, again from Julian. * Fix SYN looping in IPVS state sync if the backup is used a real server in DR/TUN modes, this required a new /proc entry to disable the director function when acting as backup, also from Julian. * Remove leftover IP_NF_QUEUE Kconfig after ip_queue removal, noted by Paul Bolle. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-20 10:23:52 -04:00
Steffen Klassert	0017c0b575	xfrm: Fix replay notification for esn. We may miscalculate the sequence number difference from the last time we send a notification if a sequence number wrap occured in the meantime. We fix this by adding a separate replay notify function for esn. Here we take the high bits of the sequence number into account to calculate the difference. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2013-03-20 11:57:52 +01:00
Samuel Ortiz	bec964ed3b	NFC: llcp: Detach socket from process context only when releasing the socket Calling sock_orphan when e.g. the NFC adapter is removed can lead to kernel crashes when e.g. a connection less client is sleeping on the Rx workqueue, waiting for data to show up. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-03-20 11:30:37 +01:00
Paul Bolle	3dd6664fac	netfilter: remove unused "config IP_NF_QUEUE" Kconfig symbol IP_NF_QUEUE is unused since commit `d16cf20e2f` ("netfilter: remove ip_queue support"). Let's remove it too. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-03-20 00:11:43 +01:00
Linus Torvalds	7b1b3fd74e	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix ARM BPF JIT handling of negative 'k' values, from Chen Gang. 2) Insufficient space reserved for bridge netlink values, fix from Stephen Hemminger. 3) Some dst_neigh_lookup() callers don't interpret error pointer correctly, fix from Zhouyi Zhou. 4) Fix transport match in SCTP active_path loops, from Xugeng Zhang. 5) Fix qeth driver handling of multi-order SKB frags, from Frank Blaschka. 6) fec driver is missing napi_disable() call, resulting in crashes on unload, from Georg Hofmann. 7) Don't try to handle PMTU events on a listening socket, fix from Eric Dumazet. 8) Fix timestamp location calculations in IP option processing, from David Ward. 9) FIB_TABLE_HASHSZ setting is not controlled by the correct kconfig tests, from Denis V Lunev. 10) Fix TX descriptor push handling in SFC driver, from Ben Hutchings. 11) Fix isdn/hisax and tulip/de4x5 kconfig dependencies, from Arnd Bergmann. 12) bnx2x statistics don't handle 4GB rollover correctly, fix from Maciej Żenczykowski. 13) Openvswitch bug fixes for vport del/new error reporting, missing genlmsg_end() call in netlink processing, and mis-parsing of LLC/SNAP ethernet types. From Rich Lane. 14) SKB pfmemalloc state should only be propagated from the head page of a compound page, fix from Pavel Emelyanov. 15) Fix link handling in tg3 driver for 5715 chips when autonegotation is disabled. From Nithin Sujir. 16) Fix inverted test of cpdma_check_free_tx_desc return value in davinci_emac driver, from Mugunthan V N. 17) vlan_depth is incorrectly calculated in skb_network_protocol(), from Li RongQing. 18) Fix probing of Gobi 1K devices in qmi_wwan driver, and fix NCM device mode backwards compat in cdc_ncm driver. From Bjørn Mork. git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits) inet: limit length of fragment queue hash table bucket lists qeth: Fix scatter-gather regression qeth: Fix invalid router settings handling qeth: delay feature trace tcp: dont handle MTU reduction on LISTEN socket bnx2x: fix occasional statistics off-by-4GB error vhost/net: fix heads usage of ubuf_info bridge: Add support for setting BR_ROOT_BLOCK flag. bnx2x: add missing napi deletion in error path drivers: net: ethernet: ti: davinci_emac: fix usage of cpdma_check_free_tx_desc() ethernet/tulip: DE4x5 needs VIRT_TO_BUS isdn: hisax: netjet requires VIRT_TO_BUS net: cdc_ncm, cdc_mbim: allow user to prefer NCM for backwards compatibility rtnetlink: Mask the rta_type when range checking Revert "ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally" Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug smsc75xx: configuration help incorrectly mentions smsc95xx net: fec: fix missing napi_disable call net: fec: restart the FEC when PHY speed changes skb: Propagate pfmemalloc on skb from head page only ...	2013-03-19 13:20:51 -07:00
Hannes Frederic Sowa	5a3da1fe95	inet: limit length of fragment queue hash table bucket lists This patch introduces a constant limit of the fragment queue hash table bucket list lengths. Currently the limit 128 is choosen somewhat arbitrary and just ensures that we can fill up the fragment cache with empty packets up to the default ip_frag_high_thresh limits. It should just protect from list iteration eating considerable amounts of cpu. If we reach the maximum length in one hash bucket a warning is printed. This is implemented on the caller side of inet_frag_find to distinguish between the different users of inet_fragment.c. I dropped the out of memory warning in the ipv4 fragment lookup path, because we already get a warning by the slab allocator. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jesper Dangaard Brouer <jbrouer@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-19 10:28:36 -04:00
Julian Anastasov	bf93ad72cd	ipvs: remove extra rcu lock In 3.7 we added code that uses ipv4_update_pmtu but after commit `c5ae7d4192` (ipv4: must use rcu protection while calling fib_lookup) the RCU lock is not needed. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-03-19 21:21:52 +09:00
Julian Anastasov	0c12582fbc	ipvs: add backup_only flag to avoid loops Dmitry Akindinov is reporting for a problem where SYNs are looping between the master and backup server when the backup server is used as real server in DR mode and has IPVS rules to function as director. Even when the backup function is enabled we continue to forward traffic and schedule new connections when the current master is using the backup server as real server. While this is not a problem for NAT, for DR and TUN method the backup server can not determine if a request comes from client or from director. To avoid such loops add new sysctl flag backup_only. It can be needed for DR/TUN setups that do not need backup and director function at the same time. When the backup function is enabled we stop any forwarding and pass the traffic to the local stack (real server mode). The flag disables the director function when the backup function is enabled. For setups that enable backup function for some virtual services and director function for other virtual services there should be another more complex solution to support DR/TUN mode, may be to assign per-virtual service syncid value, so that we can differentiate the requests. Reported-by: Dmitry Akindinov <dimak@stalker.com> Tested-by: German Myzovsky <lawyer@sipnet.ru> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-03-19 21:21:51 +09:00
Julian Anastasov	cf2e39429c	ipvs: fix sctp chunk length order Fix wrong but non-fatal access to chunk length. sch->length should be in network order, next chunk should be aligned to 4 bytes. Problem noticed in sparse output. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>	2013-03-19 09:37:27 +09:00
John W. Linville	8fa48cbdfb	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth	2013-03-18 15:17:11 -04:00
Eric Dumazet	0d4f060861	tcp: dont handle MTU reduction on LISTEN socket When an ICMP ICMP_FRAG_NEEDED (or ICMPV6_PKT_TOOBIG) message finds a LISTEN socket, and this socket is currently owned by the user, we set TCP_MTU_REDUCED_DEFERRED flag in listener tsq_flags. This is bad because if we clone the parent before it had a chance to clear the flag, the child inherits the tsq_flags value, and next tcp_release_cb() on the child will decrement sk_refcnt. Result is that we might free a live TCP socket, as reported by Dormando. IPv4: Attempt to release TCP socket in state 1 Fix this issue by testing sk_state against TCP_LISTEN early, so that we set TCP_MTU_REDUCED_DEFERRED on appropriate sockets (not a LISTEN one) This bug was introduced in commit `563d34d057` (tcp: dont drop MTU reduction indications) Reported-by: dormando <dormando@rydia.net> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-18 13:31:28 -04:00
Eric W. Biederman	92f28d973c	scm: Require CAP_SYS_ADMIN over the current pidns to spoof pids. Don't allow spoofing pids over unix domain sockets in the corner cases where a user has created a user namespace but has not yet created a pid namespace. Cc: stable@vger.kernel.org Reported-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-03-17 17:16:16 -07:00
Vlad Yasevich	3d84fa98ac	bridge: Add support for setting BR_ROOT_BLOCK flag. Most of the support was already there. The only thing that was missing was the call to set the flag. Add this call. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-17 12:41:29 -04:00
David S. Miller	c62dce6126	Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== On the NFC bits, Samuel says: "With this one we have: - A fix for properly decreasing socket ack log. - A timer and works cleanup upon NFC device removal. - A monitoroing socket cleanup round from llcp_socket_release. - A proper error report to pending sockets upon NFC device removal." Regarding the Bluetooth bits, Gustavo says: "I have these two patches for 3.9, these add support for two more devices to the bluetooth drivers." Along with those, we have a few wireless driver fixes... Bing Zhao provides an mwifiex to prevent an out-of-bounds memory access. John Crispin offers a Kconfig fix to enable some otherwise dead code in rt2x00. The correct symbols were added in -rc1 through a different tree, but the symbols for enabling the wireless driver didn't match. Larry Finger brings an rtlwifi fix for a scheduling while atomic bug, and another fix for a reassociation problem caused by failing to clear the BSSID after a disconnect. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-17 12:26:18 -04:00
Vlad Yasevich	a5b8db9144	rtnetlink: Mask the rta_type when range checking Range/validity checks on rta_type in rtnetlink_rcv_msg() do not account for flags that may be set. This causes the function to return -EINVAL when flags are set on the type (for example NLA_F_NESTED). Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-17 11:43:40 -04:00
Timo Teräs	8c6216d7f1	Revert "ip_gre: make ipgre_tunnel_xmit() not parse network header as IP unconditionally" This reverts commit `412ed94744`. The commit is wrong as tiph points to the outer IPv4 header which is installed at ipgre_header() and not the inner one which is protocol dependant. This commit broke succesfully opennhrp which use PF_PACKET socket with ETH_P_NHRP protocol. Additionally ssl_addr is set to the link-layer IPv4 address. This address is written by ipgre_header() to the skb earlier, and this is the IPv4 header tiph should point to - regardless of the inner protocol payload. Signed-off-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-16 23:00:41 -04:00
John W. Linville	f3a3440063	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-03-15 10:44:36 -04:00
David S. Miller	296b60109e	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch Jesse Gross says: ==================== A few different bug fixes, including several for issues with userspace communication that have gone unnoticed up until now. These are intended for net/3.9. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-15 09:00:39 -04:00
Florian Westphal	a82783c91d	netfilter: ip6t_NPT: restrict to mangle table As the translation is stateless, using it in nat table doesn't work (only initial packet is translated). filter table OUTPUT works but won't re-route the packet after translation. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-03-15 12:58:21 +01:00
Pablo Neira Ayuso	bae99f7a1d	netfilter: nfnetlink_queue: fix incorrect initialization of copy range field 2^16 = 0xffff, not 0xfffff (note the extra 'f'). Not dangerous since you adjust it to min_t(data_len, skb->len) just after on. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-03-15 12:35:49 +01:00
Gao feng	0d98da5d84	netfilter: nf_conntrack: register pernet subsystem before register L4 proto In (`c296bb4` netfilter: nf_conntrack: refactor l4proto support for netns) the l4proto gre/dccp/udplite/sctp registration happened before the pernet subsystem, which is wrong. Register pernet subsystem before register L4proto since after register L4proto, init_conntrack may try to access the resources which allocated in register_pernet_subsys. Reported-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2013-03-15 12:29:25 +01:00
Vinicius Costa Gomes	eb20ff9c91	Bluetooth: Fix not closing SCO sockets in the BT_CONNECT2 state With deferred setup for SCO, it is possible that userspace closes the socket when it is in the BT_CONNECT2 state, after the Connect Request is received but before the Accept Synchonous Connection is sent. If this happens the following crash was observed, when the connection is terminated: [ +0.000003] hci_sync_conn_complete_evt: hci0 status 0x10 [ +0.000005] sco_connect_cfm: hcon ffff88003d1bd800 bdaddr 40:98:4e:32:d7:39 status 16 [ +0.000003] sco_conn_del: hcon ffff88003d1bd800 conn ffff88003cc8e300, err 110 [ +0.000015] BUG: unable to handle kernel NULL pointer dereference at 0000000000000199 [ +0.000906] IP: [<ffffffff810620dd>] __lock_acquire+0xed/0xe82 [ +0.000000] PGD 3d21f067 PUD 3d291067 PMD 0 [ +0.000000] Oops: 0002 [#1] SMP [ +0.000000] Modules linked in: rfcomm bnep btusb bluetooth [ +0.000000] CPU 0 [ +0.000000] Pid: 1481, comm: kworker/u:2H Not tainted 3.9.0-rc1-25019-gad82cdd #1 Bochs Bochs [ +0.000000] RIP: 0010:[<ffffffff810620dd>] [<ffffffff810620dd>] __lock_acquire+0xed/0xe82 [ +0.000000] RSP: 0018:ffff88003c3c19d8 EFLAGS: 00010002 [ +0.000000] RAX: 0000000000000001 RBX: 0000000000000246 RCX: 0000000000000000 [ +0.000000] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003d1be868 [ +0.000000] RBP: ffff88003c3c1a98 R08: 0000000000000002 R09: 0000000000000000 [ +0.000000] R10: ffff88003d1be868 R11: ffff88003e20b000 R12: 0000000000000002 [ +0.000000] R13: ffff88003aaa8000 R14: 000000000000006e R15: ffff88003d1be850 [ +0.000000] FS: 0000000000000000(0000) GS:ffff88003e200000(0000) knlGS:0000000000000000 [ +0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ +0.000000] CR2: 0000000000000199 CR3: 000000003c1cb000 CR4: 00000000000006b0 [ +0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ +0.000000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ +0.000000] Process kworker/u:2H (pid: 1481, threadinfo ffff88003c3c0000, task ffff88003aaa8000) [ +0.000000] Stack: [ +0.000000] ffffffff81b16342 0000000000000000 0000000000000000 ffff88003d1be868 [ +0.000000] ffffffff00000000 00018c0c7863e367 000000003c3c1a28 ffffffff8101efbd [ +0.000000] 0000000000000000 ffff88003e3d2400 ffff88003c3c1a38 ffffffff81007c7a [ +0.000000] Call Trace: [ +0.000000] [<ffffffff8101efbd>] ? kvm_clock_read+0x34/0x3b [ +0.000000] [<ffffffff81007c7a>] ? paravirt_sched_clock+0x9/0xd [ +0.000000] [<ffffffff81007fd4>] ? sched_clock+0x9/0xb [ +0.000000] [<ffffffff8104fd7a>] ? sched_clock_local+0x12/0x75 [ +0.000000] [<ffffffff810632d1>] lock_acquire+0x93/0xb1 [ +0.000000] [<ffffffffa0022339>] ? spin_lock+0x9/0xb [bluetooth] [ +0.000000] [<ffffffff8105f3d8>] ? lock_release_holdtime.part.22+0x4e/0x55 [ +0.000000] [<ffffffff814f6038>] _raw_spin_lock+0x40/0x74 [ +0.000000] [<ffffffffa0022339>] ? spin_lock+0x9/0xb [bluetooth] [ +0.000000] [<ffffffff814f6936>] ? _raw_spin_unlock+0x23/0x36 [ +0.000000] [<ffffffffa0022339>] spin_lock+0x9/0xb [bluetooth] [ +0.000000] [<ffffffffa00230cc>] sco_conn_del+0x76/0xbb [bluetooth] [ +0.000000] [<ffffffffa002391d>] sco_connect_cfm+0x2da/0x2e9 [bluetooth] [ +0.000000] [<ffffffffa000862a>] hci_proto_connect_cfm+0x38/0x65 [bluetooth] [ +0.000000] [<ffffffffa0008d30>] hci_sync_conn_complete_evt.isra.79+0x11a/0x13e [bluetooth] [ +0.000000] [<ffffffffa000cd96>] hci_event_packet+0x153b/0x239d [bluetooth] [ +0.000000] [<ffffffff814f68ff>] ? _raw_spin_unlock_irqrestore+0x48/0x5c [ +0.000000] [<ffffffffa00025f6>] hci_rx_work+0xf3/0x2e3 [bluetooth] [ +0.000000] [<ffffffff8103efed>] process_one_work+0x1dc/0x30b [ +0.000000] [<ffffffff8103ef83>] ? process_one_work+0x172/0x30b [ +0.000000] [<ffffffff8103e07f>] ? spin_lock_irq+0x9/0xb [ +0.000000] [<ffffffff8103fc8d>] worker_thread+0x123/0x1d2 [ +0.000000] [<ffffffff8103fb6a>] ? manage_workers+0x240/0x240 [ +0.000000] [<ffffffff81044211>] kthread+0x9d/0xa5 [ +0.000000] [<ffffffff81044174>] ? __kthread_parkme+0x60/0x60 [ +0.000000] [<ffffffff814f75bc>] ret_from_fork+0x7c/0xb0 [ +0.000000] [<ffffffff81044174>] ? __kthread_parkme+0x60/0x60 [ +0.000000] Code: d7 44 89 8d 50 ff ff ff 4c 89 95 58 ff ff ff e8 44 fc ff ff 44 8b 8d 50 ff ff ff 48 85 c0 4c 8b 95 58 ff ff ff 0f 84 7a 04 00 00 <f0> ff 80 98 01 00 00 83 3d 25 41 a7 00 00 45 8b b5 e8 05 00 00 [ +0.000000] RIP [<ffffffff810620dd>] __lock_acquire+0xed/0xe82 [ +0.000000] RSP <ffff88003c3c19d8> [ +0.000000] CR2: 0000000000000199 [ +0.000000] ---[ end trace e73cd3b52352dd34 ]--- Cc: stable@vger.kernel.org [3.8] Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org> Tested-by: Frederic Dalleau <frederic.dalleau@intel.com> Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>	2013-03-14 13:14:21 -03:00
Eric Dumazet	16fad69cfe	tcp: fix skb_availroom() Chrome OS team reported a crash on a Pixel ChromeBook in TCP stack : https://code.google.com/p/chromium/issues/detail?id=182056 commit `a21d45726a` (tcp: avoid order-1 allocations on wifi and tx path) did a poor choice adding an 'avail_size' field to skb, while what we really needed was a 'reserved_tailroom' one. It would have avoided commit `22b4a4f22d` (tcp: fix retransmit of partially acked frames) and this commit. Crash occurs because skb_split() is not aware of the 'avail_size' management (and should not be aware) Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Mukesh Agrawal <quiche@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-14 11:49:45 -04:00
Linus Torvalds	aea8b5d1e5	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace bugfixes from Eric Biederman: "This tree includes a partial revert for "fs: Limit sys_mount to only request filesystem modules." When I added the new style module aliases to the filesystems I deleted the old ones. A bad move. It turns out that distributions like Arch linux use module aliases when constructing ramdisks. Which meant ultimately that an ext3 filesystem mounted with ext4 would not result in the ext4 module being put into the ramdisk. The other change in this tree adds a handful of filesystem module alias I simply failed to add the first time. Which inconvinienced a few folks using cifs. I don't want to inconvinience folks any longer than I have to so here are these trivial fixes." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: fs: Readd the fs module aliases. fs: Limit sys_mount to only request filesystem modules. (Part 3)	2013-03-13 15:47:50 -07:00
Xufeng Zhang	2317f449af	sctp: don't break the loop while meeting the active_path so as to find the matched transport sctp_assoc_lookup_tsn() function searchs which transport a certain TSN was sent on, if not found in the active_path transport, then go search all the other transports in the peer's transport_addr_list, however, we should continue to the next entry rather than break the loop when meet the active_path transport. Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-13 10:09:55 -04:00
Vlad Yasevich	f281563350	sctp: Use correct sideffect command in duplicate cookie handling When SCTP is done processing a duplicate cookie chunk, it tries to delete a newly created association. For that, it has to set the right association for the side-effect processing to work. However, when it uses the SCTP_CMD_NEW_ASOC command, that performs more work then really needed (like hashing the associationa and assigning it an id) and there is no point to do that only to delete the association as a next step. In fact, it also creates an impossible condition where an association may be found by the getsockopt() call, and that association is empty. This causes a crash in some sctp getsockopts. The solution is rather simple. We simply use SCTP_CMD_SET_ASOC command that doesn't have all the overhead and does exactly what we need. Reported-by: Karl Heiss <kheiss@gmail.com> Tested-by: Karl Heiss <kheiss@gmail.com> CC: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-13 09:59:21 -04:00
Eric W. Biederman	fa7614ddd6	fs: Readd the fs module aliases. I had assumed that the only use of module aliases for filesystems prior to "fs: Limit sys_mount to only request filesystem modules." was in request_module. It turns out I was wrong. At least mkinitcpio in Arch linux uses these aliases. So readd the preexising aliases, to keep from breaking userspace. Userspace eventually will have to follow and use the same aliases the kernel does. So at some point we may be delete these aliases without problems. However that day is not today. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-03-12 18:55:21 -07:00
Linus Torvalds	368edaadc0	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fix from Sage Weil: "This fixes a bug in the new message decoding that just went in during the last window." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: fix decoding of pgids	2013-03-12 09:22:42 -07:00
Linus Torvalds	5b22b1848b	Merge branch 'for-3.9' of git://linux-nfs.org/~bfields/linux Pull nfsd bugfixes from Bruce Fields: "Some minor fallout from the user-namespace work broke most krb5 mounts to nfsd, and I screwed up a change to the AF_LOCAL rpc code." * 'for-3.9' of git://linux-nfs.org/~bfields/linux: sunrpc: don't attempt to cancel unitialized work nfsd: fix krb5 handling of anonymous principals	2013-03-12 09:20:58 -07:00
Li RongQing	c80a8512ee	net/core: move vlan_depth out of while loop in skb_network_protocol() [ Bug added added in commit `05e8ef4ab2` (net: factor out skb_mac_gso_segment() from skb_gso_segment() ) ] move vlan_depth out of while loop, or else vlan_depth always is ETH_HLEN, can not be increased, and lead to infinite loop when frame has two vlan headers. Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-12 11:47:40 -04:00
stephen hemminger	3da889b616	bridge: reserve space for IFLA_BRPORT_FAST_LEAVE The bridge multicast fast leave feature was added sufficient space was not reserved in the netlink message. This means the flag may be lost in netlink events and results of queries. Found by observation while looking up some netlink stuff for discussion with Vlad. Problem introduced by commit `c2d3babfaf` Author: David S. Miller <davem@davemloft.net> Date: Wed Dec 5 16:24:45 2012 -0500 bridge: implement multicast fast leave Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-12 05:38:29 -04:00
David S. Miller	2230e0c193	Included changes ares: - fix packet parsing routine to avoid to read beyond the packet boundary -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABCAAGBQJRPlQRAAoJEADl0hg6qKeOP44QAI20iYAaMmA40lrsELAAJMVJ G/hHyKYcup0nxrXnlj9yOft6bYx0232/TjhGKGQ7eYcl+ri+Wyu96kC1hJG9rr/Z +WCU4CimTY5MRVzFKwNriaiqyAsW2cw2T1k1KfZD9Wb9t6hEdvd8f+4DbXYrYxHG nSZQKKDD0cxs1ARScOEGbf7KF8sw6RcGWj0m4xM00Wo/fai+CZZX/HLcUnHQrQxx 4w9safvaIVuQV3mANTpSoerfkraNzaX14i2ZU5SGi2/mhR9PC4JyGz5FIge+fuvp rP/E40GdCYpcuDL7UAyd+IBaOoiP6llDUJA/LqbZLyEZgkMtt8rgQwBsmcYDtiTt zmqCgwjp2/mTs44LfuxtxvLcIDRsQh52I0ceZaAzflG3m9t5eYs6L7oyEEUtOSCm wwY+RmBdMrArr8dohkxopjxAJtCLuHxC8e9AfXwzqt8FYZIQG/oayBrgtEoxCgzf PnJWX0uw4m6WisvMN5Ko8bNeacVRyceTqTOpIWxbdF0wku2evCbkxkK6PfvRDAca UKyrLfbDH59OObhq3fEov7wiNjLJo92bV6dLqTGgQp/GQBTswttb+9WwjT6+PE8b dKRM1eKlCBDMT5q4tGSzKvoH6cfC+h7GIPmpNDdcG+ByJI4bLyzDxLs6ZzYoyYP9 plB7HO/1r1pnOE/vtqXE =X3VY -----END PGP SIGNATURE----- Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge Included changes ares: - fix packet parsing routine to avoid to read beyond the packet boundary Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-12 05:36:52 -04:00
David Ward	4660c7f498	net/ipv4: Ensure that location of timestamp option is stored This is needed in order to detect if the timestamp option appears more than once in a packet, to remove the option if the packet is fragmented, etc. My previous change neglected to store the option location when the router addresses were prespecified and Pointer > Length. But now the option location is also stored when Flag is an unrecognized value, to ensure these option handling behaviors are still performed. Signed-off-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-12 05:35:39 -04:00
Marek Lindner	b47506d912	batman-adv: verify tt len does not exceed packet len batadv_iv_ogm_process() accesses the packet using the tt_num_changes attribute regardless of the real packet len (assuming the length check was done before). Therefore a length check is needed to avoid reading random memory. Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>	2013-03-11 22:59:47 +01:00
Sage Weil	d6c0dd6b0c	libceph: fix decoding of pgids In `4f6a7e5ee1` we effectively dropped support for the legacy encoding for the OSDMap and incremental. However, we didn't fix the decoding for the pgid. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>	2013-03-11 14:31:00 -07:00
Linus Torvalds	0cb7750825	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Missing cancel of work items in mac80211 MLME, from Ben Greear. 2) Fix DMA mapping handling in iwlwifi by using coherent DMA for command headers, from Johannes Berg. 3) Decrease the amount of pressure on the page allocator by using order 1 pages less in iwlwifi, from Emmanuel Grumbach. 4) Fix mesh PS broadcast OOPS in mac80211, from Marco Porsch. 5) Don't forget to recalculate idle state in mac80211 monitor interface, from Felix Fietkau. 6) Fix varargs in netfilter conntrack handler, from Joe Perches. 7) Need to reset entire chip when command queue fills up in iwlwifi, from Emmanuel Grumbach. 8) The TX antenna value must be valid when calibrations are performed in iwlwifi, fix from Dor Shaish. 9) Don't generate netfilter audit log entries when audit is disabled, from Gao Feng. 10) Deal with DMA unit hang on e1000e during power state transitions, from Bruce Allan. 11) Remove BUILD_BUG_ON check from igb driver, from Alexander Duyck. 12) Fix lockdep warning on i2c handling of igb driver, from Carolyn Wyborny. 13) Fix several TTY handling issues in IRDA ircomm tty driver, from Peter Hurley. 14) Several QFQ packet scheduler fixes from Paolo Valente. 15) When VXLAN encapsulates on transmit, we have to reset the netfilter state. From Zang MingJie. 16) Fix jiffie check in net_rx_action() so that we really cap the processing at 2HZ. From Eric Dumazet. 17) Fix erroneous trigger of IP option space exhaustion, when routers are pre-specified and we are looking to see if we can insert a timestamp, we will have the space. From David Ward. 18) Fix various issues in benet driver wrt waiting for firmware to finish POST after resets or errors. From Gavin Shan and Sathya Perla. 19) Fix TX locking in SFC driver, from Ben Hutchings. 20) Like the VXLAN fix above, when we encap in a TUN device we have to reset the netfilter state. This should fix several strange crashes reported by Dave Jones and others. From Eric Dumazet. 21) Don't forget to clean up MAC address resources when shutting down a port in mlx4 driver, from Yan Burman. 22) Fix divide by zero in vmxnet3 driver, from Bhavesh Davda. 23) Fix device statistic regression in tg3 when the driver is using phylib, from Nithin Sujir. 24) Fix info leak in several netlink handlers, from Mathias Krause. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (79 commits) 6lowpan: Fix endianness issue in is_addr_link_local(). rrunner.c: fix possible memory leak in rr_init_one() dcbnl: fix various netlink info leaks rtnl: fix info leak on RTM_GETLINK request for VF devices bridge: fix mdb info leaks tg3: Update link_up flag for phylib devices ipv6: stop multicast forwarding to process interface scoped addresses bridging: fix rx_handlers return code netlabel: fix build problems when CONFIG_IPV6=n drivers/isdn: checkng length to be sure not memory overflow net/rds: zero last byte for strncpy bnx2x: Fix SFP+ misconfiguration in iSCSI boot scenario bnx2x: Fix intermittent long KR2 link up time macvlan: Set IFF_UNICAST_FLT flag to prevent unnecessary promisc mode. team: unsyc the devices addresses when port is removed bridge: add missing vid to br_mdb_get() Fix: sparse warning in inet_csk_prepare_forced_close afkey: fix a typo MAINTAINERS: Update qlcnic maintainers list netlabel: correctly list all the static label mappings ...	2013-03-11 07:51:59 -07:00
Johannes Berg	07e5a5f5ab	mac80211: fix crash with P2P Device returning action frames If a P2P Device interface receives an unhandled action frame, we attempt to return it. This crashes because it doesn't have a channel context. Fix the crash by using status->band and properly mark the return frame as an off-channel frame. Reported-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-03-11 09:37:50 +02:00
YOSHIFUJI Hideaki / 吉藤英明	9026c49272	6lowpan: Fix endianness issue in is_addr_link_local(). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-10 16:49:35 -04:00
Mathias Krause	29cd8ae0e1	dcbnl: fix various netlink info leaks The dcb netlink interface leaks stack memory in various places: * perm_addr[] buffer is only filled at max with 12 of the 32 bytes but copied completely, * no in-kernel driver fills all fields of an IEEE 802.1Qaz subcommand, so we're leaking up to 58 bytes for ieee_ets structs, up to 136 bytes for ieee_pfc structs, etc., * the same is true for CEE -- no in-kernel driver fills the whole struct, Prevent all of the above stack info leaks by properly initializing the buffers/structures involved. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-10 05:19:26 -04:00
Mathias Krause	84d73cd3fb	rtnl: fix info leak on RTM_GETLINK request for VF devices Initialize the mac address buffer with 0 as the driver specific function will probably not fill the whole buffer. In fact, all in-kernel drivers fill only ETH_ALEN of the MAX_ADDR_LEN bytes, i.e. 6 of the 32 possible bytes. Therefore we currently leak 26 bytes of stack memory to userland via the netlink interface. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-10 05:19:26 -04:00
Mathias Krause	c085c49920	bridge: fix mdb info leaks The bridging code discloses heap and stack bytes via the RTM_GETMDB netlink interface and via the notify messages send to group RTNLGRP_MDB afer a successful add/del. Fix both cases by initializing all unset members/padding bytes with memset(0). Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-10 05:19:25 -04:00
Linus Torvalds	72932611b4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace bugfixes from Eric Biederman: "This is three simple fixes against 3.9-rc1. I have tested each of these fixes and verified they work correctly. The userns oops in key_change_session_keyring and the BUG_ON triggered by proc_ns_follow_link were found by Dave Jones. I am including the enhancement for mount to only trigger requests of filesystem modules here instead of delaying this for the 3.10 merge window because it is both trivial and the kind of change that tends to bit-rot if left untouched for two months." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: proc: Use nd_jump_link in proc_ns_follow_link fs: Limit sys_mount to only request filesystem modules (Part 2). fs: Limit sys_mount to only request filesystem modules. userns: Stop oopsing in key_change_session_keyring	2013-03-09 16:51:13 -08:00
J. Bruce Fields	190b1ecf25	sunrpc: don't attempt to cancel unitialized work As of `dc107402ae` "SUNRPC: make AF_LOCAL connect synchronous", we no longer initialize connect_worker in the AF_LOCAL case, resulting in warnings like: WARNING: at lib/debugobjects.c:261 debug_print_object+0x8c/0xb0() Hardware name: Bochs ODEBUG: assert_init not available (active state 0) object type: timer_list hint: stub_timer+0x0/0x20 Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nfsd auth_rpcgss nfs_acl lockd sunrpc Pid: 4816, comm: nfsd Tainted: G W 3.8.0-rc2-00049-gdc10740 #801 Call Trace: [<ffffffff8156ec00>] ? free_obj_work+0x60/0xa0 [<ffffffff81046aaf>] warn_slowpath_common+0x7f/0xc0 [<ffffffff81046ba6>] warn_slowpath_fmt+0x46/0x50 [<ffffffff8156eccc>] debug_print_object+0x8c/0xb0 [<ffffffff81055030>] ? timer_debug_hint+0x10/0x10 [<ffffffff8156f7e3>] debug_object_assert_init+0xe3/0x120 [<ffffffff81057ebb>] del_timer+0x2b/0x80 [<ffffffff8109c4e6>] ? mark_held_locks+0x86/0x110 [<ffffffff81065a29>] try_to_grab_pending+0xd9/0x150 [<ffffffff81065b57>] __cancel_work_timer+0x27/0xc0 [<ffffffff81065c03>] cancel_delayed_work_sync+0x13/0x20 [<ffffffffa0007067>] xs_destroy+0x27/0x80 [sunrpc] [<ffffffffa00040d8>] xprt_destroy+0x78/0xa0 [sunrpc] [<ffffffffa0006241>] xprt_put+0x21/0x30 [sunrpc] [<ffffffffa00030cf>] rpc_free_client+0x10f/0x1a0 [sunrpc] [<ffffffffa0002ff3>] ? rpc_free_client+0x33/0x1a0 [sunrpc] [<ffffffffa0002f7e>] rpc_release_client+0x6e/0xb0 [sunrpc] [<ffffffffa000325d>] rpc_shutdown_client+0xfd/0x1b0 [sunrpc] [<ffffffffa0017196>] rpcb_put_local+0x106/0x130 [sunrpc] ... Acked-by: "Myklebust, Trond" <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-03-09 12:43:42 -05:00
Arnd Bergmann	dc893e19b5	Revert parts of "hlist: drop the node parameter from iterators" Commit `b67bfe0d42` ("hlist: drop the node parameter from iterators") did a lot of nice changes but also contains two small hunks that seem to have slipped in accidentally and have no apparent connection to the intent of the patch. This reverts the two extraneous changes. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Peter Senna Tschudin <peter.senna@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-03-08 15:05:34 -08:00
Hannes Frederic Sowa	ddf64354af	ipv6: stop multicast forwarding to process interface scoped addresses v2: a) used struct ipv6_addr_props v3: a) reverted changes for ipv6_addr_props v4: a) do not use __ipv6_addr_needs_scope_id Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-08 12:28:20 -05:00
Cristian Bercaru	3bc1b1add7	bridging: fix rx_handlers return code The frames for which rx_handlers return RX_HANDLER_CONSUMED are no longer counted as dropped. They are counted as successfully received by 'netif_receive_skb'. This allows network interface drivers to correctly update their RX-OK and RX-DRP counters based on the result of 'netif_receive_skb'. Signed-off-by: Cristian Bercaru <B43982@freescale.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-08 12:19:59 -05:00
Samuel Ortiz	3bbc0ceb7a	NFC: llcp: Report error to pending sockets when a device is removed Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-03-08 17:35:22 +01:00
Samuel Ortiz	e6a3a4bb85	NFC: llcp: Clean raw sockets from nfc_llcp_socket_release Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-03-08 17:34:57 +01:00
Paul Moore	a6a8fe950e	netlabel: fix build problems when CONFIG_IPV6=n My last patch to solve a problem where the static/fallback labels were not fully displayed resulted in build problems when IPv6 was disabled. This patch resolves the IPv6 build problems; sorry for the screw-up. Please queue for -stable or simply merge with the previous patch. Reported-by: Kbuild Test Robot <fengguang.wu@intel.com> Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-08 11:33:51 -05:00
Samuel Ortiz	3536da06db	NFC: llcp: Clean local timers and works when removing a device Whenever an adapter is removed we must clean all the local structures, especially the timers and scheduled work. Otherwise those asynchronous threads will eventually try to access the freed nfc_dev pointer if an LLCP link is up. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-03-08 14:25:04 +01:00
Samuel Ortiz	b141e811a0	NFC: llcp: Decrease socket ack log when accepting a connection This is really difficult to test with real NFC devices, but without this fix an LLCP server will eventually refuse new connections. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>	2013-03-08 14:25:04 +01:00
Chen Gang	2e85d67690	net/rds: zero last byte for strncpy for NUL terminated string, need be always sure '\0' in the end. additional info: strncpy will pads with zeroes to the end of the given buffer. should initialise every bit of memory that is going to be copied to userland Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-08 00:35:44 -05:00
Cong Wang	fbca58a224	bridge: add missing vid to br_mdb_get() Obviously, vid should be considered when searching for multicast group. Cc: Vlad Yasevich <vyasevic@redhat.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Vlad Yasevich <vyasevich@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-07 16:32:19 -05:00
Christoph Paasch	c10cb5fc0f	Fix: sparse warning in inet_csk_prepare_forced_close In `e337e24d66` (inet: Fix kmemleak in tcp_v4/6_syn_recv_sock and dccp_v4/6_request_recv_sock) I introduced the function inet_csk_prepare_forced_close, which does a call to bh_unlock_sock(). This produces a sparse-warning. This patch adds the missing __releases. Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-07 16:31:29 -05:00
Junwei Zhang	d0d79c3fd7	afkey: fix a typo Signed-off-by: Martin Zhang <martinbj2008@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-07 16:26:45 -05:00
Paul Moore	0c1233aba1	netlabel: correctly list all the static label mappings When we have a large number of static label mappings that spill across the netlink message boundary we fail to properly save our state in the netlink_callback struct which causes us to repeat the same listings. This patch fixes this problem by saving the state correctly between calls to the NetLabel static label netlink "dumpit" routines. Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-07 16:20:23 -05:00
David S. Miller	43b18db8a2	Merge branch 'master' of git://1984.lsi.us.es/nf Pablo Neira Ayuso says: ==================== The following patchset contains Netfilter fixes for your net tree, they are: * Don't generate audit log message if audit is not enabled, from Gao Feng. * Fix logging formatting for packets dropped by helpers, by Joe Perches. * Fix a compilation warning in nfnetlink if CONFIG_PROVE_RCU is not set, from Paul Bolle. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-07 15:20:02 -05:00
Johannes Berg	1345ee6a6d	cfg80211: fix potential BSS memory leak and update In the odd case that while updating information from a beacon, a BSS was found that is part of a hidden group, we drop the new information. In this case, however, we leak the IE buffer from the update, and erroneously update the entry's timestamp so it will never time out. Fix both these issues. Cc: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-03-07 12:55:32 +01:00
Vladimir Kondratiev	021fcdc13a	cfg80211: fix inconsistency in trace for rdev_set_mac_acl There is NETDEV_ENTRY that was incorrectly assigned as WIPHY_ASSIGN, fix it. Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-03-07 11:20:01 +01:00
Johannes Berg	27a737ff7c	mac80211: always synchronize_net() during station removal If there are keys left during station removal, then a synchronize_net() will be done (for each key, I have a patch to address this for 3.10), otherwise it won't be done at all which causes issues because the station could be used for TX while it's being removed from the driver -- that might confuse the driver. Fix this by always doing synchronize_net() if no key was present any more. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2013-03-06 23:17:08 +01:00
John W. Linville	32cdd592b7	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem	2013-03-06 10:21:17 -05:00
J. Bruce Fields	3c34ae11fa	nfsd: fix krb5 handling of anonymous principals krb5 mounts started failing as of `683428fae8` "sunrpc: Update svcgss xdr handle to rpsec_contect cache". The problem is that mounts are usually done with some host principal which isn't normally mapped to any user, in which case svcgssd passes down uid -1, which the kernel is then expected to map to the export-specific anonymous uid or gid. The new uid_valid/gid_valid checks were therefore causing that downcall to fail. (Note the regression may not have been seen with older userspace that tended to map unknown principals to an anonymous id on their own rather than leaving it to the kernel.) Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2013-03-06 10:11:08 -05:00
David Ward	fa2b04f450	net/ipv4: Timestamp option cannot overflow with prespecified addresses When a router forwards a packet that contains the IPv4 timestamp option, if there is no space left in the option for the router to add its own timestamp, then the router increments the Overflow value in the option. However, if the addresses of the routers are prespecified in the option, then the overflow condition cannot happen: the option is structured so that each prespecified router has a place to write its timestamp. Other routers do not add a timestamp, so there will never be a lack of space. This fix ensures that the Overflow value in the IPv4 timestamp option is not incremented when the addresses of the routers are prespecified, even if the Pointer value is greater than the Length value. Signed-off-by: David Ward <david.ward@ll.mit.edu> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:06 -05:00
Eric Dumazet	d1f41b67ff	net: reduce net_rx_action() latency to 2 HZ We should use time_after_eq() to get maximum latency of two ticks, instead of three. Bug added in commit `24f8b2385` (net: increase receive packet quantum) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:06 -05:00
Randy Dunlap	691b3b7e13	net: fix new kernel-doc warnings in net core Fix new kernel-doc warnings in net/core/dev.c: Warning(net/core/dev.c:4788): No description found for parameter 'new_carrier' Warning(net/core/dev.c:4788): Excess function parameter 'new_carries' description in 'dev_change_carrier' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:06 -05:00
Paolo Valente	76e4cb0d3a	pkt_sched: sch_qfq: remove a useless invocation of qfq_update_eligible QFQ+ can select for service only 'eligible' aggregates, i.e., aggregates that would have started to be served also in the emulated ideal system. As a consequence, for QFQ+ to be work conserving, at least one of the active aggregates must be eligible when it is time to choose the next aggregate to serve. The set of eligible aggregates is updated through the function qfq_update_eligible(), which does guarantee that, after its invocation, at least one of the active aggregates is eligible. Because of this property, this function is invoked in qfq_deactivate_agg() to guarantee that at least one of the active aggregates is still eligible after an aggregate has been deactivated. In particular, the critical case is when there are other active aggregates, but the aggregate being deactivated happens to be the only one eligible. However, this precaution is not needed for QFQ+ to be work conserving, because update_eligible() is always invoked also at the beginning of qfq_choose_next_agg(). This patch removes the additional invocation of update_eligible() in qfq_deactivate_agg(). Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Fabio Checconi <fchecconi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Paolo Valente	40dd2d5461	pkt_sched: sch_qfq: do not allow virtual time to jump if an aggregate is in service By definition of (the algorithm of) QFQ+, the system virtual time must be pushed up only if there is no 'eligible' aggregate, i.e. no aggregate that would have started to be served also in the ideal system emulated by QFQ+. QFQ+ serves only eligible aggregates, hence the aggregate currently in service is eligible. As a consequence, to decide whether there is no eligible aggregate, QFQ+ must also check whether there is no aggregate in service. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Fabio Checconi <fchecconi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Paolo Valente	a0143efa96	pkt_sched: sch_qfq: prevent budget from wrapping around after a dequeue Aggregate budgets are computed so as to guarantee that, after an aggregate has been selected for service, that aggregate has enough budget to serve at least one maximum-size packet for the classes it contains. For this reason, after a new aggregate has been selected for service, its next packet is immediately dequeued, without any further control. The maximum packet size for a class, lmax, can be changed through qfq_change_class(). In case the user sets lmax to a lower value than the the size of some of the still-to-arrive packets, QFQ+ will automatically push up lmax as it enqueues these packets. This automatic push up is likely to happen with TSO/GSO. In any case, if lmax is assigned a lower value than the size of some of the packets already enqueued for the class, then the following problem may occur: the size of the next packet to dequeue for the class may happen to be larger than lmax, after the aggregate to which the class belongs has been just selected for service. In this case, even the budget of the aggregate, which is an unsigned value, may be lower than the size of the next packet to dequeue. After dequeueing this packet and subtracting its size from the budget, the latter would wrap around. This fix prevents the budget from wrapping around after any packet dequeue. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Fabio Checconi <fchecconi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Paolo Valente	2f3b89a1fe	pkt_sched: sch_qfq: serve activated aggregates immediately if the scheduler is empty If no aggregate is in service, then the function qfq_dequeue() does not dequeue any packet. For this reason, to guarantee QFQ+ to be work conserving, a just-activated aggregate must be set as in service immediately if it happens to be the only active aggregate. This is done by the function qfq_enqueue(). Unfortunately, the function qfq_add_to_agg(), used to add a class to an aggregate, does not perform this important additional operation. In particular, if: 1) qfq_add_to_agg() is invoked to complete the move of a class from a source aggregate, becoming, for this move, inactive, to a destination aggregate, becoming instead active, and 2) the destination aggregate becomes the only active aggregate, then this aggregate is not however set as in service. QFQ+ remains then in a non-work-conserving state until a new invocation of qfq_enqueue() recovers the situation. This fix solves the problem by moving the logic for setting an aggregate as in service directly into the function qfq_activate_agg(). Hence, from whatever point qfq_activate_aggregate() is invoked, QFQ+ remains work conserving. Since the more-complex logic of this new version of activate_aggregate() is not necessary, in qfq_dequeue(), to reschedule an aggregate that finishes its budget, then the aggregate is now rescheduled by invoking directly the functions needed. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Fabio Checconi <fchecconi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Paolo Valente	624b85fb96	pkt_sched: sch_qfq: fix the update of eligible-group sets Between two invocations of make_eligible, the system virtual time may happen to grow enough that, in its binary representation, a bit with higher order than 31 flips. This happens especially with TSO/GSO. Before this fix, the mask used in make_eligible was computed as (1UL<<index_of_last_flipped_bit)-1, whose value is well defined on a 64-bit architecture, because index_of_flipped_bit <= 63, but is in general undefined on a 32-bit architecture if index_of_flipped_bit > 31. The fix just replaces 1UL with 1ULL. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Fabio Checconi <fchecconi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Paolo Valente	9b99b7e90b	pkt_sched: sch_qfq: properly cap timestamps in charge_actual_service QFQ+ schedules the active aggregates in a group using a bucket list (one list per group). The bucket in which each aggregate is inserted depends on the aggregate's timestamps, and the number of buckets in a group is enough to accomodate the possible (range of) values of the timestamps of all the aggregates in the group. For this property to hold, timestamps must however be computed correctly. One necessary condition for computing timestamps correctly is that the number of bits dequeued for each aggregate, while the aggregate is in service, does not exceed the maximum budget budgetmax assigned to the aggregate. For each aggregate, budgetmax is proportional to the number of classes in the aggregate. If the number of classes of the aggregate is decreased through qfq_change_class(), then budgetmax is decreased automatically as well. Problems may occur if the aggregate is in service when budgetmax is decreased, because the current remaining budget of the aggregate and/or the service already received by the aggregate may happen to be larger than the new value of budgetmax. In this case, when the aggregate is eventually deselected and its timestamps are updated, the aggregate may happen to have received an amount of service larger than budgetmax. This may cause the aggregate to be assigned a higher virtual finish time than the maximum acceptable value for the last bucket in the bucket list of the group. This fix introduces a cap that addresses this issue. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Reviewed-by: Fabio Checconi <fchecconi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Peter Hurley	f74861ca87	net/irda: Raise dtr in non-blocking open DTR/RTS need to be raised, regardless of the open() mode, but not if the port has already shutdown. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:05 -05:00
Peter Hurley	0b176ce3a7	net/irda: Use barrier to set task state Without a memory and compiler barrier, the task state change can migrate relative to the condition testing in a blocking loop. However, the task state change must be visible across all cpus prior to testing those conditions. Failing to do this can result in the familiar 'lost wakeup' and this task will hang until killed. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:04 -05:00
Peter Hurley	2f7c069b96	net/irda: Hold port lock while bumping blocked_open Although tty_lock() already protects concurrent update to blocked_open, that fails to meet the separation-of-concerns between tty_port and tty. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:04 -05:00
Peter Hurley	a4ed2e737c	net/irda: Fix port open counts Saving the port count bump is unsafe. If the tty is hung up while this open was blocking, the port count is zeroed. Explicitly check if the tty was hung up while blocking, and correct the port count if not. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-03-06 02:47:04 -05:00
Linus Torvalds	9da060d0ed	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "A moderately sized pile of fixes, some specifically for merge window introduced regressions although others are for longer standing items and have been queued up for -stable. I'm kind of tired of all the RDS protocol bugs over the years, to be honest, it's way out of proportion to the number of people who actually use it. 1) Fix missing range initialization in netfilter IPSET, from Jozsef Kadlecsik. 2) ieee80211_local->tim_lock needs to use BH disabling, from Johannes Berg. 3) Fix DMA syncing in SFC driver, from Ben Hutchings. 4) Fix regression in BOND device MAC address setting, from Jiri Pirko. 5) Missing usb_free_urb in ISDN Hisax driver, from Marina Makienko. 6) Fix UDP checksumming in bnx2x driver for 57710 and 57711 chips, fix from Dmitry Kravkov. 7) Missing cfgspace_lock initialization in BCMA driver. 8) Validate parameter size for SCTP assoc stats getsockopt(), from Guenter Roeck. 9) Fix SCTP association hangs, from Lee A Roberts. 10) Fix jumbo frame handling in r8169, from Francois Romieu. 11) Fix phy_device memory leak, from Petr Malat. 12) Omit trailing FCS from frames received in BGMAC driver, from Hauke Mehrtens. 13) Missing socket refcount release in L2TP, from Guillaume Nault. 14) sctp_endpoint_init should respect passed in gfp_t, rather than use GFP_KERNEL unconditionally. From Dan Carpenter. 15) Add AISX AX88179 USB driver, from Freddy Xin. 16) Remove MAINTAINERS entries for drivers deleted during the merge window, from Cesar Eduardo Barros. 17) RDS protocol can try to allocate huge amounts of memory, check that the user's request length makes sense, from Cong Wang. 18) SCTP should use the provided KMALLOC_MAX_SIZE instead of it's own, bogus, definition. From Cong Wang. 19) Fix deadlocks in FEC driver by moving TX reclaim into NAPI poll, from Frank Li. Also, fix a build error introduced in the merge window. 20) Fix bogus purging of default routes in ipv6, from Lorenzo Colitti. 21) Don't double count RTT measurements when we leave the TCP receive fast path, from Neal Cardwell." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits) tcp: fix double-counted receiver RTT when leaving receiver fast path CAIF: fix sparse warning for caif_usb rds: simplify a warning message net: fec: fix build error in no MXC platform net: ipv6: Don't purge default router if accept_ra=2 net: fec: put tx to napi poll function to fix dead lock sctp: use KMALLOC_MAX_SIZE instead of its own MAX_KMALLOC_SIZE rds: limit the size allocated by rds_message_alloc() MAINTAINERS: remove eexpress MAINTAINERS: remove drivers/net/wan/cycx* MAINTAINERS: remove 3c505 caif_dev: fix sparse warnings for caif_flow_cb ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver sctp: use the passed in gfp flags instead GFP_KERNEL ipv[4\|6]: correct dropwatch false positive in local_deliver_finish l2tp: Restore socket refcount when sendmsg succeeds net/phy: micrel: Disable asymmetric pause for KSZ9021 bgmac: omit the fcs phy: Fix phy_device_free memory leak bnx2x: Fix KR2 work-around condition ...	2013-03-05 18:42:29 -08:00

1 2 3 4 5 ...

27369 Commits