linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-08 13:11:45 +00:00

Author	SHA1	Message	Date
Li RongQing	ce9d9b8e5c	net: sysctl: fix a kmemleak warning the returned buffer of register_sysctl() is stored into net_header variable, but net_header is not used after, and compiler maybe optimise the variable out, and lead kmemleak reported the below warning comm "swapper/0", pid 1, jiffies 4294937448 (age 267.270s) hex dump (first 32 bytes): 90 38 8b 01 c0 ff ff ff 00 00 00 00 01 00 00 00 .8.............. 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffc00020f134>] create_object+0x10c/0x2a0 [<ffffffc00070ff44>] kmemleak_alloc+0x54/0xa0 [<ffffffc0001fe378>] __kmalloc+0x1f8/0x4f8 [<ffffffc00028e984>] __register_sysctl_table+0x64/0x5a0 [<ffffffc00028eef0>] register_sysctl+0x30/0x40 [<ffffffc00099c304>] net_sysctl_init+0x20/0x58 [<ffffffc000994dd8>] sock_init+0x10/0xb0 [<ffffffc0000842e0>] do_one_initcall+0x90/0x1b8 [<ffffffc000966bac>] kernel_init_freeable+0x218/0x2f0 [<ffffffc00070ed6c>] kernel_init+0x1c/0xe8 [<ffffffc000083bfc>] ret_from_fork+0xc/0x50 [<ffffffffffffffff>] 0xffffffffffffffff <<end check kmemleak>> Before fix, the objdump result on ARM64: 0000000000000000 <net_sysctl_init>: 0: a9be7bfd stp x29, x30, [sp,#-32]! 4: 90000001 adrp x1, 0 <net_sysctl_init> 8: 90000000 adrp x0, 0 <net_sysctl_init> c: 910003fd mov x29, sp 10: 91000021 add x1, x1, #0x0 14: 91000000 add x0, x0, #0x0 18: a90153f3 stp x19, x20, [sp,#16] 1c: 12800174 mov w20, #0xfffffff4 // #-12 20: 94000000 bl 0 <register_sysctl> 24: b4000120 cbz x0, 48 <net_sysctl_init+0x48> 28: 90000013 adrp x19, 0 <net_sysctl_init> 2c: 91000273 add x19, x19, #0x0 30: 9101a260 add x0, x19, #0x68 34: 94000000 bl 0 <register_pernet_subsys> 38: 2a0003f4 mov w20, w0 3c: 35000060 cbnz w0, 48 <net_sysctl_init+0x48> 40: aa1303e0 mov x0, x19 44: 94000000 bl 0 <register_sysctl_root> 48: 2a1403e0 mov w0, w20 4c: a94153f3 ldp x19, x20, [sp,#16] 50: a8c27bfd ldp x29, x30, [sp],#32 54: d65f03c0 ret After: 0000000000000000 <net_sysctl_init>: 0: a9bd7bfd stp x29, x30, [sp,#-48]! 4: 90000000 adrp x0, 0 <net_sysctl_init> 8: 910003fd mov x29, sp c: a90153f3 stp x19, x20, [sp,#16] 10: 90000013 adrp x19, 0 <net_sysctl_init> 14: 91000000 add x0, x0, #0x0 18: 91000273 add x19, x19, #0x0 1c: f90013f5 str x21, [sp,#32] 20: aa1303e1 mov x1, x19 24: 12800175 mov w21, #0xfffffff4 // #-12 28: 94000000 bl 0 <register_sysctl> 2c: f9002260 str x0, [x19,#64] 30: b40001a0 cbz x0, 64 <net_sysctl_init+0x64> 34: 90000014 adrp x20, 0 <net_sysctl_init> 38: 91000294 add x20, x20, #0x0 3c: 9101a280 add x0, x20, #0x68 40: 94000000 bl 0 <register_pernet_subsys> 44: 2a0003f5 mov w21, w0 48: 35000080 cbnz w0, 58 <net_sysctl_init+0x58> 4c: aa1403e0 mov x0, x20 50: 94000000 bl 0 <register_sysctl_root> 54: 14000004 b 64 <net_sysctl_init+0x64> 58: f9402260 ldr x0, [x19,#64] 5c: 94000000 bl 0 <unregister_sysctl_table> 60: f900227f str xzr, [x19,#64] 64: 2a1503e0 mov w0, w21 68: f94013f5 ldr x21, [sp,#32] 6c: a94153f3 ldp x19, x20, [sp,#16] 70: a8c37bfd ldp x29, x30, [sp],#48 74: d65f03c0 ret Add the possible error handle to free the net_header to remove the kmemleak warning Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-23 06:22:08 -07:00
Guillaume Nault	1acea4f6ce	ppp: fix pppoe_dev deletion condition in pppoe_release() We can't rely on PPPOX_ZOMBIE to decide whether to clear po->pppoe_dev. PPPOX_ZOMBIE can be set by pppoe_disc_rcv() even when po->pppoe_dev is NULL. So we have no guarantee that (sk->sk_state & PPPOX_ZOMBIE) implies (po->pppoe_dev != NULL). Since we're releasing a PPPoE socket, we want to release the pppoe_dev if it exists and reset sk_state to PPPOX_DEAD, no matter the previous value of sk_state. So we can just check for po->pppoe_dev and avoid any assumption on sk->sk_state. Fixes: `2b018d57ff` ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-23 03:30:01 -07:00
Li RongQing	f6b8dec998	af_key: fix two typos Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-23 03:05:19 -07:00
Lendacky, Thomas	20a41fba67	amd-xgbe: Use wmb before updating current descriptor count The code currently uses the lightweight dma_wmb barrier before updating the current descriptor count. Under heavy load, the Tx cleanup routine was seeing the updated current descriptor count before the updated descriptor information. As a result, the Tx descriptor was being cleaned up before it was used because it was not "owned" by the hardware yet, resulting in a Tx queue hang. Using the wmb barrier insures that the descriptor is updated before the descriptor counter preventing the Tx queue hang. For extra insurance, the Tx cleanup routine is changed to grab the current decriptor count on entry and uses that initial value in the processing loop rather than trying to chase the current value. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-23 02:59:04 -07:00
Nathan Sullivan	d2fd719bcb	net/phy: micrel: Add workaround for bad autoneg Very rarely, the KSZ9031 will appear to complete autonegotiation, but will drop all traffic afterwards. When this happens, the idle error count will read 0xFF after autonegotiation completes. Reset the PHY when in that state. Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-23 02:57:26 -07:00
David S. Miller	ec3661b422	Merge branch 'ipv6-overflow-arith' Hannes Frederic Sowa says: ==================== overflow-arith: begin to add support for overflow builtins functions I add a new header, linux/overflow-arith.h, as the central place to add overflow and wrap-around checking functions. The reason I am doing so is that it can make use of compiler supported builtin functions which can leverage hardware. As I need this for a fix in the ipv6 stack, which is also included in this series, I propose to add it sooner than later over Davem's net tree. This is also the reason why I start slowly with only the one function I need at this time. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-23 02:49:41 -07:00
Hannes Frederic Sowa	b72a2b01b6	ipv6: protect mtu calculation of wrap-around and infinite loop by rounding issues Raw sockets with hdrincl enabled can insert ipv6 extension headers right into the data stream. In case we need to fragment those packets, we reparse the options header to find the place where we can insert the fragment header. If the extension headers exceed the link's MTU we actually cannot make progress in such a case. Instead of ending up in broken arithmetic or rounding towards 0 and entering an endless loop in ip6_fragment, just prevent those cases by aborting early and signal -EMSGSIZE to user space. Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-23 02:49:36 -07:00
Hannes Frederic Sowa	79907146fb	overflow-arith: begin to add support for overflow builtin functions The idea of the overflow-arith.h header is to collect overflow checking functions in one central place. If gcc compiler supports the __builtin_overflow_* builtins we use them because they might give better performance, otherwise the code falls back to normal overflow checking functions. The builtin_overflow functions are supported by gcc-5 and clang. The matter of supporting clang is to just provide a corresponding CC_HAVE_BUILTIN_OVERFLOW, because the specific overflow checking builtins don't differ between gcc and clang. I just provide overflow_usub function here as I intend this to get merged into net, more functions will definitely follow as they are needed. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-23 02:49:35 -07:00
Andrew Shewmaker	c80dbe0461	tcp: allow dctcp alpha to drop to zero If alpha is strictly reduced by alpha >> dctcp_shift_g and if alpha is less than 1 << dctcp_shift_g, then alpha may never reach zero. For example, given shift_g=4 and alpha=15, alpha >> dctcp_shift_g yields 0 and alpha remains 15. The effect isn't noticeable in this case below cwnd=137, but could gradually drive uncongested flows with leftover alpha down to cwnd=137. A larger dctcp_shift_g would have a greater effect. This change causes alpha=15 to drop to 0 instead of being decrementing by 1 as it would when alpha=16. However, it requires one less conditional to implement since it doesn't have to guard against subtracting 1 from 0U. A decay of 15 is not unreasonable since an equal or greater amount occurs at alpha >= 240. Signed-off-by: Andrew G. Shewmaker <agshew@gmail.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-23 02:46:52 -07:00
lucien	ab997ad408	ipv6: fix the incorrect return value of throw route The error condition -EAGAIN, which is signaled by throw routes, tells the rules framework to walk on searching for next matches. If the walk ends and we stop walking the rules with the result of a throw route we have to translate the error conditions to -ENETUNREACH. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-23 02:38:18 -07:00
Jason Wang	f23d538bc2	macvtap: unbreak receiving of gro skb with frag list We don't have fraglist support in TAP_FEATURES. This will lead software segmentation of gro skb with frag list. Fixes by having frag list support in TAP_FEATURES. With this patch single session of netperf receiving were restored from about 5Gb/s to about 12Gb/s on mlx4. Fixes `a567dd6252` ("macvtap: simplify usage of tap_features") Cc: Vlad Yasevich <vyasevic@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-23 02:34:39 -07:00
Pravin B Shelar	fc4099f172	openvswitch: Fix egress tunnel info. While transitioning to netdev based vport we broke OVS feature which allows user to retrieve tunnel packet egress information for lwtunnel devices. Following patch fixes it by introducing ndo operation to get the tunnel egress info. Same ndo operation can be used for lwtunnel devices and compat ovs-tnl-vport devices. So after adding such device operation we can remove similar operation from ovs-vport. Fixes: `614732eaa1` ("openvswitch: Use regular VXLAN net_device device"). Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-22 19:39:25 -07:00
David S. Miller	0c472b9b39	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2015-10-22 This series contains fixes to i40e only. Jesse provides two small fixes for i40e, first fixes counters that were being displayed incorrectly due to indexing beyond the array of strings when printing stats. Then fixed the fact that the driver was printing a message about not being able to assign VMDq because a lack of MSI-X vectors, when it was not true. It was due to a line missing that initialized a variable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-22 19:02:08 -07:00
Jorgen Hansen	8566b86ab9	VSOCK: Fix lockdep issue. The recent fix for the vsock sock_put issue used the wrong initializer for the transport spin_lock causing an issue when running with lockdep checking. Testing: Verified fix on kernel with lockdep enabled. Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-22 18:26:29 -07:00
Jesse Brandeburg	e9e53662d8	i40e: fix annoying message The driver was printing a message about not being able to assign VMDq because of a lack of MSI-X vectors. This was because a line was missing that initialized a variable, simply a merge error. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-22 18:06:18 -07:00
Jesse Brandeburg	74a6c66565	i40e: fix stats offsets The code was setting up stats that were not being initialized. This caused several counters to be displayed incorrectly, due to indexing beyond the array of strings when printing stats. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2015-10-22 17:48:21 -07:00
Bjørn Mork	0db65fcfcd	qmi_wwan: add Sierra Wireless MC74xx/EM74xx New device IDs shamelessly lifted from the vendor driver. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-22 07:56:21 -07:00
David S. Miller	199c655069	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2015-10-22 1) Fix IPsec pre-encap fragmentation for GSO packets. From Herbert Xu. 2) Fix some header checks in _decode_session6. We skip the header informations if the data pointer points already behind the header in question for some protocols. This is because we call pskb_may_pull with a negative value converted to unsigened int from pskb_may_pull in this case. Skipping the header informations can lead to incorrect policy lookups. From Mathias Krause. 3) Allow to change the replay threshold and expiry timer of a state without having to set other attributes like replay counter and byte lifetime. Changing these other attributes may break the SA. From Michael Rossberg. 4) Fix pmtu discovery for local generated packets. We may fail dispatch to the inner address family. As a reault, the local error handler is not called and the mtu value is not reported back to userspace. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-22 07:46:05 -07:00
David Ahern	d46a9d678e	net: ipv6: Dont add RT6_LOOKUP_F_IFACE flag if saddr set `741a11d9e4` ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set") adds the RT6_LOOKUP_F_IFACE flag to make device index mismatch fatal if oif is given. Hajime reported that this change breaks the Mobile IPv6 use case that wants to force the message through one interface yet use the source address from another interface. Handle this case by only adding the flag if oif is set and saddr is not set. Fixes: `741a11d9e4` ("net: ipv6: Add RT6_LOOKUP_F_IFACE flag if oif is set") Cc: Hajime Tazaki <thehajime@gmail.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-22 07:36:19 -07:00
David S. Miller	92a93fd5bb	Merge branch 'isdn-null-deref' Karsten Keil says: ==================== Fix potential NULL pointer access and memory leak in ISDN layer2 functions Insu Yun did brinup the issue with not checking the skb_clone() return value in the layer2 I-frame ull functions. This series fix the issue in a way which avoid protocol violations/data loss on a temporary memory shortage. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-22 07:23:25 -07:00
Karsten Keil	c96356a9ba	mISDN: fix OOM condition for sending queued I-Frames The old code did not check the return value of skb_clone(). The extra skb_clone() is not needed at all, if using skb_realloc_headroom() instead, which gives us a private copy with enough headroom as well. We need to requeue the original skb if the call failed, because we cannot inform upper layers about the data loss. Restructure the code to minimise rollback effort if it happens. This fix kernel bug #86091 Thanks to Insu Yun <wuninsu@gmail.com> to remind me on this issue. Signed-off-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-22 07:23:19 -07:00
Karsten Keil	c7a7c95e8e	ISDN: fix OOM condition for sending queued I-Frames The skb_clone() return value was not checked and the skb_realloc_headroom() usage was wrong, the old skb was not freed. It turned out, that the skb_clone is not needed at all, the skb_realloc_headroom() will create a private copy with enough headroom and the original SKB can be used for the ACK queue. We need to requeue the original skb if the call failed, since the upper layer cannot be informed about memory shortage. Thanks to Insu Yun <wuninsu@gmail.com> to remind me on this issue. Signed-off-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-22 07:23:19 -07:00
Jorgen Hansen	4ef7ea9195	VSOCK: sock_put wasn't safe to call in interrupt context In the vsock vmci_transport driver, sock_put wasn't safe to call in interrupt context, since that may call the vsock destructor which in turn calls several functions that should only be called from process context. This change defers the callling of these functions to a worker thread. All these functions were deallocation of resources related to the transport itself. Furthermore, an unused callback was removed to simplify the cleanup. Multiple customers have been hitting this issue when using VMware tools on vSphere 2015. Also added a version to the vmci transport module (starting from 1.0.2.0-k since up until now it appears that this module was sharing version with vsock that is currently at 1.0.1.0-k). Reviewed-by: Aditya Asarwade <asarwade@vmware.com> Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-22 07:21:05 -07:00
David Herrmann	47191d65b6	netlink: fix locking around NETLINK_LIST_MEMBERSHIPS Currently, NETLINK_LIST_MEMBERSHIPS grabs the netlink table while copying the membership state to user-space. However, grabing the netlink table is effectively a write_lock_irq(), and as such we should not be triggering page-faults in the critical section. This can be easily reproduced by the following snippet: int s = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE); void p = mmap(0, 4096, PROT_READ\|PROT_WRITE, MAP_PRIVATE\|MAP_ANON, -1, 0); int r = getsockopt(s, 0x10e, 9, p, (void)((char*)p + 4092)); This should work just fine, but currently triggers EFAULT and a possible WARN_ON below handle_mm_fault(). Fix this by reducing locking of NETLINK_LIST_MEMBERSHIPS to a read-side lock. The write-lock was overkill in the first place, and the read-lock allows page-faults just fine. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-22 07:18:28 -07:00
Andrew F. Davis	34e45ad937	net: phy: dp83848: Add TI DP83848 Ethernet PHY Add support for the TI DP83848 Ethernet PHY device. The DP83848 is a highly reliable, feature rich, IEEE 802.3 compliant single port 10/100 Mb/s Ethernet Physical Layer Transceiver supporting the MII and RMII interfaces. Signed-off-by: Andrew F. Davis <afd@ti.com> Signed-off-by: Dan Murphy <dmurphy@ti.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Dan Murphy <dmurphy@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-22 06:37:19 -07:00
Hans de Goede	104eb270e6	net: sun4i-emac: Properly free resources on probe failure and remove Fix sun4i-emac not releasing the following resources: -iomapped memory not released on probe-failure nor on remove -clock not getting disabled on probe-failure nor on remove -sram not being released on remove And while at it also add error checking to the clk_prepare_enable call done on probe. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 19:47:45 -07:00
Joe Stringer	e754ec69ab	openvswitch: Serialize nested ct actions if provided If userspace provides a ct action with no nested mark or label, then the storage for these fields is zeroed. Later when actions are requested, such zeroed fields are serialized even though userspace didn't originally specify them. Fix the behaviour by ensuring that no action is serialized in this case, and reject actions where userspace attempts to set these fields with mask=0. This should make netlink marshalling consistent across deserialization/reserialization. Reported-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 19:33:43 -07:00
Joe Stringer	4f0909ee3d	openvswitch: Mark connections new when not confirmed. New, related connections are marked as such as part of ovs_ct_lookup(), but they are not marked as "new" if the commit flag is used. Make this consistent by setting the "new" flag whenever !nf_ct_is_confirmed(ct). Reported-by: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 19:33:40 -07:00
Joe Stringer	1d008a1df9	openvswitch: Clarify conntrack COMMIT behaviour The presence of this attribute does not modify the ct_state for the current packet, only future packets. Make this more clear in the header definition. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 19:33:38 -07:00
Joe Stringer	9e384715e9	openvswitch: Reject ct_state masks for unknown bits Currently, 0-bits are generated in ct_state where the bit position is undefined, and matches are accepted on these bit-positions. If userspace requests to match the 0-value for this bit then it may expect only a subset of traffic to match this value, whereas currently all packets will have this bit set to 0. Fix this by rejecting such masks. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 19:33:36 -07:00
Renato Westphal	e2e8009ff7	tcp: remove improper preemption check in tcp_xmit_probe_skb() Commit `e520af48c7` introduced the following bug when setting the TCP_REPAIR sockoption: [ 2860.657036] BUG: using __this_cpu_add() in preemptible [00000000] code: daemon/12164 [ 2860.657045] caller is __this_cpu_preempt_check+0x13/0x20 [ 2860.657049] CPU: 1 PID: 12164 Comm: daemon Not tainted 4.2.3 #1 [ 2860.657051] Hardware name: Dell Inc. PowerEdge R210 II/0JP7TR, BIOS 2.0.5 03/13/2012 [ 2860.657054] ffffffff81c7f071 ffff880231e9fdf8 ffffffff8185d765 0000000000000002 [ 2860.657058] 0000000000000001 ffff880231e9fe28 ffffffff8146ed91 ffff880231e9fe18 [ 2860.657062] ffffffff81cd1a5d ffff88023534f200 ffff8800b9811000 ffff880231e9fe38 [ 2860.657065] Call Trace: [ 2860.657072] [<ffffffff8185d765>] dump_stack+0x4f/0x7b [ 2860.657075] [<ffffffff8146ed91>] check_preemption_disabled+0xe1/0xf0 [ 2860.657078] [<ffffffff8146edd3>] __this_cpu_preempt_check+0x13/0x20 [ 2860.657082] [<ffffffff817e0bc7>] tcp_xmit_probe_skb+0xc7/0x100 [ 2860.657085] [<ffffffff817e1e2d>] tcp_send_window_probe+0x2d/0x30 [ 2860.657089] [<ffffffff817d1d8c>] do_tcp_setsockopt.isra.29+0x74c/0x830 [ 2860.657093] [<ffffffff817d1e9c>] tcp_setsockopt+0x2c/0x30 [ 2860.657097] [<ffffffff81767b74>] sock_common_setsockopt+0x14/0x20 [ 2860.657100] [<ffffffff817669e1>] SyS_setsockopt+0x71/0xc0 [ 2860.657104] [<ffffffff81865172>] entry_SYSCALL_64_fastpath+0x16/0x75 Since tcp_xmit_probe_skb() can be called from process context, use NET_INC_STATS() instead of NET_INC_STATS_BH(). Fixes: `e520af48c7` ("tcp: add TCPWinProbe and TCPKeepAlive SNMP counters") Signed-off-by: Renato Westphal <renatow@taghos.com.br> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 19:29:26 -07:00
David S. Miller	36a28b2116	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains four Netfilter fixes for net, they are: 1) Fix Kconfig dependencies of new nf_dup_ipv4 and nf_dup_ipv6. 2) Remove bogus test nh_scope in IPv4 rpfilter match that is breaking --accept-local, from Xin Long. 3) Wait for RCU grace period after dropping the pending packets in the nfqueue, from Florian Westphal. 4) Fix sleeping allocation while holding spin_lock_bh, from Nikolay Borisov. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 19:26:17 -07:00
Jon Paul Maloy	e53567948f	tipc: conditionally expand buffer headroom over udp tunnel In commit `d999297c3d` ("tipc: reduce locking scope during packet reception") we altered the packet retransmission function. Since then, when restransmitting packets, we create a clone of the original buffer using __pskb_copy(skb, MIN_H_SIZE), where MIN_H_SIZE is the size of the area we want to have copied, but also the smallest possible TIPC packet size. The value of MIN_H_SIZE is 24. Unfortunately, __pskb_copy() also has the effect that the headroom of the cloned buffer takes the size MIN_H_SIZE. This is too small for carrying the packet over the UDP tunnel bearer, which requires a minimum headroom of 28 bytes. A change to just use pskb_copy() lets the clone inherit the original headroom of 80 bytes, but also assumes that the copied data area is of at least that size, something that is not always the case. So that is not a viable solution. We now fix this by adding a check for sufficient headroom in the transmit function of udp_media.c, and expanding it when necessary. Fixes: commit `d999297c3d` ("tipc: reduce locking scope during packet reception") Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 19:13:48 -07:00
Andreas Schwab	7a4264a925	net: cavium: change NET_VENDOR_CAVIUM to bool CONFIG_NET_VENDOR_CAVIUM is only used to hide/show config options and to include subdirectories in the build, so it doesn't make sense to make it tristate. Signed-off-by: Andreas Schwab <schwab@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 19:12:16 -07:00
Jon Paul Maloy	45c8b7b175	tipc: allow non-linear first fragment buffer The current code for message reassembly is erroneously assuming that the the first arriving fragment buffer always is linear, and then goes ahead resetting the fragment list of that buffer in anticipation of more arriving fragments. However, if the buffer already happens to be non-linear, we will inadvertently drop the already attached fragment list, and later on trig a BUG() in __pskb_pull_tail(). We see this happen when running fragmented TIPC multicast across UDP, something made possible since commit `d0f91938be` ("tipc: add ip/udp media type") We fix this by not resetting the fragment list when the buffer is non- linear, and by initiatlizing our private fragment list tail pointer to the tail of the existing fragment list. Fixes: commit `d0f91938be` ("tipc: add ip/udp media type") Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 19:08:24 -07:00
James Morse	1241365f1a	openvswitch: Allocate memory for ovs internal device stats. "openvswitch: Remove vport stats" removed the per-vport statistics, in order to use the netdev's statistics fields. "openvswitch: Fix ovs_vport_get_stats()" fixed the export of these stats to user-space, by using the provided netdev_ops to collate them - but ovs internal devices still use an unallocated dev->tstats field to count packets, which are no longer exported by this api. Allocate the dev->tstats field for ovs internal devices, and wire up ndo_get_stats64 with the original implementation of ovs_vport_get_stats(). On its own, "openvswitch: Fix ovs_vport_get_stats()" fixes the OOPs, unmasking a full-on panic on arm64: =============%<============== [<ffffffbffc00ce4c>] internal_dev_recv+0xa8/0x170 [openvswitch] [<ffffffbffc0008b4>] do_output.isra.31+0x60/0x19c [openvswitch] [<ffffffbffc000bf8>] do_execute_actions+0x208/0x11c0 [openvswitch] [<ffffffbffc001c78>] ovs_execute_actions+0xc8/0x238 [openvswitch] [<ffffffbffc003dfc>] ovs_packet_cmd_execute+0x21c/0x288 [openvswitch] [<ffffffc0005e8c5c>] genl_family_rcv_msg+0x1b0/0x310 [<ffffffc0005e8e60>] genl_rcv_msg+0xa4/0xe4 [<ffffffc0005e7ddc>] netlink_rcv_skb+0xb0/0xdc [<ffffffc0005e8a94>] genl_rcv+0x38/0x50 [<ffffffc0005e76c0>] netlink_unicast+0x164/0x210 [<ffffffc0005e7b70>] netlink_sendmsg+0x304/0x368 [<ffffffc0005a21c0>] sock_sendmsg+0x30/0x4c [SNIP] Kernel panic - not syncing: Fatal exception in interrupt =============%<============== Fixes: `8c876639c9` ("openvswitch: Remove vport stats.") Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 19:06:36 -07:00
David Ahern	f1900fb5ec	net: Really fix vti6 with oif in dst lookups `6e28b00082` ("net: Fix vti use case with oif in dst lookups for IPv6") is missing the checks on FLOWI_FLAG_SKIP_NH_OIF. Add them. Fixes: `42a7b32b73` ("xfrm: Add oif to dst lookups") Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 19:04:54 -07:00
Jon Paul Maloy	53387c4e22	tipc: extend broadcast link window size The default fix broadcast window size is currently set to 20 packets. This is a very low value, set at a time when we were still testing on 10 Mb/s hubs, and a change to it is long overdue. Commit `7845989cb4` ("net: tipc: fix stall during bclink wakeup procedure") revealed a problem with this low value. For messages of importance LOW, the backlog queue limit will be calculated to 30 packets, while a single, maximum sized message of 66000 bytes, carried across a 1500 MTU network consists of 46 packets. This leads to the following scenario (among others leading to the same situation): 1: Msg 1 of 46 packets is sent. 20 packets go to the transmit queue, 26 packets to the backlog queue. 2: Msg 2 of 46 packets is attempted sent, but rejected because there is no more space in the backlog queue at this level. The sender is added to the wakeup queue with a "pending packets chain size" number of 46. 3: Some packets in the transmit queue are acked and released. We try to wake up the sender, but the pending size of 46 is bigger than the LOW wakeup limit of 30, so this doesn't happen. 5: Subsequent acks releases all the remaining buffers. Each time we test for the wakeup criteria and find that 46 still is larger than 30, even after both the transmit and the backlog queues are empty. 6: The sender is never woken up and given a chance to send its message. He is stuck. We could now loosen the wakeup criteria (used by link_prepare_wakeup()) to become equal to the send criteria (used by tipc_link_xmit()), i.e., by ignoring the "pending packets chain size" value altogether, or we can just increase the queue limits so that the criteria can be satisfied anyway. There are good reasons (potentially multiple waiting senders) to not opt for the former solution, so we choose the latter one. This commit fixes the problem by giving the broadcast link window a default value of 50 packets. We also introduce a new minimum link window size BCLINK_MIN_WIN of 32, which is enough to always avoid the described situation. Finally, in order to not break any existing users which may set the window explicitly, we enforce that the window is set to the new minimum value in case the user is trying to set it to anything lower. Fixes: `7845989cb4` ("net: tipc: fix stall during bclink wakeup procedure") Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 19:02:17 -07:00
Dan Carpenter	50010c2059	irda: precedence bug in irlmp_seq_hb_idx() This is decrementing the pointer, instead of the value stored in the pointer. KASan detects it as an out of bounds reference. Reported-by: "Berry Cheng 程君(成淼)" <chengmiao.cj@alibaba-inc.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 07:48:26 -07:00
Joe Jin	ca88ea1247	xen-netfront: update num_queues to real created Sometimes xennet_create_queues() may failed to created all requested queues, we need to update num_queues to real created to avoid NULL pointer dereference. Signed-off-by: Joe Jin <joe.jin@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: David S. Miller <davem@davemloft.net> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 07:45:39 -07:00
Gao feng	f6a835bb04	vsock: fix missing cleanup when misc_register failed reset transport and unlock if misc_register failed. Signed-off-by: Gao feng <omarapazanadi@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 07:41:06 -07:00
David S. Miller	10be15ff6c	Merge branch 'mv643xx-fixes' Philipp Kirchhofer says: ==================== net: mv643xx_eth: TSO TX data corruption fixes as previously discussed [1] the mv643xx_eth driver has some issues with data corruption when using TCP segmentation offload (TSO). The following patch set improves this situation by fixing two data corruption bugs in the TSO TX path. Before applying the patches repeatedly accessing large files located on a SMB share on my NSA325 NAS with TSO enabled resulted in different hash sums, which confirmed that data corruption is happening during file transfer. After applying the patches the hash sums were the same. As this is my first patch submission please feel free to point out any issues with the patch set. [1] http://thread.gmane.org/gmane.linux.network/336530 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 07:36:51 -07:00
Philipp Kirchhofer	968200f322	net: mv643xx_eth: Defer writing the first TX descriptor when using TSO To prevent a race between the TX DMA engine and the CPU the writing of the first transmit descriptor must be deferred until all following descriptors have been updated. The network card may otherwise start transmitting before all packet descriptors are set up correctly, which leads to data corruption or an aborted transmit operation. This deferral is already done in the non-TSO TX path, implement it also in the TSO TX path. Signed-off-by: Philipp Kirchhofer <philipp@familie-kirchhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 07:36:41 -07:00
Philipp Kirchhofer	91986fd3d3	net: mv643xx_eth: Ensure proper data alignment in TSO TX path The TX DMA engine requires that buffers with a size of 8 bytes or smaller must be 64 bit aligned. This requirement may be violated when doing TSO, as in this case larger skb frags can be broken up and transmitted in small parts with then inappropriate alignment. Fix this by checking for proper alignment before handing a buffer to the DMA engine. If the data is misaligned realign it by copying it into the TSO header data area. Signed-off-by: Philipp Kirchhofer <philipp@familie-kirchhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 07:36:38 -07:00
David S. Miller	f3c9f95056	Merge branch 'smsc-energy-detect' Heiko Schocher says: ==================== net, phy, smsc: add posibility to disable energy detect mode On some boards the energy enable detect mode leads in trouble with some switches, so make the enabling of this mode configurable through DT. Therefore the property "smsc,disable-energy-detect" is introduced. Patch 1 introduces phy-handle support for the ti,cpsw driver. This is needed now for the smsc phy. Patch 2 adds the disable energy mode functionality to the smsc phy Changes in v2: - add comments from Florian Fainelli - I did not change disable property name into enable because I fear to break existing behaviour - add smsc vendor prefix - remove CONFIG_OF and use __maybe_unused - introduce "phy-handle" ability into ti,cpsw driver, so I can remove bogus: if (!of_node && dev->parent->of_node) of_node = dev->parent->of_node; construct. Therefore new patch for the ti,cpsw driver is necessary. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 06:41:51 -07:00
Heiko Schocher	d88ecb373b	net: phy: smsc: disable energy detect mode On some boards the energy enable detect mode leads in trouble with some switches, so make the enabling of this mode configurable through DT. Signed-off-by: Heiko Schocher <hs@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 06:41:44 -07:00
Heiko Schocher	9e42f71526	drivers: net: cpsw: add phy-handle parsing add the ability to parse "phy-handle". This is needed for phys, which have a DT node, and need to parse DT properties. Signed-off-by: Heiko Schocher <hs@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 06:41:42 -07:00
Simon Arlott	aebd99477f	bcm63xx_enet: check 1000BASE-T advertisement configuration If a gigabit ethernet PHY is connected to a fast ethernet MAC, then it can detect 1000 support from the partner but not use it. This results in a forced speed of 1000 and RX/TX failure. Check for 1000BASE-T support and then check the advertisement configuration before setting the MAC speed to 1000mbit. Signed-off-by: Simon Arlott <simon@fire.lp0.eu> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-10-21 06:36:38 -07:00
Linus Torvalds	1099f86044	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Account for extra headroom in ath9k driver, from Felix Fietkau. 2) Fix OOPS in pppoe driver due to incorrect socket state transition, from Guillaume Nault. 3) Kill memory leak in amd-xgbe debugfx, from Geliang Tang. 4) Power management fixes for iwlwifi, from Johannes Berg. 5) Fix races in reqsk_queue_unlink(), from Eric Dumazet. 6) Fix dst_entry usage in ARP replies, from Jiri Benc. 7) Cure OOPSes with SO_GET_FILTER, from Daniel Borkmann. 8) Missing allocation failure check in amd-xgbe, from Tom Lendacky. 9) Various resource allocation/freeing cures in DSA< from Neil Armstrong. 10) A series of bug fixes in the openvswitch conntrack support, from Joe Stringer. 11) Fix two cases (BPF and act_mirred) where we have to clean the sender cpu stored in the SKB before transmitting. From WANG Cong and Alexei Starovoitov. 12) Disable VLAN filtering in promiscuous mode in mlx5 driver, from Achiad Shochat. 13) Older bnx2x chips cannot do 4-tuple UDP hashing, so prevent this configuration via ethtool. From Yuval Mintz. 14) Don't call rt6_uncached_list_flush_dev() from rt6_ifdown() when 'dev' is NULL, from Eric Biederman. 15) Prevent stalled link synchronization in tipc, from Jon Paul Maloy. 16) kcalloc() gstrings ethtool buffer before having driver fill it in, in order to prevent kernel memory leaking. From Joe Perches. 17) Fix mixxing rt6_info initialization for blackhole routes, from Martin KaFai Lau. 18) Kill VLAN regression in via-rhine, from Andrej Ota. 19) Missing pfmemalloc check in sk_add_backlog(), from Eric Dumazet. 20) Fix spurious MSG_TRUNC signalling in netlink dumps, from Ronen Arad. 21) Scrube SKBs when pushing them between namespaces in openvswitch, from Joe Stringer. 22) bcmgenet enables link interrupts too early, fix from Florian Fainelli. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (92 commits) net: bcmgenet: Fix early link interrupt enabling tunnels: Don't require remote endpoint or ID during creation. openvswitch: Scrub skb between namespaces xen-netback: correctly check failed allocation net: asix: add support for the Billionton GUSB2AM-1G-B USB adapter netlink: Trim skb to alloc size to avoid MSG_TRUNC net: add pfmemalloc check in sk_add_backlog() via-rhine: fix VLAN receive handling regression. ipv6: Initialize rt6_info properly in ip6_blackhole_route() ipv6: Move common init code for rt6_info to a new function rt6_info_init() Bluetooth: Fix initializing conn_params in scan phase Bluetooth: Fix conn_params list update in hci_connect_le_scan_cleanup Bluetooth: Fix remove_device behavior for explicit connects Bluetooth: Fix LE reconnection logic Bluetooth: Fix reference counting for LE-scan based connections Bluetooth: Fix double scan updates mlxsw: core: Fix race condition in __mlxsw_emad_transmit tipc: move fragment importance field to new header position ethtool: Use kcalloc instead of kmalloc for ethtool_get_strings tipc: eliminate risk of stalled link synchronization ...	2015-10-19 09:55:40 -07:00
Steffen Klassert	ca064bd893	xfrm: Fix pmtu discovery for local generated packets. Commit `044a832a77` ("xfrm: Fix local error reporting crash with interfamily tunnels") moved the setting of skb->protocol behind the last access of the inner mode family to fix an interfamily crash. Unfortunately now skb->protocol might not be set at all, so we fail dispatch to the inner address family. As a reault, the local error handler is not called and the mtu value is not reported back to userspace. We fix this by setting skb->protocol on message size errors before we call xfrm_local_error. Fixes: `044a832a77` ("xfrm: Fix local error reporting crash with interfamily tunnels") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2015-10-19 10:30:05 +02:00

1 2 3 4 5 ...

547993 Commits