linux

Author	SHA1	Message	Date
wenxiong@linux.vnet.ibm.com	0c0e63410a	bnx2x: Adapter not recovery from EEH error injection When injecting EEH error to bnx2x adapter, adapter couldn't be recovery and caused recursive EEH errors. The patch fixes the issue. Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-03 18:37:38 -07:00
Balakumaran Kannan	31f6f291b6	net: driver: smsc: set NOCARRIER flag in dev at driver initialization As smsc driver supports carrier detection, it should unset NOCARRIER flag only after carrier state determination. By default that flag is off so driver should set it before starting auto-negotiation Signed-off-by: Balakumaran <Balakumaran.Kannan@ap.sony.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-03 18:20:38 -07:00
Michal Kubecek	21ee543edc	xfrm: fix race between netns cleanup and state expire notification The xfrm_user module registers its pernet init/exit after xfrm itself so that its net exit function xfrm_user_net_exit() is executed before xfrm_net_exit() which calls xfrm_state_fini() to cleanup the SA's (xfrm states). This opens a window between zeroing net->xfrm.nlsk pointer and deleting all xfrm_state instances which may access it (via the timer). If an xfrm state expires in this window, xfrm_exp_state_notify() will pass null pointer as socket to nlmsg_multicast(). As the notifications are called inside rcu_read_lock() block, it is sufficient to retrieve the nlsk socket with rcu_dereference() and check the it for null. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-03 16:07:44 -07:00
David S. Miller	1299b3c49b	Merge branch 'cnic' Michael Chan says: ==================== cnic fixes Fix 2 sleeping function from invalid context bugs and 1 missing iscsi netlink message bug. v2: Fixed typo in rcu_dereference_protected() and tested with CONFIG_PROVE_RCU ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 23:17:03 -07:00
Michael Chan	5943691442	cnic: Fix missing ISCSI_KEVENT_IF_DOWN message The iSCSI netlink message needs to be sent before the ulp_ops is cleared as it is sent through a function pointer in the ulp_ops. This bug causes iscsid to not get the message when the bnx2i driver is unloaded. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 23:16:41 -07:00
Michael Chan	437b8a26f9	cnic: Don't take cnic_dev_lock in cnic_alloc_uio_rings() We are allocating memory with GFP_KERNEL under spinlock. Since this is the only call manipulating the cnic_udev_list and it is always under rtnl_lock, cnic_dev_lock can be safely removed. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 23:16:41 -07:00
Michael Chan	20f30c2d5e	cnic: Don't take rcu_read_lock in cnic_rcv_netevent() Because the called function, such as bnx2fc_indicate_netevent(), can sleep, we cannot take rcu_lock(). To prevent the rcu protected ulp_ops from going away, we use the cnic_lock mutex and set the ULP_F_CALL_PENDING flag. The code already waits for ULP_F_CALL_PENDING flag to clear in cnic_unregister_device(). Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 23:16:41 -07:00
Christian Riesch	74f43922dc	net: davinci_emac: Remove unwanted debug/error message In commit `cd11cf5053` I accidentally added an error message. I used it for debugging and forgot to remove it before submitting the patch. Signed-off-by: Christian Riesch <christian.riesch@omicron.at> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 23:13:38 -07:00
Linus Torvalds	cae61ba37b	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Unbreak zebra and other netlink apps, from Eric W Biederman. 2) Some new qmi_wwan device IDs, from Aleksander Morgado. 3) Fix info leak in DCB netlink handler of qlcnic driver, from Dan Carpenter. 4) inet_getid() and ipv6_select_ident() do not generate monotonically increasing ID numbers, fix from Eric Dumazet. 5) Fix memory leak in __sk_prepare_filter(), from Leon Yu. 6) Netlink leftover bytes warning message is user triggerable, rate limit it. From Michal Schmidt. 7) Fix non-linear SKB panic in ipvs, from Peter Christensen. 8) Congestion window undo needs to be performed even if only never retransmitted data is SACK'd, fix from Yuching Cheng. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (24 commits) net: filter: fix possible memory leak in __sk_prepare_filter() net: ec_bhf: Add runtime dependencies tcp: fix cwnd undo on DSACK in F-RTO netlink: Only check file credentials for implicit destinations ipheth: Add support for iPad 2 and iPad 3 team: fix mtu setting net: fix inet_getid() and ipv6_select_ident() bugs net: qmi_wwan: interface #11 in Sierra Wireless MC73xx is not QMI net: qmi_wwan: add additional Sierra Wireless QMI devices bridge: Prevent insertion of FDB entry with disallowed vlan netlink: rate-limit leftover bytes warning and print process name bridge: notify user space after fdb update net: qmi_wwan: add Netgear AirCard 341U net: fix wrong mac_len calculation for vlans batman-adv: fix NULL pointer dereferences net/mlx4_core: Reset RoCE VF gids when guest driver goes down emac: aggregation of v1-2 PLB errors for IER register emac: add missing support of 10mbit in emac/rgmii can: only rename enabled led triggers when changing the netdev name ipvs: Fix panic due to non-linear skb ...	2014-06-02 18:16:41 -07:00
Leon Yu	418c96ac15	net: filter: fix possible memory leak in __sk_prepare_filter() __sk_prepare_filter() was reworked in commit `bd4cf0ed3` (net: filter: rework/optimize internal BPF interpreter's instruction set) so that it should have uncharged memory once things went wrong. However that work isn't complete. Error is handled only in __sk_migrate_filter() while memory can still leak in the error path right after sk_chk_filter(). Fixes: `bd4cf0ed33` ("net: filter: rework/optimize internal BPF interpreter's instruction set") Signed-off-by: Leon Yu <chianglungyu@gmail.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Tested-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 17:49:45 -07:00
Linus Torvalds	ca755175f2	Two md bugfixes for possible corruption when restarting reshape If a raid5/6 reshape is restarted (After stopping and re-assembling the array) and the array is marked read-only (or read-auto), then the reshape will appear to complete immediately, without actually moving anything around. This can result in corruption. There are two patches which do much the same thing in different places. They are separate because one is an older bug and so can be applied to more -stable kernels. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIUAwUAU40KTDnsnt1WYoG5AQJcaA/1GkoZit6LqLjiIQmsK9Ci/4TI+sNqYaSB 9SleSjWt+bcNCRY4sS3Wv0H580LmkoRR24wdei+mukoFa+bpBBs6PodPMABAVsnL VxlnUX+P4Ef77s2zJ8B5wCY3ftmecaQL3TdZf10+hIITacXSp7JmsLJXm3DW+Jvq DZsxJRBEQfsz5obZAZXnvPAcTSkqMT4QQ13nIEmaYEz+AYVn6Tcf8xwDBOcZM4u9 Gdi6BHNaY6RjSU1gsVblPYmWQyqqdgCJ6UEV/KYyY9rtFyozkvJ0SDWcu/kRA74A uydN5U6iVqJatY9l9eK2tV7GQkN+o+MWIA0JocTRZe67ihE4tWxiLRn/7fZdLVsX TV6zYar0M/ZSn3XioGi4hQ0tWDPpq/aCzCAk5JQpywgBmoaMqqh8rttwdCkWvK6P TNnaVfo3r9AMJY8MVm8in/efEhY6jUa3q2oDqCEKjuL916v9ODsxXloqTlbEy2KC NrKNLCZA2subbzPa3T8u4aKRBzl0xSBSig8ecrufSpDC1I0G+Mbuc8wrDzjAnI3N +fbQCxxRR0akcleZrFZD67avOa5/DsQqWJbcW1D5VCekJoZcgdz5CGJz/bNl+0i6 bwrvNWi6q1X2P4Nt2BBhk771xzNiUlufsI0x7SFIJxpDiGlxINkluXvnEQKFSzhr uYSrvTCQwg== =cTEe -----END PGP SIGNATURE----- Merge tag 'md/3.15-fixes' of git://neil.brown.name/md Pull two md bugfixes from Neil Brown: "Two md bugfixes for possible corruption when restarting reshape If a raid5/6 reshape is restarted (After stopping and re-assembling the array) and the array is marked read-only (or read-auto), then the reshape will appear to complete immediately, without actually moving anything around. This can result in corruption. There are two patches which do much the same thing in different places. They are separate because one is an older bug and so can be applied to more -stable kernels" * tag 'md/3.15-fixes' of git://neil.brown.name/md: md: always set MD_RECOVERY_INTR when interrupting a reshape thread. md: always set MD_RECOVERY_INTR when aborting a reshape or other "resync".	2014-06-02 17:04:37 -07:00
Jean Delvare	3aab01d800	net: ec_bhf: Add runtime dependencies The ec_bhf driver is specific to the Beckhoff CX embedded PC series. These are based on Intel x86 CPU. So we can add a dependency on X86, with COMPILE_TEST as an alternative to still allow for broader build-testing. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Darek Marcinkiewicz <reksio@newterm.pl> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 17:02:28 -07:00
Martin K. Petersen	3b8d2676d1	libata: Blacklist queued trim for Crucial M500 Queued trim only works for some users with MU05 firmware. Revert to blacklisting all firmware versions. Introduced by commit `d121f7d0cb` ("libata: Update queued trim blacklist for M5x0 drives") which this effectively reverts, while retaining the blacklisting of M550. See https://bugzilla.kernel.org/show_bug.cgi?id=71371 for reports of trouble with MU05 firmware. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-02 16:59:25 -07:00
Linus Torvalds	92b4e11315	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Peter Anvin: "A single quite small patch that managed to get overlooked earlier, to prevent a user space triggerable oops on systems without HPET" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, vdso: Fix an OOPS accessing the HPET mapping w/o an HPET	2014-06-02 16:57:23 -07:00
Linus Torvalds	8ee7a330fb	USB fixes for 3.15-rc8 Here are some fixes for 3.15-rc8 that resolve a number of tiny USB issues that have been reported, and there are some new device ids as well. All have been tested in linux-next. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iEYEABECAAYFAlOM9RkACgkQMUfUDdst+ykhbACeKn9kmESO0QMC2ST0kQkQxHJX olYAoKbYZxvlZbdSJBmtYDbm1c5wrCfO =H5ae -----END PGP SIGNATURE----- Merge tag 'usb-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some fixes for 3.15-rc8 that resolve a number of tiny USB issues that have been reported, and there are some new device ids as well. All have been tested in linux-next" * tag 'usb-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: xhci: delete endpoints from bandwidth list before freeing whole device usb: pci-quirks: Prevent Sony VAIO t-series from switching usb ports USB: cdc-wdm: properly include types.h usb: cdc-wdm: export cdc-wdm uapi header USB: serial: option: add support for Novatel E371 PCIe card USB: ftdi_sio: add NovaTech OrionLXm product ID USB: io_ti: fix firmware download on big-endian machines (part 2) USB: Avoid runtime suspend loops for HCDs that can't handle suspend/resume	2014-06-02 16:56:42 -07:00
Linus Torvalds	da579dd6a1	Staging driver fixes for 3.15-rc8 Here are some staging driver fixes for 3.15. 3 are for the speakup drivers (one fix a regression caused in 3.15-rc, and the other 2 resolve a tty issue found by Ben Hutchings) The comedi and r8192e_pci driver fixes also resolve reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iEYEABECAAYFAlOM9eUACgkQMUfUDdst+ykETACbBTSOZVE1pAIKGjGY2bcOtQET yKMAnil04hAbmz5L/5TNvdkUWu+7SfGi =K8cr -----END PGP SIGNATURE----- Merge tag 'staging-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fixes from Greg KH: "Here are some staging driver fixes for 3.15. Three are for the speakup drivers (one fixes a regression caused in 3.15-rc, and the other two resolve a tty issue found by Ben Hutchings) The comedi and r8192e_pci driver fixes also resolve reported issues" * tag 'staging-3.15-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: r8192e_pci: fix htons error Staging: speakup: Update __speakup_paste_selection() tty (ab)usage to match vt Staging: speakup: Move pasting into a work item staging: comedi: ni_daq_700: add mux settling delay speakup: fix incorrect perms on speakup_acntsa.c	2014-06-02 16:55:18 -07:00
Yuchung Cheng	0cfa5c07d6	tcp: fix cwnd undo on DSACK in F-RTO This bug is discovered by an recent F-RTO issue on tcpm list https://www.ietf.org/mail-archive/web/tcpm/current/msg08794.html The bug is that currently F-RTO does not use DSACK to undo cwnd in certain cases: upon receiving an ACK after the RTO retransmission in F-RTO, and the ACK has DSACK indicating the retransmission is spurious, the sender only calls tcp_try_undo_loss() if some never retransmisted data is sacked (FLAG_ORIG_DATA_SACKED). The correct behavior is to unconditionally call tcp_try_undo_loss so the DSACK information is used properly to undo the cwnd reduction. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 16:50:49 -07:00
Eric W. Biederman	2d7a85f4b0	netlink: Only check file credentials for implicit destinations It was possible to get a setuid root or setcap executable to write to it's stdout or stderr (which has been set made a netlink socket) and inadvertently reconfigure the networking stack. To prevent this we check that both the creator of the socket and the currentl applications has permission to reconfigure the network stack. Unfortunately this breaks Zebra which always uses sendto/sendmsg and creates it's socket without any privileges. To keep Zebra working don't bother checking if the creator of the socket has privilege when a destination address is specified. Instead rely exclusively on the privileges of the sender of the socket. Note from Andy: This is exactly Eric's code except for some comment clarifications and formatting fixes. Neither I nor, I think, anyone else is thrilled with this approach, but I'm hesitant to wait on a better fix since 3.15 is almost here. Note to stable maintainers: This is a mess. An earlier series of patches in 3.15 fix a rather serious security issue (CVE-2014-0181), but they did so in a way that breaks Zebra. The offending series includes: commit `aa4cf9452f` Author: Eric W. Biederman <ebiederm@xmission.com> Date: Wed Apr 23 14:28:03 2014 -0700 net: Add variants of capable for use on netlink messages If a given kernel version is missing that series of fixes, it's probably worth backporting it and this patch. if that series is present, then this fix is critical if you care about Zebra. Cc: stable@vger.kernel.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 16:34:09 -07:00
Kristian Evensen	22fd2a52f7	ipheth: Add support for iPad 2 and iPad 3 Each iPad model has a different product id, this patch adds support for iPad 2 (pid 0x12a2) and iPad 3 (pid 0x12a6). Note that iPad 2 must be jailbroken and a third-party app must be used for tethering to work. On iPad 3, tethering works out of the box (assuming your ISP is nice). Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 16:11:55 -07:00
Jiri Pirko	9d0d68faea	team: fix mtu setting Now it is not possible to set mtu to team device which has a port enslaved to it. The reason is that when team_change_mtu() calls dev_set_mtu() for port device, notificator for NETDEV_PRECHANGEMTU event is called and team_device_event() returns NOTIFY_BAD forbidding the change. So fix this by returning NOTIFY_DONE here in case team is changing mtu in team_change_mtu(). Introduced-by: `3d249d4c` "net: introduce ethernet teaming device" Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 14:56:01 -07:00
Eric Dumazet	39c36094d7	net: fix inet_getid() and ipv6_select_ident() bugs I noticed we were sending wrong IPv4 ID in TCP flows when MTU discovery is disabled. Note how GSO/TSO packets do not have monotonically incrementing ID. 06:37:41.575531 IP (id 14227, proto: TCP (6), length: 4396) 06:37:41.575534 IP (id 14272, proto: TCP (6), length: 65212) 06:37:41.575544 IP (id 14312, proto: TCP (6), length: 57972) 06:37:41.575678 IP (id 14317, proto: TCP (6), length: 7292) 06:37:41.575683 IP (id 14361, proto: TCP (6), length: 63764) It appears I introduced this bug in linux-3.1. inet_getid() must return the old value of peer->ip_id_count, not the new one. Lets revert this part, and remove the prevention of a null identification field in IPv6 Fragment Extension Header, which is dubious and not even done properly. Fixes: `87c48fa3b4` ("ipv6: make fragment identifications less predictable") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 14:09:28 -07:00
Aleksander Morgado	fc0d6e9cd0	net: qmi_wwan: interface #11 in Sierra Wireless MC73xx is not QMI This interface is unusable, as the cdc-wdm character device doesn't reply to any QMI command. Also, the out-of-tree Sierra Wireless GobiNet driver fully skips it. Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 14:00:28 -07:00
Aleksander Morgado	9a793e71eb	net: qmi_wwan: add additional Sierra Wireless QMI devices A set of new VID/PIDs retrieved from the out-of-tree GobiNet/GobiSerial Sierra Wireless drivers. Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 14:00:28 -07:00
Toshiaki Makita	e0d7968ab6	bridge: Prevent insertion of FDB entry with disallowed vlan br_handle_local_finish() is allowing us to insert an FDB entry with disallowed vlan. For example, when port 1 and 2 are communicating in vlan 10, and even if vlan 10 is disallowed on port 3, port 3 can interfere with their communication by spoofed src mac address with vlan id 10. Note: Even if it is judged that a frame should not be learned, it should not be dropped because it is destined for not forwarding layer but higher layer. See IEEE 802.1Q-2011 8.13.10. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 13:38:23 -07:00
Michal Schmidt	bfc5184b69	netlink: rate-limit leftover bytes warning and print process name Any process is able to send netlink messages with leftover bytes. Make the warning rate-limited to prevent too much log spam. The warning is supposed to help find userspace bugs, so print the triggering command name to implicate the buggy program. [v2: Use pr_warn_ratelimited instead of printk_ratelimited.] Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-02 11:16:11 -07:00
Jon Maxwell	c65c7a3066	bridge: notify user space after fdb update There has been a number incidents recently where customers running KVM have reported that VM hosts on different Hypervisors are unreachable. Based on pcap traces we found that the bridge was broadcasting the ARP request out onto the network. However some NICs have an inbuilt switch which on occasions were broadcasting the VMs ARP request back through the physical NIC on the Hypervisor. This resulted in the bridge changing ports and incorrectly learning that the VMs mac address was external. As a result the ARP reply was directed back onto the external network and VM never updated it's ARP cache. This patch will notify the bridge command, after a fdb has been updated to identify such port toggling. Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-01 22:14:50 -07:00
Aleksander Morgado	4324be1e0b	net: qmi_wwan: add Netgear AirCard 341U Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-01 19:43:57 -07:00
Nikolay Aleksandrov	4b9b1cdf83	net: fix wrong mac_len calculation for vlans After `1e785f48d2` ("net: Start with correct mac_len in skb_network_protocol") skb->mac_len is used as a start of the calculation in skb_network_protocol() but that is not always correct. If skb->protocol == 8021Q/AD, usually the vlan header is already inserted in the skb (i.e. vlan reorder hdr == 0). Usually when the packet enters dev_hard_xmit it has mac_len == 0 so we take 2 bytes from the destination mac address (skb->data + VLAN_HLEN) as a type in skb_network_protocol() and return vlan_depth == 4. In the case where TSO is off, then the mac_len is set but it's == 18 (ETH_HLEN + VLAN_HLEN), so skb_network_protocol() returns a type from inside the packet and offset == 22. Also make vlan_depth unsigned as suggested before. As suggested by Eric Dumazet, move the while() loop in the if() so we can avoid additional testing in fast path. Here are few netperf tests + debug printk's to illustrate: cat netperf.tso-on.reorder-on.bugged - Vlan -> device (reorder on, default, this case is okay) MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.3.1 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 7111.54 [ 81.605435] skb->len 65226 skb->gso_size 1448 skb->proto 0x800 skb->mac_len 0 vlan_depth 0 type 0x800 - Vlan -> device (reorder off, bad) cat netperf.tso-on.reorder-off.bugged MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.3.1 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.00 241.35 [ 204.578332] skb->len 1518 skb->gso_size 0 skb->proto 0x8100 skb->mac_len 0 vlan_depth 4 type 0x5301 0x5301 are the last two bytes of the destination mac. And if we stop TSO, we may get even the following: [ 83.343156] skb->len 2966 skb->gso_size 1448 skb->proto 0x8100 skb->mac_len 18 vlan_depth 22 type 0xb84 Because mac_len already accounts for VLAN_HLEN. After the fix: cat netperf.tso-on.reorder-off.fixed MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.3.1 () port 0 AF_INET Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 10.01 5001.46 [ 81.888489] skb->len 65230 skb->gso_size 1448 skb->proto 0x8100 skb->mac_len 0 vlan_depth 18 type 0x800 CC: Vlad Yasevich <vyasevic@redhat.com> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: Daniel Borkman <dborkman@redhat.com> CC: David S. Miller <davem@davemloft.net> Fixes:1e785f48d29a ("net: Start with correct mac_len in skb_network_protocol") Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-01 19:39:13 -07:00
Linus Torvalds	fad01e866a	Linux 3.15-rc8	2014-06-01 19:12:24 -07:00
Linus Torvalds	204fe0380b	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fix from Ben Herrenschmidt: "Here's just one trivial patch to wire up sys_renameat2 which I seem to have completely missed so far. (My test build scripts fwd me warnings but miss the ones generated for missing syscalls)" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Wire renameat2() syscall	2014-06-01 18:30:07 -07:00
Linus Torvalds	568180a517	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus Pull MIPS fixes from Ralf Baechle: "A fair number of fixes across the field. Nothing terribly complicated; the one liners in below changelog should be fairly descriptive. Noteworthy is the SB1 change which the result of changes to binutils resulting in one big gas warning for most files being assembled as well as the asid_cache and branch emulation fixes which fix corruption or possible uninteded behaviour of kernel or application code. The remainder of fixes are more platforms or subsystem specific" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: R46000: Fix Micro-assembler field overflow for R4600 V2 MIPS: ptrace: Avoid smp_processor_id() in preemptible code MIPS: Lemote 2F: cs5536: mfgpt: use raw locks MIPS: SB1: Fix excessive kernel warnings. MIPS: RC32434: fix broken PCI resource initialization MIPS: malta: memory.c: Initialize the 'memsize' variable MIPS: Fix typo when reporting cache and ftlb errors for ImgTec cores MIPS: Fix inconsistancy of __NR_Linux_syscalls value MIPS: Fix branch emulation of branch likely instructions. MIPS: Fix a typo error in AUDIT_ARCH definition MIPS: Change type of asid_cache to unsigned long	2014-06-01 18:28:58 -07:00
Linus Torvalds	32439700fe	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Various fixlets, mostly related to the (root-only) SCHED_DEADLINE policy, but also a hotplug bug fix and a fix for a NR_CPUS related overallocation bug causing a suspend/resume regression" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix hotplug vs. set_cpus_allowed_ptr() sched/cpupri: Replace NR_CPUS arrays sched/deadline: Replace NR_CPUS arrays sched/deadline: Restrict user params max value to 2^63 ns sched/deadline: Change sched_getparam() behaviour vs SCHED_DEADLINE sched: Disallow sched_attr::sched_policy < 0 sched: Make sched_setattr() correctly return -EFBIG	2014-06-01 18:26:59 -07:00
Benjamin Herrenschmidt	8212f58a9b	powerpc: Wire renameat2() syscall Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-02 09:24:27 +10:00
David S. Miller	6ce995c6f4	Included changes: - prevent NULL dereference in multicast code -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJTiZ58AAoJEJgn97Bh2u9e2x4QAIjWLDIo6feo4jH8l6q6R5iO cH/EXCtqk9GHNwvfZDNt+pF19ejzVk/TPnmmXTZ4QcElS9GuXe5WdWxiGcS5KEwa 0UNDRp8fgcBSV1Kqc/vbyKiQ4j69QtC1PPfLWUtxj/GYE0qHX/A1OzB9zROvoHJ7 sa3l8O5XRWiaxBYDkT0RfhHH0jeDdvm3I9yt8B+4B6c71094VIsfGXBVPp4tPrdg nkuzBdwF0HFPiFrlsfboJDLcXPLpRR93H1GsmfELYd5jQ4rtUhlcuEESq6573tvB TV93tkm/zmbwtInMoPI29qKL8t2478cJH7SvKvM4NiqMsB1zOhknhUXzElh9TPGA xyNivxJraYJzL53XguBFO8A8fP1k/E8Z6UQXJbgry4lu+6qZ60e0/J8zGxGpSamP i1JX0MAVPX6T4MAlZ70LMxfmzJ5sSNkkYyXobG+aBa/AgzRsXVvG4So1qi364COx btCxgBXK1Z20ZuNclY8/J06D8EbTXI5y5MCSDvMCOHQlb5mjBl34RtFVw+5/QXkg v2suc7T/YLOPNtZktZC2506caPHoOlwEVvkyA55p+qdkcD/Dd5Iv4Hndi+g+C5gv O2ja7gUQco1R8ElormKW9rE7OvjiUlowNJmguXWAdzc9FC0yISpP66BAGjBqwhF9 6YibEebXMQICjxAVTEAM =fvfu -----END PGP SIGNATURE----- Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge Included changes: - prevent NULL dereference in multicast code Antonion Quartulli says: ==================== pull request net: batman-adv 20140527 here you have another very small fix intended for net/linux-3.15. It prevents some multicast functions from dereferencing a NULL pointer. (Actually it was nothing more than a typo) I hope it is not too late for such a small patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-31 20:01:47 -07:00
Linus Torvalds	a4bf79eb6a	Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core futex/rtmutex fixes from Thomas Gleixner: "Three fixlets for long standing issues in the futex/rtmutex code unearthed by Dave Jones syscall fuzzer: - Add missing early deadlock detection checks in the futex code - Prevent user space from attaching a futex to kernel threads - Make the deadlock detector of rtmutex work again Looks large, but is more comments than code change" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rtmutex: Fix deadlock detector for real futex: Prevent attaching to kernel threads futex: Add another early deadlock detection check	2014-05-31 09:47:55 -07:00
Linus Torvalds	80e0679469	Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Mostly quiet now: i915: fixing userspace visiblie issues, all stable marked radeon: one more pll fix, two crashers, one suspend/resume regression" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/radeon: Resume fbcon last drm/radeon: only allocate necessary size for vm bo list drm/radeon: don't allow RADEON_GEM_DOMAIN_CPU for command submission drm/radeon: avoid crash if VM command submission isn't available drm/radeon: lower the ref * post PLL maximum once more drm/i915: Prevent negative relocation deltas from wrapping drm/i915: Only copy back the modified fields to userspace from execbuffer drm/i915: Fix dynamic allocation of physical handles	2014-05-31 09:19:02 -07:00
Linus Torvalds	9f12600fe4	dcache: add missing lockdep annotation lock_parent() very much on purpose does nested locking of dentries, and is careful to maintain the right order (lock parent first). But because it didn't annotate the nested locking order, lockdep thought it might be a deadlock on d_lock, and complained. Add the proper annotation for the inner locking of the child dentry to make lockdep happy. Introduced by commit `046b961b45` ("shrink_dentry_list(): take parent's ->d_lock earlier"). Reported-and-tested-by: Josh Boyer <jwboyer@fedoraproject.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-05-31 09:13:21 -07:00
Marek Lindner	af0a171c07	batman-adv: fix NULL pointer dereferences Was introduced with `4c8755d69c` ("batman-adv: Send multicast packets to nodes with a WANT_ALL flag") Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Acked-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>	2014-05-31 10:07:14 +02:00
David S. Miller	dbfc4b698a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== The following patchset contains a late fix for IPVS: * Fix crash when trying to remove the transport header with non-linear skbuffs, this was introduced in 3.6-rc. Patch from Peter Christensen via the IPVS folks. I'll pass this to -stable once this hits mainstream. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 17:56:09 -07:00
David S. Miller	554215c5ca	linux-can-fixes-for-3.15-20140528 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iEYEABECAAYFAlOFkZ4ACgkQjTAFq1RaXHOYCQCeJwwa8XQMw2HkGemGMBSpcUHb MLwAoISCoqgrXWq/lBBzDqlVXik8fR4t =gHVg -----END PGP SIGNATURE----- Merge tag 'linux-can-fixes-for-3.15-20140528' of git://gitorious.org/linux-can/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2014-05-28 here's a pull request for v3.15, hope it's not too late. Oliver Hartkopp fixed a bug in the CAN led trigger device renaming code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 17:32:53 -07:00
Jack Morgenstein	111c6094bd	net/mlx4_core: Reset RoCE VF gids when guest driver goes down Reset the GIDs assigned to a VF in the port RoCE GID table when that guest goes down (either crashes or goes down cleanly). As part of this fix, we refactor the RoCE gid table driver copy, moving it to the mlx4_port_info structure (together with the MAC and VLAN tables). As with the MAC and VLAN tables, we now use a mutex per port for the GID table so that modifying the driver copy and modifying the firmware copy of a port GID table becomes an atomic operation (thus avoiding driver-copy/FW-copy mismatches). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 16:57:52 -07:00
Ivan Mikhaylov	09271db6e0	emac: aggregation of v1-2 PLB errors for IER register Aggreagation of version 1-2 because of version 1 can hit PLB errors too. If it's not set so we missing events for PLB bits and driver can't process those interrupts. Signed-off-by: Ivan Mikhaylov <ivan@ru.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 16:29:57 -07:00
Ivan Mikhaylov	faacd3af0c	emac: add missing support of 10mbit in emac/rgmii In chips of emac/rgmii b'000' for 0/1 channel isn't suitable which resulted in non working network interface in this mode. Signed-off-by: Ivan Mikhaylov <ivan@ru.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-30 16:29:57 -07:00
Daniel Vetter	18ee37a485	drm/radeon: Resume fbcon last So a few people complained that commit `177cf92de4` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Apr 1 22:14:59 2014 +0200 drm/crtc-helpers: fix dpms on logic which was merged into 3.15-rc1, broke resume on radeons. Strangely git bisect lead everyone to commit `25f397a429` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Fri Jul 19 18:57:11 2013 +0200 drm/crtc-helper: explicit DPMS on after modeset which was merged long ago and actually part of 3.14. Digging deeper I've noticed (again) that the call to drm_helper_resume_force_mode in the radeon resume handlers was a no-op previously because everything gets shut down on suspend. radeon does this with explicit calls to drm_helper_connector_dpms with DPMS_OFF. But with 177c we now force the dpms state to ON, so suddenly resume_force_mode actually forced the crtcs back on. This is the intention of the change after all, the problem is that radeon resumes the fbdev console layer _before_ restoring the display, through calling fb_set_suspend. And fbcon does an immediate ->set_par, which in turn causes the same forced mode restore to happen. Two concurrent modeset operations didn't lead to happiness. Fix this by delaying the fbcon resume until the end of the readeon resum functions. v2: Fix up a bit of the spelling fail. References: https://lkml.org/lkml/2014/5/29/1043 References: https://lkml.org/lkml/2014/5/2/388 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=74751 Tested-by: Ken Moffat <zarniwhoop@ntlworld.com> Cc: Alex Deucher <alexdeucher@gmail.com> Cc: Ken Moffat <zarniwhoop@ntlworld.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@gmail.com>	2014-05-31 09:19:51 +10:00
Dave Airlie	1446e04c9b	Merge branch 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux into drm-fixes this is the next pull request for stashed up radeon fixes for 3.15. This is finally calming down with only four patches in this pull request. * 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux: drm/radeon: only allocate necessary size for vm bo list drm/radeon: don't allow RADEON_GEM_DOMAIN_CPU for command submission drm/radeon: avoid crash if VM command submission isn't available drm/radeon: lower the ref * post PLL maximum once more	2014-05-31 09:19:05 +10:00
Linus Torvalds	1487385edb	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input subsystem fixes from Dmitry Torokhov: "A couple of driver/build fixups and also redone quirk for Synaptics touchpads on Lenovo boxes (now using PNP IDs instead of DMI data to limit number of quirks)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: synaptics - change min/max quirk table to pnp-id matching Input: synaptics - add a matches_pnp_id helper function Input: synaptics - T540p - unify with other LEN0034 models Input: synaptics - add min/max quirk for the ThinkPad W540 Input: ambakmi - request a shared interrupt for AMBA KMI devices Input: pxa27x-keypad - fix generating scancode Input: atmel-wm97xx - only build for AVR32 Input: fix ps2/serio module dependency	2014-05-30 12:07:48 -07:00
Linus Torvalds	1326af2464	Regression fix for the IEEE 1394 subsystem: Re-enable IRQ-based asynchronous request reception at addresses below 128 TB. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJTiIlwAAoJEHnzb7JUXXnQ+4oQAJ3ZHtqDJNP99qokiyVorDSw hk2b/nCF2A7PYGgYdyImDx+xiOdI7FE3T6vpDBrel/Vfc/eny3g1PSmklF0pmrgR LKyEAKmqRrEFKNevJPIAOyvf8jMV4E7mkKKp2iGN5s77CdW/KeoNfeP8yzRzMW3n ChG9yZIPWMVuoiGBL74iOZujETP3Y62CVRP9Anz3uG2I7oGQN+09uK+Qc58SR917 fbvJh8f2Yrf+st7oogwpFvKRlDqyskmoonWc39OCPcGeumQ9LycTs5KTMx4htVjW jf8bJDQ/xmHCvPpZpQtZiMbj5pmMkD9cSt0KvMHv4TuV/1NBFECITjoQ03//+byb RmxTH8GRryfpafTLL1Qlf+boecjEK9yuJ/k1YbDEt9YTc6x29oTqJjeXbli5r3FX f0MYLpDEdDH4GuKMgdvDt+CFzGI7NHWupjyYbm/uT0LCEvLhmKnh3z+3HhmUO2wD 7PoDp63t779YhiScnfA+P5CXmjnMjZNtqeCyJYiCDrzcCS3VVQVocFf3iEKF8xrf gOju+BZHuNzw5S3Tn0kf3LaexUW9eMD1LAYDooF1eqpl/hrmsYpLhw3mI7601CDa UNHTY2Wz6Y/2PwkpzKgsHrrKnyaQWak/kZl0jFy7lZhvJqFyNPe8nVWYRdPdVeHk /b/TKJZDiRQXZCrcfEXc =GDAT -----END PGP SIGNATURE----- Merge tag 'firewire-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fix from Stefan Richter: "A regression fix for the IEEE 1394 subsystem: re-enable IRQ-based asynchronous request reception at addresses below 128 TB" * tag 'firewire-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: revert to 4 GB RDMA, fix protocols using Memory Space	2014-05-30 12:06:15 -07:00
Linus Torvalds	24e19d279f	A dm-cache stable fix to split discards on cache block boundaries because dm-cache cannot yet handle discards that span cache blocks. Really fix a dm-mpath LOCKDEP warning that was introduced in -rc1. Add a 'no_space_timeout' control to dm-thinp to restore the ability to queue IO indefinitely when no data space is available. This fixes a change in behavior that was introduced in -rc6 where the timeout couldn't be disabled. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJTh9QAAAoJEMUj8QotnQNaNpYH/j07FeH8YlxXRcFzDi7xRVtx luK5b9fLLlmPwW2eKSrvpI8Le4jwDvLwBmpEvN9/wyPiRDSUnYIyYdoV7RJXX2LT wqXatObb84fwQBJ6/q8o2YMzU5ODa5XT6KGEZyD4cHdAZ9FZSwfgqhslyrBJDkSN JBFfkXu066qw8cuYA6KFv4DwBf5eHAt5AjV/QPGd5zGXwETHLZ4ypgpwYHAGbdXa MgfHetwtEnJYvVQex/e+9xC5IDc4/BEAhZq4n3YmEJjNq8EbX15udHmCX7S2M5pT +9tNjUMz4j9BhoC9F8ntRz0pxWZtJK9hGojO4xoXqOCOHgp1xLQd/tHrFZS0v8E= =u5Xd -----END PGP SIGNATURE----- Merge tag 'dm-3.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device-mapper fixes from Mike Snitzer: "A dm-cache stable fix to split discards on cache block boundaries because dm-cache cannot yet handle discards that span cache blocks. Really fix a dm-mpath LOCKDEP warning that was introduced in -rc1. Add a 'no_space_timeout' control to dm-thinp to restore the ability to queue IO indefinitely when no data space is available. This fixes a change in behavior that was introduced in -rc6 where the timeout couldn't be disabled" * tag 'dm-3.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm mpath: really fix lockdep warning dm cache: always split discards on cache block boundaries dm thin: add 'no_space_timeout' dm-thin-pool module param	2014-05-30 12:04:56 -07:00
Minchan Kim	6538b8ea88	x86_64: expand kernel stack to 16K While I play inhouse patches with much memory pressure on qemu-kvm, 3.14 kernel was randomly crashed. The reason was kernel stack overflow. When I investigated the problem, the callstack was a little bit deeper by involve with reclaim functions but not direct reclaim path. I tried to diet stack size of some functions related with alloc/reclaim so did a hundred of byte but overflow was't disappeard so that I encounter overflow by another deeper callstack on reclaim/allocator path. Of course, we might sweep every sites we have found for reducing stack usage but I'm not sure how long it saves the world(surely, lots of developer start to add nice features which will use stack agains) and if we consider another more complex feature in I/O layer and/or reclaim path, it might be better to increase stack size( meanwhile, stack usage on 64bit machine was doubled compared to 32bit while it have sticked to 8K. Hmm, it's not a fair to me and arm64 already expaned to 16K. ) So, my stupid idea is just let's expand stack size and keep an eye toward stack consumption on each kernel functions via stacktrace of ftrace. For example, we can have a bar like that each funcion shouldn't exceed 200K and emit the warning when some function consumes more in runtime. Of course, it could make false positive but at least, it could make a chance to think over it. I guess this topic was discussed several time so there might be strong reason not to increase kernel stack size on x86_64, for me not knowing so Ccing x86_64 maintainers, other MM guys and virtio maintainers. Here's an example call trace using up the kernel stack: Depth Size Location (51 entries) ----- ---- -------- 0) 7696 16 lookup_address 1) 7680 16 _lookup_address_cpa.isra.3 2) 7664 24 __change_page_attr_set_clr 3) 7640 392 kernel_map_pages 4) 7248 256 get_page_from_freelist 5) 6992 352 __alloc_pages_nodemask 6) 6640 8 alloc_pages_current 7) 6632 168 new_slab 8) 6464 8 __slab_alloc 9) 6456 80 __kmalloc 10) 6376 376 vring_add_indirect 11) 6000 144 virtqueue_add_sgs 12) 5856 288 __virtblk_add_req 13) 5568 96 virtio_queue_rq 14) 5472 128 __blk_mq_run_hw_queue 15) 5344 16 blk_mq_run_hw_queue 16) 5328 96 blk_mq_insert_requests 17) 5232 112 blk_mq_flush_plug_list 18) 5120 112 blk_flush_plug_list 19) 5008 64 io_schedule_timeout 20) 4944 128 mempool_alloc 21) 4816 96 bio_alloc_bioset 22) 4720 48 get_swap_bio 23) 4672 160 __swap_writepage 24) 4512 32 swap_writepage 25) 4480 320 shrink_page_list 26) 4160 208 shrink_inactive_list 27) 3952 304 shrink_lruvec 28) 3648 80 shrink_zone 29) 3568 128 do_try_to_free_pages 30) 3440 208 try_to_free_pages 31) 3232 352 __alloc_pages_nodemask 32) 2880 8 alloc_pages_current 33) 2872 200 __page_cache_alloc 34) 2672 80 find_or_create_page 35) 2592 80 ext4_mb_load_buddy 36) 2512 176 ext4_mb_regular_allocator 37) 2336 128 ext4_mb_new_blocks 38) 2208 256 ext4_ext_map_blocks 39) 1952 160 ext4_map_blocks 40) 1792 384 ext4_writepages 41) 1408 16 do_writepages 42) 1392 96 __writeback_single_inode 43) 1296 176 writeback_sb_inodes 44) 1120 80 __writeback_inodes_wb 45) 1040 160 wb_writeback 46) 880 208 bdi_writeback_workfn 47) 672 144 process_one_work 48) 528 112 worker_thread 49) 416 240 kthread 50) 176 176 ret_from_fork [ Note: the problem is exacerbated by certain gcc versions that seem to generate much bigger stack frames due to apparently bad coalescing of temporaries and generating too many spills. Rusty saw gcc-4.6.4 using 35% more stack on the virtio path than 4.8.2 does, for example. Minchan not only uses such a bad gcc version (4.6.3 in his case), but some of the stack use is due to debugging (CONFIG_DEBUG_PAGEALLOC is what causes that kernel_map_pages() frame, for example). But we're clearly getting too close. The VM code also seems to have excessive stack frames partly for the same compiler reason, triggered by excessive inlining and lots of function arguments. We need to improve on our stack use, but in the meantime let's do this simple stack increase too. Unlike most earlier reports, there is nothing simple that stands out as being really horribly wrong here, apart from the fact that the stack frames are just bigger than they should need to be. - Linus ] Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Jones <davej@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S Tsirkin <mst@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: PJ Waskiewicz <pjwaskiewicz@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-05-30 11:52:51 -07:00
Linus Torvalds	6f6111e4a7	Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs dcache livelock fix from Al Viro: "Fixes for livelocks in shrink_dentry_list() introduced by fixes to shrink list corruption; the root cause was that trylock of parent's ->d_lock could be disrupted by d_walk() happening on other CPUs, resulting in shrink_dentry_list() making no progress and the same d_walk() being called again and again for as long as shrink_dentry_list() doesn't get past that mess. The solution is to have shrink_dentry_list() treat that trylock failure not as 'try to do the same thing again', but 'lock them in the right order'" * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: dentry_kill() doesn't need the second argument now dealing with the rest of shrink_dentry_list() livelock shrink_dentry_list(): take parent's ->d_lock earlier expand dentry_kill(dentry, 0) in shrink_dentry_list() split dentry_kill() lift the "already marked killed" case into shrink_dentry_list()	2014-05-30 09:52:55 -07:00

1 2 3 4 5 ...

442770 Commits