linux

Author	SHA1	Message	Date
Linus Torvalds	d46d0256cd	Merge branch 'akpm' (patches from Andrew) Merge various fixes from Andrew Morton: "10 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm, page_alloc: recalculate the preferred zoneref if the context can ignore memory policies mm, page_alloc: reset zonelist iterator after resetting fair zone allocation policy mm, oom_reaper: do not use siglock in try_oom_reaper() mm, page_alloc: prevent infinite loop in buffered_rmqueue() checkpatch: reduce git commit description style false positives mm/z3fold.c: avoid modifying HEADLESS page and minor cleanup memcg: add RCU locking around css_for_each_descendant_pre() in memcg_offline_kmem() mm: check the return value of lookup_page_ext for all call sites kdump: fix dmesg gdbmacro to work with record based printk mm: fix overflow in vm_map_ram()	2016-06-04 10:51:29 -07:00
Al Viro	fac7d1917d	fix EOPENSTALE bug in do_last() EOPENSTALE occuring at the last component of a trailing symlink ends up with do_last() retrying its lookup. After the symlink body has been discarded. The thing is, all this retry_lookup logics in there is not needed at all - the upper layers will do the right thing if we simply return that -EOPENSTALE as we would with any other error. Trying to microoptimize in do_last() is a lot of headache for no good reason. Cc: stable@vger.kernel.org # v4.2+ Tested-by: Oleg Drokin <green@linuxhacker.ru> Reviewed-and-Tested-by: Jeff Layton <jlayton@poochiereds.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-06-04 11:59:04 -04:00
Geert Uytterhoeven	182fd9eecb	MAINTAINERS: Add file patterns for wireless device tree bindings Submitters of device tree binding documentation may forget to CC the subsystem maintainer if this is missing. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kalle Valo <kvalo@codeaurora.org> Cc: linux-wireless@vger.kernel.org Signed-off-by: Kalle Valo <kvalo@codeaurora.org>	2016-06-04 17:24:02 +03:00
David S. Miller	76f21b9900	net: Add docbook description for 'mtu' arg to skb_gso_validate_mtu() Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 22:56:28 -07:00
David S. Miller	3b55a537d0	sctp: Fix warning in sctp_packet_transmit_chunk() size_t objects should be printed with %Z printf format. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 22:53:26 -07:00
Yuval Mintz	81b1251d3b	qed: Fix next-ptr chains for BE / 32-bit Commit `a91eb52abb` ("qed: Revisit chain implementation") contains an incorrect implementation for BE platforms, as device's regpairs containing addresses are LE and they're not converted correctly when read back. In addition, it raises a compilation warning for 32-bit platforms where dma_addr_t is a 32-bit variable. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 22:42:26 -07:00
David S. Miller	03c7f70bee	Merge branch 'qed-roce-iscsi' Yuval Mintz says: ==================== qed: RocE & iSCSI infrastructure We plan on sending 2 new protocol drivers in the imminent future - both our RoCE [qedr] and iSCSI [qedi] drivers. As both submissions would be rather massive and in order to avoid collisions between them, the common infrastructure on the qed side was prepared as an independent patch-series to be sent ahead of those 2 submissions. This patch series introduces in QED 2 new 'ids' - one for iscsi and one for roce. It then goes and adds logic required for configuring said protocols in HW. Notice it doesn't actually add any client using said ids, but rather only the infrastructure to allow their later usage. What this patch doesn't contain is the slowpath protocol-configuration toward the firmware. I.e., it contains register-setting logic, memory allocations, etc., but not actual flow-related configuration specific to the protocl. Those would be sent as part of the protocol driver submissions. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 20:08:45 -04:00
Yuval Mintz	dbb799c397	qed: Initialize hardware for new protocols RoCE and iSCSI would require some added/changed hw configuration in order to properly run; The biggest single change being the requirement of allocating and mapping host memory for several HW blocks that aren't being used by qede [SRC, QM, TM, etc.]. In addition, whereas qede is only using context memory for HW blocks, the new protocol would also require task memories to be added. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 20:08:40 -04:00
Yuval Mintz	c5ac93191d	qed: Add iscsi/rdma personalities This patch adds in the ecore 2 new personalities in addition to QED_PCI_ETH - QED_PCI_ISCSI and QED_PCI_ETH_ROCE. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 20:08:39 -04:00
Yuval Mintz	7a9b6b8f6e	qed: Add common HSI for new protocols This adds the qed portion of the RoCE & iSCSI firmware HSI, as well as adding several new common HSI files which would be required by both qed and qed* protocols. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 20:08:39 -04:00
Yuval Mintz	a91eb52abb	qed: Revisit chain implementation RoCE driver is going to need a 32-bit chain [current chain implementation for qed* currently supports only 16-bit producer/consumer chains]. This patch adds said support, as well as doing other slight tweaks and modifications to qed's chain API. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 20:08:39 -04:00
David S. Miller	e7eacc9e0b	Merge branch 'mediatek-fixes' John Crispin says: ==================== net-next: mediatek: improve phy support The current driver did not handle the RGMII delay modes and asymmetric flow control properly. The mii_bus is not freed properly. Also add support for fixed-phy allowing the driver to work on SoCs that have an internal gigabit switch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:54:23 -04:00
John Crispin	37920fce0f	net-next: mediatek: properly handle RGMII modes If an external Gigabit PHY is connected to either of the MACs we need to be able to tell the PHY to use a delay. Not doing so will result in heavy packet loss and/or data corruption when using PHYs such as the IC+ IP1001. We tell the PHY which MII delay mode to use via the devictree. The ethernet driver needs to be adapted to handle all 3 rgmii-*id modes in the same way as normal rgmii when setting up the MAC. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:54:16 -04:00
John Crispin	0c72c50f6f	net-next: mediatek: add fixed-phy support The MT7623 SoC has a builtin gigabit switch. If we want to use it, GMAC1 needs to be configured using a fixed link speed and flow control settings. The easiest way to do this is to used the fixed-phy driver, allowing us to reuse the existing mdio polling code to setup the MAC. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:54:16 -04:00
John Crispin	08ef55c6f2	net-next: mediatek: fix gigabit and flow control advertisement The current code will not setup the PHYs advertisement features correctly. Fix this and properly advertise Gigabit features and properly handle asymmetric pause frames. Signed-off-by: Sean Wang <keyhaede@gmail.com> Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:54:16 -04:00
John Crispin	207bdf1844	net-next: mediatek: use mdiobus_free() in favour of kfree() The driver currently uses kfree() to clear the mii_bus. This is not the correct way to clear the memory and mdiobus_free() should be used instead. This patch fixes the two instances where this happens in the driver. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:54:16 -04:00
Ivan Khoronzhuk	330348d942	net: ethernet: ti: cpsw: remove unused priv lock There is no reason in this lock. At least for now. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:43:26 -04:00
Joe Perches	9b6d53985f	rxrpc: Use pr_<level> and pr_fmt, reduce object size a few KB Use the more common kernel logging style and reduce object size. The logging message prefix changes from a mixture of "RxRPC:" and "RXRPC:" to "af_rxrpc: ". $ size net/rxrpc/built-in.o* text data bss dec hex filename 64172 1972 8304 74448 122d0 net/rxrpc/built-in.o.new 67512 1972 8304 77788 12fdc net/rxrpc/built-in.o.old Miscellanea: o Consolidate the ASSERT macros to use a single pr_err call with decimal and hexadecimal output and a stringified #OP argument Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:41:31 -04:00
Haiyang Zhang	cb2911fed6	hv_netvsc: Fix VF register on vlan devices Added a condition to avoid vlan devices with same MAC registering as VF. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:40:05 -04:00
David S. Miller	db54275dff	Merge branch 'sctp-gso' Marcelo Ricardo Leitner says: ==================== sctp: Add GSO support This patchset adds sctp GSO support. Performance tests indicates that increases throughput by 10% if using bigger chunk sizes, specially if bigger than MTU. For small chunks, it doesn't help much if not using heavy firewall rules. For small chunks it will probably be of more use once we get something like MSG_MORE as David Laight had suggested. overall changes: v1->v2: Added support for receiving GSO frames on SCTP stack, as requested by Dave Miller. v2->v3: Consider sctphdr size in skb_gso_transport_seglen() rebased due to `5c7cdf339a` ("gso: Remove arbitrary checks for unsupported GSO") ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:37:26 -04:00
Marcelo Ricardo Leitner	942b3235bf	sctp: improve debug message to also log curr pkt and new chunk size This is useful for debugging packet sizes. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Tested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:37:22 -04:00
Marcelo Ricardo Leitner	90017accff	sctp: Add GSO support SCTP has this pecualiarity that its packets cannot be just segmented to (P)MTU. Its chunks must be contained in IP segments, padding respected. So we can't just generate a big skb, set gso_size to the fragmentation point and deliver it to IP layer. This patch takes a different approach. SCTP will now build a skb as it would be if it was received using GRO. That is, there will be a cover skb with protocol headers and children ones containing the actual segments, already segmented to a way that respects SCTP RFCs. With that, we can tell skb_segment() to just split based on frag_list, trusting its sizes are already in accordance. This way SCTP can benefit from GSO and instead of passing several packets through the stack, it can pass a single large packet. v2: - Added support for receiving GSO frames, as requested by Dave Miller. - Clear skb->cb if packet is GSO (otherwise it's not used by SCTP) - Added heuristics similar to what we have in TCP for not generating single GSO packets that fills cwnd. v3: - consider sctphdr size in skb_gso_transport_seglen() - rebased due to `5c7cdf339a` ("gso: Remove arbitrary checks for unsupported GSO") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Tested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:37:21 -04:00
Marcelo Ricardo Leitner	3acb50c18d	sctp: delay as much as possible skb_linearize This patch is a preparation for the GSO one. In order to successfully handle GSO packets on rx path we must not call skb_linearize, otherwise it defeats any gain GSO may have had. This patch thus delays as much as possible the call to skb_linearize, leaving it to sctp_inq_pop() moment. For that the sanity checks performed now know how to deal with fragments. One positive side-effect of this is that if the socket is backlogged it will have the chance of doing it on backlog processing instead of during softirq. With this move, it's evident that a check for non-linearity in sctp_inq_pop was ineffective and is now removed. Note that a similar check is performed a bit below this one. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Tested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:37:21 -04:00
Marcelo Ricardo Leitner	ae7ef81ef0	skbuff: introduce skb_gso_validate_mtu skb_gso_network_seglen is not enough for checking fragment sizes if skb is using GSO_BY_FRAGS as we have to check frag per frag. This patch introduces skb_gso_validate_mtu, based on the former, which will wrap the use case inside it as all calls to skb_gso_network_seglen were to validate if it fits on a given TMU, and improve the check. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Tested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:37:21 -04:00
Marcelo Ricardo Leitner	3953c46c3a	sk_buff: allow segmenting based on frag sizes This patch allows segmenting a skb based on its frags sizes instead of based on a fixed value. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Tested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:37:21 -04:00
Marcelo Ricardo Leitner	57c0565039	skbuff: export skb_gro_receive sctp GSO requires it and sctp can be compiled as a module, so we need to export this function. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Tested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:37:21 -04:00
Marcelo Ricardo Leitner	f6c382fc55	loopback: make use of NETIF_F_GSO_SOFTWARE NETIF_F_GSO_SOFTWARE was defined to list all GSO software types, so lets make use of it in loopback code. Note that veth/vxlan/others already uses it. Within this patch series, this patch causes lo to pick up SCTP GSO feature automatically (as it's added to NETIF_F_GSO_SOFTWARE) and thus avoiding segmentation if possible. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Tested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:37:21 -04:00
Ivan Khoronzhuk	8478b6cdc1	net: ethernet: ti: cpsw: fix rx-usecs interrupt pacing consistency The rx-usecs shouldn't be changed while interface down/up. Currently, for instance, if it's set to 100us, after interface down/up it's 500us. It's a hidden bug that can lead to lavish interrupt pacing time increasing while "down/up" up to max value. Steps to reproduce: - set rx-usecs to be 100us - down/up interface - read new unexpected rx-usecs Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:35:06 -04:00
Yangbo Lu	9c8b0778e4	gianfar: fix the last transmit buffer descriptor When the transmit hardware timestamping is enabled, an additional TxBD would be added and would be set as the last TxBD with TXBD_LAST and TXBD_INTERRUPT. However this has been broken by a patch recently. This made the software couldn't get transmit hardware timestamps and resulted in call trace. So, this patch is to fix this issue. Fixes: `48963b4492` ("gianfar: Remove redundant ops for do_tstamp from xmit()") Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:32:23 -04:00
Bhaktipriya Shridhar	f2edc4e1b0	net: fjes: fjes_main: Remove create_workqueue alloc_workqueue replaces deprecated create_workqueue(). The workqueue adapter->txrx_wq has workitem &adapter->raise_intr_rxdata_task per adapter. Extended Socket Network Device is shared memory based, so someone's transmission denotes other's reception. raise_intr_rxdata_task raises interruption of receivers from the sender in order to notify receivers. The workqueue adapter->control_wq has workitem &adapter->interrupt_watch_task per adapter. interrupt_watch_task is used to prevent delay of interrupts. Dedicated workqueues have been used in both cases since the workitems on the workqueues are involved in normal device operation and require forward progress under memory pressure. max_active has been set to 0 since there is no need for throttling the number of active work items. Since network devices may be used for memory reclaim, WQ_MEM_RECLAIM has been set to guarantee forward progress. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:29:42 -04:00
WANG Cong	8d5958f424	sch_tbf: update backlog as well Fixes: `2ccccf5fb4` ("net_sched: update hierarchical backlog too") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:24:04 -04:00
WANG Cong	d7f4f332f0	sch_red: update backlog as well Fixes: `2ccccf5fb4` ("net_sched: update hierarchical backlog too") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:24:04 -04:00
WANG Cong	6a73b571b6	sch_drr: update backlog as well Fixes: `2ccccf5fb4` ("net_sched: update hierarchical backlog too") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:24:04 -04:00
WANG Cong	6529d75ad9	sch_prio: update backlog as well We need to update backlog too when we update qlen. Joint work with Stas. Reported-by: Stas Nichiporovich <stasn77@gmail.com> Tested-by: Stas Nichiporovich <stasn77@gmail.com> Fixes: `2ccccf5fb4` ("net_sched: update hierarchical backlog too") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:24:04 -04:00
WANG Cong	357cc9b4a8	sch_hfsc: always keep backlog updated hfsc updates backlog lazily, that is only when we dump the stats. This is problematic after we begin to update backlog in qdisc_tree_reduce_backlog(). Reported-by: Stas Nichiporovich <stasn77@gmail.com> Tested-by: Stas Nichiporovich <stasn77@gmail.com> Fixes: `2ccccf5fb4` ("net_sched: update hierarchical backlog too") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-06-03 19:24:04 -04:00
Linus Torvalds	8c52b6dcdd	Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: - a few simple fixes for fallout from the recent gic-v3 changes - a workaround for a Cavium thunderX erratum - a bugfix for the pic32 irqchip to make external interrupts work proper - a missing return value in the generic IPI management code * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/irq-pic32-evic: Fix bug with external interrupts. irqchip/gicv3-its: numa: Enable workaround for Cavium thunderx erratum 23144 irqchip/gic-v3: Fix quiescence check in gic_enable_redist irqchip/gic-v3: Fix copy+paste mistakes in defines irqchip/gic-v3: Fix ICC_SGI1R_EL1.INTID decoding mask genirq: Fix missing return value in irq_destroy_ipi()	2016-06-03 16:12:35 -07:00
Mel Gorman	e46e7b77c9	mm, page_alloc: recalculate the preferred zoneref if the context can ignore memory policies The optimistic fast path may use cpuset_current_mems_allowed instead of of a NULL nodemask supplied by the caller for cpuset allocations. The preferred zone is calculated on this basis for statistic purposes and as a starting point in the zonelist iterator. However, if the context can ignore memory policies due to being atomic or being able to ignore watermarks then the starting point in the zonelist iterator is no longer correct. This patch resets the zonelist iterator in the allocator slowpath if the context can ignore memory policies. This will alter the zone used for statistics but only after it is known that it makes sense for that context. Resetting it before entering the slowpath would potentially allow an ALLOC_CPUSET allocation to be accounted for against the wrong zone. Note that while nodemask is not explicitly set to the original nodemask, it would only have been overwritten if cpuset_enabled() and it was reset before the slowpath was entered. Link: http://lkml.kernel.org/r/20160602103936.GU2527@techsingularity.net Fixes: `c33d6c06f6` ("mm, page_alloc: avoid looking up the first zone in a zonelist twice") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-06-03 16:02:57 -07:00
Mel Gorman	0d0bd89435	mm, page_alloc: reset zonelist iterator after resetting fair zone allocation policy Geert Uytterhoeven reported the following problem that bisected to commit `c33d6c06f6` ("mm, page_alloc: avoid looking up the first zone in a zonelist twice") on m68k/ARAnyM BUG: scheduling while atomic: cron/668/0x10c9a0c0 Modules linked in: CPU: 0 PID: 668 Comm: cron Not tainted 4.6.0-atari-05133-gc33d6c06f60f710f #364 Call Trace: [<0003d7d0>] __schedule_bug+0x40/0x54 __schedule+0x312/0x388 __schedule+0x0/0x388 prepare_to_wait+0x0/0x52 schedule+0x64/0x82 schedule_timeout+0xda/0x104 set_next_entity+0x18/0x40 pick_next_task_fair+0x78/0xda io_schedule_timeout+0x36/0x4a bit_wait_io+0x0/0x40 bit_wait_io+0x12/0x40 __wait_on_bit+0x46/0x76 wait_on_page_bit_killable+0x64/0x6c bit_wait_io+0x0/0x40 wake_bit_function+0x0/0x4e __lock_page_or_retry+0xde/0x124 do_scan_async+0x114/0x17c lookup_swap_cache+0x24/0x4e handle_mm_fault+0x626/0x7de find_vma+0x0/0x66 down_read+0x0/0xe wait_on_page_bit_killable_timeout+0x77/0x7c find_vma+0x16/0x66 do_page_fault+0xe6/0x23a res_func+0xa3c/0x141a buserr_c+0x190/0x6d4 res_func+0xa3c/0x141a buserr+0x20/0x28 res_func+0xa3c/0x141a buserr+0x20/0x28 The relationship is not obvious but it's due to a failure to rescan the full zonelist after the fair zone allocation policy exhausts the batch count. While this is a functional problem, it's also a performance issue. A page allocator microbenchmark showed the following 4.7.0-rc1 4.7.0-rc1 vanilla reset-v1r2 Min alloc-odr0-1 327.00 ( 0.00%) 326.00 ( 0.31%) Min alloc-odr0-2 235.00 ( 0.00%) 235.00 ( 0.00%) Min alloc-odr0-4 198.00 ( 0.00%) 198.00 ( 0.00%) Min alloc-odr0-8 170.00 ( 0.00%) 170.00 ( 0.00%) Min alloc-odr0-16 156.00 ( 0.00%) 156.00 ( 0.00%) Min alloc-odr0-32 150.00 ( 0.00%) 150.00 ( 0.00%) Min alloc-odr0-64 146.00 ( 0.00%) 146.00 ( 0.00%) Min alloc-odr0-128 145.00 ( 0.00%) 145.00 ( 0.00%) Min alloc-odr0-256 155.00 ( 0.00%) 155.00 ( 0.00%) Min alloc-odr0-512 168.00 ( 0.00%) 165.00 ( 1.79%) Min alloc-odr0-1024 175.00 ( 0.00%) 174.00 ( 0.57%) Min alloc-odr0-2048 180.00 ( 0.00%) 180.00 ( 0.00%) Min alloc-odr0-4096 187.00 ( 0.00%) 186.00 ( 0.53%) Min alloc-odr0-8192 190.00 ( 0.00%) 190.00 ( 0.00%) Min alloc-odr0-16384 191.00 ( 0.00%) 191.00 ( 0.00%) Min alloc-odr1-1 736.00 ( 0.00%) 445.00 ( 39.54%) Min alloc-odr1-2 343.00 ( 0.00%) 335.00 ( 2.33%) Min alloc-odr1-4 277.00 ( 0.00%) 270.00 ( 2.53%) Min alloc-odr1-8 238.00 ( 0.00%) 233.00 ( 2.10%) Min alloc-odr1-16 224.00 ( 0.00%) 218.00 ( 2.68%) Min alloc-odr1-32 210.00 ( 0.00%) 208.00 ( 0.95%) Min alloc-odr1-64 207.00 ( 0.00%) 203.00 ( 1.93%) Min alloc-odr1-128 276.00 ( 0.00%) 202.00 ( 26.81%) Min alloc-odr1-256 206.00 ( 0.00%) 202.00 ( 1.94%) Min alloc-odr1-512 207.00 ( 0.00%) 202.00 ( 2.42%) Min alloc-odr1-1024 208.00 ( 0.00%) 205.00 ( 1.44%) Min alloc-odr1-2048 213.00 ( 0.00%) 212.00 ( 0.47%) Min alloc-odr1-4096 218.00 ( 0.00%) 216.00 ( 0.92%) Min alloc-odr1-8192 341.00 ( 0.00%) 219.00 ( 35.78%) Note that order-0 allocations are unaffected but higher orders get a small boost from this patch and a large reduction in system CPU usage overall as can be seen here: 4.7.0-rc1 4.7.0-rc1 vanilla reset-v1r2 User 85.32 86.31 System 2221.39 2053.36 Elapsed 2368.89 2202.47 Fixes: `c33d6c06f6` ("mm, page_alloc: avoid looking up the first zone in a zonelist twice") Link: http://lkml.kernel.org/r/20160531100848.GR2527@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-06-03 16:02:56 -07:00
Michal Hocko	cbdcf7f789	mm, oom_reaper: do not use siglock in try_oom_reaper() Oleg has noted that siglock usage in try_oom_reaper is both pointless and dangerous. signal_group_exit can be checked lockless. The problem is that sighand becomes NULL in __exit_signal so we can crash. Fixes: `3ef22dfff2` ("oom, oom_reaper: try to reap tasks which skip regular OOM killer path") Link: http://lkml.kernel.org/r/1464679423-30218-1-git-send-email-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Suggested-by: Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-06-03 16:02:56 -07:00
Vlastimil Babka	83b9355bf6	mm, page_alloc: prevent infinite loop in buffered_rmqueue() In DEBUG_VM kernel, we can hit infinite loop for order == 0 in buffered_rmqueue() when check_new_pcp() returns 1, because the bad page is never removed from the pcp list. Fix this by removing the page before retrying. Also we don't need to check if page is non-NULL, because we simply grab it from the list which was just tested for being non-empty. Fixes: `479f854a20` ("mm, page_alloc: defer debugging checks of pages allocated from the PCP") Link: http://lkml.kernel.org/r/20160530090154.GM2527@techsingularity.net Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-06-03 16:02:56 -07:00
Joe Perches	879be4f378	checkpatch: reduce git commit description style false positives Some lines in a commit log appear to be commit SHA1 ids like: ERROR: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 0123456789ab ("commit description")' Link: http://lkml.kernel.org/r/40e03fd7aaf1f55c75d787128d6d17c5a71226c2.1464358556.git.vdavydov@virtuozzo.com Reduce the false positives. Link: http://lkml.kernel.org/r/eda977eaa8328fef42bb3c87935d97e10ea8ff67.1464384023.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-06-03 16:02:56 -07:00
Vitaly Wool	43afc19417	mm/z3fold.c: avoid modifying HEADLESS page and minor cleanup Fix erroneous z3fold header access in a HEADLESS page in reclaim function, and change one remaining direct handle-to-buddy conversion to use the appropriate helper. Link: http://lkml.kernel.org/r/5748706F.9020208@gmail.com Signed-off-by: Vitaly Wool <vitalywool@gmail.com> Reviewed-by: Dan Streetman <ddstreet@ieee.org> Cc: Seth Jennings <sjenning@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-06-03 16:02:55 -07:00
Tejun Heo	3a06bb78ce	memcg: add RCU locking around css_for_each_descendant_pre() in memcg_offline_kmem() memcg_offline_kmem() may be called from memcg_free_kmem() after a css init failure. memcg_free_kmem() is a ->css_free callback which is called without cgroup_mutex and memcg_offline_kmem() ends up using css_for_each_descendant_pre() without any locking. Fix it by adding rcu read locking around it. mkdir: cannot create directory `65530': No space left on device =============================== [ INFO: suspicious RCU usage. ] 4.6.0-work+ #321 Not tainted ------------------------------- kernel/cgroup.c:4008 cgroup_mutex or RCU read lock required! [ 527.243970] other info that might help us debug this: [ 527.244715] rcu_scheduler_active = 1, debug_locks = 0 2 locks held by kworker/0:5/1664: #0: ("cgroup_destroy"){.+.+..}, at: [<ffffffff81060ab5>] process_one_work+0x165/0x4a0 #1: ((&css->destroy_work)#3){+.+...}, at: [<ffffffff81060ab5>] process_one_work+0x165/0x4a0 [ 527.248098] stack backtrace: CPU: 0 PID: 1664 Comm: kworker/0:5 Not tainted 4.6.0-work+ #321 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014 Workqueue: cgroup_destroy css_free_work_fn Call Trace: dump_stack+0x68/0xa1 lockdep_rcu_suspicious+0xd7/0x110 css_next_descendant_pre+0x7d/0xb0 memcg_offline_kmem.part.44+0x4a/0xc0 mem_cgroup_css_free+0x1ec/0x200 css_free_work_fn+0x49/0x5e0 process_one_work+0x1c5/0x4a0 worker_thread+0x49/0x490 kthread+0xea/0x100 ret_from_fork+0x1f/0x40 Link: http://lkml.kernel.org/r/20160526203018.GG23194@mtj.duckdns.org Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: <stable@vger.kernel.org> [4.5+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-06-03 16:02:55 -07:00
Linus Torvalds	2c22132563	Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer bugfix from Thomas Gleixner: "A single bugfix for the error check wreckage we introduced in the merge window" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: time: Make settimeofday error checking work again	2016-06-03 15:37:27 -07:00
Yang Shi	f86e427197	mm: check the return value of lookup_page_ext for all call sites Per the discussion with Joonsoo Kim [1], we need check the return value of lookup_page_ext() for all call sites since it might return NULL in some cases, although it is unlikely, i.e. memory hotplug. Tested with ltp with "page_owner=0". [1] http://lkml.kernel.org/r/20160519002809.GA10245@js1304-P5Q-DELUXE [akpm@linux-foundation.org: fix build-breaking typos] [arnd@arndb.de: fix build problems from lookup_page_ext] Link: http://lkml.kernel.org/r/6285269.2CksypHdYp@wuerfel [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1464023768-31025-1-git-send-email-yang.shi@linaro.org Signed-off-by: Yang Shi <yang.shi@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-06-03 15:06:22 -07:00
Corey Minyard	d8bae33ddd	kdump: fix dmesg gdbmacro to work with record based printk Commit `7ff9554bb5` ("printk: convert byte-buffer to variable-length record buffer") introduced a record based printk buffer. Modify gdbmacros.txt to parse this new structure so dmesg will work properly. Link: http://lkml.kernel.org/r/1463515794-1599-1-git-send-email-minyard@acm.org Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: Dave Young <dyoung@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-06-03 15:06:22 -07:00
Guillermo Julián Moreno	65ee03c4b9	mm: fix overflow in vm_map_ram() When remapping pages accounting for 4G or more memory space, the operation 'count << PAGE_SHIFT' overflows as it is performed on an integer. Solution: cast before doing the bitshift. [akpm@linux-foundation.org: fix vm_unmap_ram() also] [akpm@linux-foundation.org: fix vmap() as well, per Guillermo] Link: http://lkml.kernel.org/r/etPan.57175fb3.7a271c6b.2bd@naudit.es Signed-off-by: Guillermo Julián Moreno <guillermo.julian@naudit.es> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-06-03 15:06:22 -07:00
Linus Torvalds	e603330c86	Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm Pull ARM fix from Russell King: "Just one fix to the ptrace code, spotted by Simon Marchi, where if a thread migrates to a different CPU and the VFP registers are changed through ptrace, the application doesn't see the updated VFP registers" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: fix PTRACE_SETVFPREGS on SMP systems	2016-06-03 14:39:29 -07:00
Linus Torvalds	d29e472301	arm64 fixes: - Revert a previous revert and get hugetlb going with contiguous hints - Wire up missing compat syscalls - Enable CONFIG_SET_MODULE_RONX by default - Add missing line to our compat /proc/cpuinfo output - Clarify levels in our page table dumps - Fix booting with RANDOMIZE_TEXT_OFFSET enabled - Misc fixes to the ARM CPU PMU driver (refcounting, probe failure) - Remove some dead code and update a comment -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCgAGBQJXUZzdAAoJELescNyEwWM0Fm4H/215TJBXcohdIpTYX/Nr3+Bh 5jOqMgEWKe6ipmXCnzWpZ1F1GHoxLxndSWhJbi1Gwpdc+EYdXWxT5Z/IsQMKZXOb n81RzJbit6hXlebZ99YbEjfokfpMATno/9pD8sr1F2Xg/LsrD00O/0aPk9Fp38vX VRK2TlaVdBQbF80rMI6fUEEVksbJBqS+jBlqwkRri9JPusP+F5F4Qg8f4VqXDSef ZDEBzaI1BjrIQhDGSf0baXQ9EvuyotT2VR1NKAbYzdK8JoLTYiacIDviU2TCsuYZ ujB3MTOUytxAJbFhC2KBiDJfukSNY8vg+fLFg7rM3qUj6TMw6ESe1goYVScljHY= =cAlD -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "The main thing here is reviving hugetlb support using contiguous ptes, which we ended up reverting at the last minute in 4.5 pending a fix which went into the core mm/ code during the recent merge window. - Revert a previous revert and get hugetlb going with contiguous hints - Wire up missing compat syscalls - Enable CONFIG_SET_MODULE_RONX by default - Add missing line to our compat /proc/cpuinfo output - Clarify levels in our page table dumps - Fix booting with RANDOMIZE_TEXT_OFFSET enabled - Misc fixes to the ARM CPU PMU driver (refcounting, probe failure) - Remove some dead code and update a comment" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: fix alignment when RANDOMIZE_TEXT_OFFSET is enabled arm64: move {PAGE,CONT}_SHIFT into Kconfig arm64: mm: dump: log span level arm64: update stale PAGE_OFFSET comment drivers/perf: arm_pmu: Avoid leaking pmu->irq_affinity on error drivers/perf: arm_pmu: Defer the setting of __oprofile_cpu_pmu drivers/perf: arm_pmu: Fix reference count of a device_node in of_pmu_irq_cfg arm64: report CPU number in bad_mode arm64: unistd32.h: wire up missing syscalls for compat tasks arm64: Provide "model name" in /proc/cpuinfo for PER_LINUX32 tasks arm64: enable CONFIG_SET_MODULE_RONX by default arm64: Remove orphaned __addr_ok() definition Revert "arm64: hugetlb: partial revert of 66b3923a1a0f"	2016-06-03 14:29:47 -07:00
Linus Torvalds	5306d766f1	powerpc fixes for 4.7 - Handle RTAS delay requests in configure_bridge from Russell Currey - Refactor the configure_bridge RTAS tokens from Russell Currey - Fix definition of SIAR and SDAR registers from Thomas Huth - Use privileged SPR number for MMCR2 from Thomas Huth - Update LPCR only if it is powernv from Aneesh Kumar K.V - Fix the reference bit update when handling hash fault from Aneesh Kumar K.V - Add missing tlb flush from Aneesh Kumar K.V - Add POWER8NVL support to ibm,client-architecture-support call from Thomas Huth -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXUWmzAAoJEFHr6jzI4aWAuEAQAKAWjT4JCIQMzc2Qk0LbBFhD Mw3xIt618g0PLRLN431rqk9xE1phh9ecSNt40kQV0UYypCwQyYaOzAAHq7QibRFh wFDLyrO10aiqTVBI9Dl7l6ZDespGP4ufOyj4GF3vIndTQ8CLT6+Jj/cfaT8lNIna vOp0Q8RKoSuqJu7wOZwxGHc9vhdpDftBOi2Yl2KHWrWUMTVtyDzgRwmFiCk/9SvX +xYVGfbCRkU1b05X9RgbsbMdiAzC7odzAC8e4VNHqB+YHgBFenlOKcx6uuz5Oubv /0dGJq1OEKxKgkEWZsXMhfcezI1i6H1qa5gwqO2pJn/mQiWUx/IxxC5j1tGMcgXJ tU4cJtMccUmB22bOTqJsGhQJyHh09UDql2PIAHdvTZplYBj34R+2fr6LMRC5TUyY yXsk/26CAh1ihPK8juytK8D2f7CfFJVe96HXMz8yGWNy2MeaJHh3Nz7E2/hA6L6B GzT08EBGmGU/1/L4xJi1uuRLQXMgvWkuo0C65m9jqFah7y82gHetypTBSbLOBcUX +pJRcWouZlksbEHydMlbvNrqmJz1zPx9JJFJJrfDzkYUB/O61NxOpaQPZHfhMt01 Bd8n0wt7OPW9bZL9WrW5W11OhOxNZfuxcr5GmVnzsaALPazFThQm1gZH2NOp6Wr/ 9aWy/lZAuDOrtOnmfcZm =OkUs -----END PGP SIGNATURE----- Merge tag 'powerpc-4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Handle RTAS delay requests in configure_bridge from Russell Currey - Refactor the configure_bridge RTAS tokens from Russell Currey - Fix definition of SIAR and SDAR registers from Thomas Huth - Use privileged SPR number for MMCR2 from Thomas Huth - Update LPCR only if it is powernv from Aneesh Kumar K.V - Fix the reference bit update when handling hash fault from Aneesh Kumar K.V - Add missing tlb flush from Aneesh Kumar K.V - Add POWER8NVL support to ibm,client-architecture-support call from Thomas Huth * tag 'powerpc-4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries: Add POWER8NVL support to ibm,client-architecture-support call powerpc/mm/radix: Add missing tlb flush powerpc/mm/hash: Fix the reference bit update when handling hash fault powerpc/mm/radix: Update LPCR only if it is powernv powerpc: Use privileged SPR number for MMCR2 powerpc: Fix definition of SIAR and SDAR registers powerpc/pseries/eeh: Refactor the configure_bridge RTAS tokens powerpc/pseries/eeh: Handle RTAS delay requests in configure_bridge	2016-06-03 14:20:22 -07:00

... 6 7 8 9 10 ...

602372 Commits