linux

Author	SHA1	Message	Date
Shawn Lin	de8473f514	PCI: rockchip: Factor out rockchip_pcie_deinit_phys() Factor out rockchip_pcie_deinit_phys() so it can be reused by rockchip_pcie_suspend_noirq() and rockchip_pcie_remove(). No functional change intended. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2017-08-29 13:18:07 -05:00
Shawn Lin	41b70b2c6f	PCI: rockchip: Factor out rockchip_pcie_disable_clocks() Factor out rockchip_pcie_disable_clocks() so it can be reused by other functions. No functional change intended, but it does change the order of unpreparing clocks in the rockchip_pcie_resume_noirq() error path so it matches the other paths. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2017-08-29 13:18:07 -05:00
Shawn Lin	09df7bc40a	PCI: rockchip: Factor out rockchip_pcie_enable_clocks() Factor out rockchip_pcie_enable_clocks() so it can be reused by rockchip_pcie_resume_noirq() and rockchip_pcie_probe(). No functional change intended, but it does change the order of unpreparing clocks in the rockchip_pcie_resume_noirq() error path. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2017-08-29 13:18:07 -05:00
Shawn Lin	6341f8052e	PCI: rockchip: Factor out rockchip_pcie_setup_irq() Factor out rockchip_pcie_setup_irq() to prepare for future bug fixes. No functional change intended. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2017-08-29 13:18:06 -05:00
Fabio Estevam	bf2b3312ed	PCI: rockchip: Use gpiod_set_value_cansleep() to allow reset via expanders The reset GPIO can be connected to a I2C or SPI IO expander, which may sleep, so it is safer to use the gpiod_set_value_cansleep() variant instead. Signed-off-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com>	2017-08-29 13:18:06 -05:00
Paul Burton	62f9ee98e1	PCI: rockchip: Use PCI_NUM_INTX Use the PCI_NUM_INTX macro to indicate the number of PCI INTx interrupts rather than the magic number 4. This makes it clearer where the number comes from & what it relates to. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Heiko Stuebner <heiko@sntech.de> Cc: Shawn Lin <shawn.lin@rock-chips.com>	2017-08-29 13:18:06 -05:00
Philipp Zabel	18aca19722	PCI: rockchip: Explicitly request exclusive reset control Commit `a53e35db70` ("reset: Ensure drivers are explicit when requesting reset lines") started to transition the reset control request API calls to explicitly state whether the driver needs exclusive or shared reset control behavior. Convert all drivers requesting exclusive resets to the explicit API call so the temporary transition helpers can be removed. No functional changes. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com>	2017-08-29 13:18:06 -05:00
Shawn Lin	05b57273ac	dt-bindings: phy-rockchip-pcie: Convert to per-lane PHY model Deprecate the legacy Rockchip PCIe PHY and encourage users to use per-lane PHY mode by setting #phy-cells to 1. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Acked-by: Rob Herring <robh@kernel.org>	2017-08-29 13:18:05 -05:00
Shawn Lin	7a55b57031	dt-bindings: PCI: rockchip: Convert to per-lane PHY model Deprecate legacy PHY model and encourage per-lane PHY model. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Acked-by: Rob Herring <robh@kernel.org>	2017-08-29 13:18:05 -05:00
Shawn Lin	e9a60cac89	arm64: dts: rockchip: convert PCIe to use per-lane PHYs for rk3339 Convert all RK3399 platforms to use per-lane PHY model in order to save more power by idling unused lane(s). Tested-by: Jeffy Chen <jeffy.chen@rock-chips.com> Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Brian Norris <briannorris@chromium.org>	2017-08-29 13:18:05 -05:00
Linus Torvalds	36fde05f3f	Merge branch 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fix from Tejun Heo: "A late but obvious fix for cgroup. I broke the 'cpuset.memory_pressure' file a long time ago (v4.4) by accidentally deleting its file index, which made it a duplicate of the 'cpuset.memory_migrate' file. Spotted and fixed by Waiman" * 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cpuset: Fix incorrect memory_pressure control file mapping	2017-08-29 11:16:21 -07:00
Linus Torvalds	31a3faf322	Merge branch 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: "Late fixes for libata. There's a minor platform driver fix but the important one is READ LOG PAGE. This is a new ATA command which is used to test some optional features but it broke probing of some devices - they locked up instead of failing the unknown command. Christoph tried blacklisting, but, after finding out there are multiple devices which fail this way, backed off to testing feature bit in IDENTIFY data first, which is a bit lossy (we can miss features on some devices) but should be a lot safer" * 'for-4.13-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: Revert "libata: quirk read log on no-name M.2 SSD" libata: check for trusted computing in IDENTIFY DEVICE data libata: quirk read log on no-name M.2 SSD sata: ahci-da850: Fix some error handling paths in 'ahci_da850_probe()'	2017-08-29 11:13:52 -07:00
David S. Miller	7619de85d0	Merge tag 'wireless-drivers-next-for-davem-2017-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for 4.14 rsi driver is getting a lot of new features lately, but as usual active development happening on iwlwifi as well as other drivers. I pulled wireless-drivers to fix multiple conflicts in iwlwifi and to make it easier further development. Major changes: ath10k * initial UBS bus support (no full support yet) * add tdls support for 10.4 firmware ath9k * add Dell Wireless 1802 wil6210 * support FW RSSI reporting rsi * support legacy power save, U-APSD, rf-kill and AP mode * RTS threshold configuration brcmfmac * support CYW4373 SDIO/USB chipset iwlwifi * some more code moved to a new directory * add new PCI ID for 7265D ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 11:04:43 -07:00
Arvind Yadav	22eac913fe	net: stmmac: constify clk_div_table clk_div_table are not supposed to change at runtime. meson8b_dwmac structure is working with const clk_div_table. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 10:56:42 -07:00
Xin Long	e8d411d298	ipv6: do not set sk_destruct in IPV6_ADDRFORM sockopt ChunYu found a kernel warn_on during syzkaller fuzzing: [40226.038539] WARNING: CPU: 5 PID: 23720 at net/ipv4/af_inet.c:152 inet_sock_destruct+0x78d/0x9a0 [40226.144849] Call Trace: [40226.147590] <IRQ> [40226.149859] dump_stack+0xe2/0x186 [40226.176546] __warn+0x1a4/0x1e0 [40226.180066] warn_slowpath_null+0x31/0x40 [40226.184555] inet_sock_destruct+0x78d/0x9a0 [40226.246355] __sk_destruct+0xfa/0x8c0 [40226.290612] rcu_process_callbacks+0xaa0/0x18a0 [40226.336816] __do_softirq+0x241/0x75e [40226.367758] irq_exit+0x1f6/0x220 [40226.371458] smp_apic_timer_interrupt+0x7b/0xa0 [40226.376507] apic_timer_interrupt+0x93/0xa0 The warn_on happned when sk->sk_rmem_alloc wasn't 0 in inet_sock_destruct. As after commit `f970bd9e3a` ("udp: implement memory accounting helpers"), udp has changed to use udp_destruct_sock as sk_destruct where it would udp_rmem_release all rmem. But IPV6_ADDRFORM sockopt sets sk_destruct with inet_sock_destruct after changing family to PF_INET. If rmem is not 0 at that time, and there is no place to release rmem before calling inet_sock_destruct, the warn_on will be triggered. This patch is to fix it by not setting sk_destruct in IPV6_ADDRFORM sockopt any more. As IPV6_ADDRFORM sockopt only works for tcp and udp. TCP sock has already set it's sk_destruct with inet_sock_destruct and UDP has set with udp_destruct_sock since they're created. Fixes: `f970bd9e3a` ("udp: implement memory accounting helpers") Reported-by: ChunYu Wang <chunwang@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 10:54:40 -07:00
David S. Miller	25d4dae1a6	Merge branch 'XDP-redirect-tracepoints' Jesper Dangaard Brouer says: ==================== XDP redirect tracepoints I feel this is as far as I can take the tracepoint infrastructure to assist XDP monitoring. Tracepoints comes with a base overhead of 25 nanosec for an attached bpf_prog, and 48 nanosec for using a full perf record. This is problematic for the XDP use-case, but it is very convenient to use the existing perf infrastructure. From a performance perspective, the real solution would be to attach another bpf_prog (that understand xdp_buff), but I'm not sure we want to introduce yet another bpf attach API for this. One thing left is to standardize the possible err return codes, to a limited set, to allow easier (and faster) mapping into a bpf map. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 10:51:29 -07:00
Jesper Dangaard Brouer	3ffab54602	samples/bpf: xdp_monitor tool based on tracepoints This tool xdp_monitor demonstrate how to use the different xdp_redirect tracepoints xdp_redirect{,_map}{,_err} from a BPF program. The default mode is to only monitor the error counters, to avoid affecting the per packet performance. Tracepoints comes with a base overhead of 25 nanosec for an attached bpf_prog, and 48 nanosec for using a full perf record (with non-matching filter). Thus, default loading the --stats mode could affect the maximum performance. This version of the tool is very simple and count all types of errors as one. It will be natural to extend this later with the different types of errors that can occur, which should help users quickly identify common mistakes. Because the TP_STRUCT was kept in sync all the tracepoints loads the same BPF code. It would also be natural to extend the map version to demonstrate how the map information could be used. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 10:51:29 -07:00
Jesper Dangaard Brouer	306da4e685	samples/bpf: xdp_redirect load XDP dummy prog on TX device For supporting XDP_REDIRECT, a device driver must (obviously) implement the "TX" function ndo_xdp_xmit(). An additional requirement is you cannot TX out a device, unless it also have a xdp bpf program attached. This dependency is caused by the driver code need to setup XDP resources before it can ndo_xdp_xmit. Update bpf samples xdp_redirect and xdp_redirect_map to automatically attach a dummy XDP program to the configured ifindex_out device. Use the XDP flag XDP_FLAGS_UPDATE_IF_NOEXIST on the dummy load, to avoid overriding an existing XDP prog on the device. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 10:51:29 -07:00
Jesper Dangaard Brouer	59a3089675	xdp: separate xdp_redirect tracepoint in map case Creating as specific xdp_redirect_map variant of the xdp tracepoints allow users to write simpler/faster BPF progs that get attached to these tracepoints. Goal is to still keep the tracepoints in xdp_redirect and xdp_redirect_map similar enough, that a tool can read the top part of the TP_STRUCT and produce similar monitor statistics. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 10:51:29 -07:00
Jesper Dangaard Brouer	f5836ca5e9	xdp: separate xdp_redirect tracepoint in error case There is a need to separate the xdp_redirect tracepoint into two tracepoints, for separating the error case from the normal forward case. Due to the extreme speeds XDP is operating at, loading a tracepoint have a measurable impact. Single core XDP REDIRECT (ethtool tuned rx-usecs 25) can do 13.7 Mpps forwarding, but loading a simple bpf_prog at the tracepoint (with a return 0) reduce perf to 10.2 Mpps (CPU E5-1650 v4 @ 3.60GHz, driver: ixgbe) The overhead of loading a bpf-based tracepoint can be calculated to cost 25 nanosec ((1/13782002-1/10267937)10^9 = -24.83 ns). Using perf record on the tracepoint event, with a non-matching --filter expression, the overhead is much larger. Performance drops to 8.3 Mpps, cost 48 nanosec ((1/13782002-1/8312497)10^9 = -47.74)) Having a separate tracepoint for err cases, which should be less frequent, allow running a continuous monitor for errors while not affecting the redirect forward performance (this have also been verified by measurements). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 10:51:29 -07:00
Jesper Dangaard Brouer	b06337dfdb	xdp: make xdp tracepoints report bpf prog id instead of prog_tag Given previous patch expose the map_id, it seems natural to also report the bpf prog id. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 10:51:29 -07:00
Jesper Dangaard Brouer	8d3b778ff5	xdp: tracepoint xdp_redirect also need a map argument To make sense of the map index, the tracepoint user also need to know that map we are talking about. Supply the map pointer but only expose the map->id. The 'to_index' is renamed 'to_ifindex'. In the xdp_redirect_map case, this is the result of the devmap lookup. The map lookup key is exposed as map_index, which is needed to troubleshoot in case the lookup failed. The 'to_ifindex' is placed after 'err' to keep TP_STRUCT as common as possible. This also keeps the TP_STRUCT similar enough, that userspace can write a monitor program, that doesn't need to care about whether bpf_redirect or bpf_redirect_map were used. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 10:51:28 -07:00
Jesper Dangaard Brouer	c31e5a4876	xdp: remove redundant argument to trace_xdp_redirect Supplying the action argument XDP_REDIRECT to the tracepoint xdp_redirect is redundant as it is only called in-case this action was specified. Remove the argument, but keep "act" member of the tracepoint struct and populate it with XDP_REDIRECT. This makes it easier to write a common bpf_prog processing events. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 10:51:28 -07:00
Baruch Siach	c27d88e10b	dt-binding: net/phy: fix interrupts description Commit `b053dc5a72` (powerpc: Refactor device tree binding) split the Ethernet PHY binding documentation out of the big booting-without-of.txt file, leaving a dangling reference to "section 2" in the 'interrupts' property description. Drop that reference, and make the description look more like the rest. While at it, make the example interrupt-parent phandle look more like a real world phandle, and use an IRQ_TYPE_ macro for the 'interrupts' type. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Rob Herring <robh@kernel.org>	2017-08-29 12:13:40 -05:00
Christoph Hellwig	c529594f93	bsg: remove #if 0'ed code Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-08-29 10:50:30 -06:00
Ben Hutchings	7de967e76f	mq-deadline: Enable auto-loading when built as module The block core requests modules with the "-iosched" name suffix, but mq-deadline does not have that suffix. Add an alias. Fixes: `945ffb60c1` ("mq-deadline: add blk-mq adaptation of the deadline ...") Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-08-29 10:47:23 -06:00
Ben Hutchings	26b4cf2497	bfq: Re-enable auto-loading when built as a module The block core requests modules with the "-iosched" name suffix, but bfq no longer has that suffix. Add an alias. Fixes: `ea25da4808` ("block, bfq: split bfq-iosched.c into multiple ...") Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-08-29 10:47:23 -06:00
David S. Miller	d0fcece770	Merge tag 'rxrpc-next-20170829' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Miscellany Here are a number of patches that make some changes/fixes and add a couple of extensions to AF_RXRPC for kernel services to use. The changes and fixes are: (1) Use time64_t rather than u32 outside of protocol or UAPI-representative structures. (2) Use the correct time stamp when loading a key from an XDR-encoded Kerberos 5 key. (3) Fix IPv6 support. (4) Fix some places where the error code is being incorrectly made positive before returning. (5) Remove some white space. And the extensions: (6) Add an end-of-Tx phase notification, thereby allowing kAFS to transition the state on its own call record at the correct point, rather than having to do it in advance and risk non-completion of the call in the wrong state. (7) Allow a kernel client call to be retried if it fails on a network error, thereby making it possible for kAFS to iterate over a number of IP addresses without having to reload the Tx queue and re-encrypt data each time. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 09:42:48 -07:00
David S. Miller	3d86e352c0	Merge branch 'addrlabel-no-rtnl-locking' Florian Westphal says: ==================== addrlabel: don't use rtnl locking addrlabel doesn't appear to require rtnl lock as the addrlabel table uses a spinlock to serialize add/delete operations. Also, entries are reference counted so it should be safe to call the rtnl ops without the rtnl mutex. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 09:41:56 -07:00
Florian Westphal	a6f57028d6	addrlabel: add/delete/get can run without rtnl There appears to be no need to use rtnl, addrlabel entries are refcounted and add/delete is serialized by the addrlabel table spinlock. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 09:41:56 -07:00
Florian Westphal	34504029b5	selftests: add addrlabel add/delete to rtnetlink.sh Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 09:41:56 -07:00
David S. Miller	04f1c4ad72	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2017-08-29 1) Fix dst_entry refcount imbalance when using socket policies. From Lorenzo Colitti. 2) Fix locking when adding the ESP trailers. 3) Fix tailroom calculation for the ESP trailer by using skb_tailroom instead of skb_availroom. 4) Fix some info leaks in xfrm_user. From Mathias Krause. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 09:37:06 -07:00
Greg Kroah-Hartman	6c766db60b	staging: irda: update MAINTAINERS Now that the IRDA code has moved under drivers/staging/irda/, update the MAINTAINERS file with the new location. Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 09:30:00 -07:00
Sathya Perla	f143647a02	bnxt_en: add a dummy definition for bnxt_vf_rep_get_fid() When bnxt VF-reps are not compiled in (CONFIG_BNXT_SRIOV is off) bnxt_tc.c needs a dummy definition of the routine bnxt_vf_rep_get_fid(). Reported-by: kbuild test robot <fengguang.wu@intel.com> Fixes: `2ae7408fed` ("bnxt_en: bnxt: add TC flower filter offload support") Signed-off-by: Sathya Perla <sathya.perla@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-08-29 09:28:42 -07:00
Lothar Waßmann	2d2a2b8c08	mtd: nand: complain loudly when chip->bits_per_cell is not correctly initialized chip->bits_per_cell which is used to determine the NAND cell type (SLC/MLC) should always have a value != 0. Complain loudly if the value is 0 in nand_is_slc() to catch use before correct initialization. Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>	2017-08-29 18:23:50 +02:00
Lothar Waßmann	69fc01296c	mtd: nand: make Samsung SLC NAND usable again commit `c51d0ac59f` ("mtd: nand: Move Samsung specific init/detection logic in nand_samsung.c") introduced a regression for Samsung SLC NAND chips. Prior to this commit chip->bits_per_cell was initialized by calling nand_get_bits_per_cell() before using nand_is_slc(). With the offending commit this call is skipped, leaving chip->bits_per_cell cleared to zero when the manufacturer specific '.detect' function calls nand_is_slc() which in turn interprets bits_per_cell != 1 as indication for an MLC chip. The effect is that e.g. a K9F1G08U0F NAND chip is falsely detected as MLC NAND with 4KiB page size rather than SLC with 2KiB page size. Add a call to nand_get_bits_per_cell() before calling the .detect hook function in nand_manufacturer_detect(), so that the nand_is_slc() calls in the manufacturer specific code will return correct results. Fixes: `c51d0ac59f` ("mtd: nand: Move Samsung specific init/detection logic in nand_samsung.c") Cc: <stable@vger.kernel.org> Signed-off-by: Lothar Waßmann <LW@KARO-electronics.de> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>	2017-08-29 18:22:33 +02:00
Linus Torvalds	785373b4c3	Revert "rmap: do not call mmu_notifier_invalidate_page() under ptl" This reverts commit `aac2fea94f`. It turns out that that patch was complete and utter garbage, and broke KVM, resulting in odd oopses. Quoting Andrea Arcangeli: "The aforementioned commit has 3 bugs. 1) mmu_notifier_invalidate_range cannot be used in replacement of mmu_notifier_invalidate_range_start/end. For KVM mmu_notifier_invalidate_range is a noop and rightfully so. A MMU notifier implementation has to implement either ->invalidate_range method or the invalidate_range_start/end methods, not both. And if you implement invalidate_range_start/end like KVM is forced to do, calling mmu_notifier_invalidate_range in common code is a noop for KVM. For those MMU notifiers that can get away only implementing ->invalidate_range, the ->invalidate_range is implicitly called by mmu_notifier_invalidate_range_end(). And only those secondary MMUs that share the same pagetable with the primary MMU (like AMD iommuv2) can get away only implementing ->invalidate_range. So all cases (THP on/off) are broken right now. To fix this is enough to replace mmu_notifier_invalidate_range with mmu_notifier_invalidate_range_start;mmu_notifier_invalidate_range_end. Either that or call multiple mmu_notifier_invalidate_page like before. 2) address + (1UL << compound_order(page) is buggy, it should be PAGE_SIZE << compound_order(page), it's bytes not pages, 2M not 512. 3) The whole invalidate_range thing was an attempt to call a single invalidate while walking multiple 4k ptes that maps the same THP (after a pmd virtual split without physical compound page THP split). It's unclear if the rmap_walk will always provide an address that is 2M aligned as parameter to try_to_unmap_one, in presence of THP. I think it needs also an address &= (PAGE_SIZE << compound_order(page)) - 1 to be safe" In general, we should stop making excuses for horrible MMU notifier users. It's much more important that the core VM is sane and safe, than letting MMU notifiers sleep. So if some MMU notifier is sleeping under a spinlock, we need to fix the notifier, not try to make excuses for that garbage in the core VM. Reported-and-tested-by: Bernhard Held <berny156@gmx.de> Reported-and-tested-by: Adam Borowski <kilobyte@angband.pl> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Wanpeng Li <kernellwp@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: axie <axie@amd.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-08-29 09:11:06 -07:00
Damien Le Moal	5034435c84	block: Make blk_dequeue_request() static The only caller of this function is blk_start_request() in the same file. Fix blk_start_request() description accordingly. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-08-29 09:49:31 -06:00
Bart Van Assche	6fd5b91dab	skd: Let the block layer core choose .nr_requests Since blk_mq_init_queue() initializes .nr_requests to the tag set size and since that value is a good default for the skd driver, do not overwrite the value set by blk_mq_init_queue(). This change doubles the default value of .nr_requests. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-08-29 09:43:06 -06:00
Bart Van Assche	bf231981be	skd: Remove blk_queue_bounce_limit() call Since sTec s1120 devices support 64-bit DMA it is not necessary to request data buffer bouncing. Hence remove the blk_queue_bounce_limit() call. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-08-29 09:43:03 -06:00
Tejun Heo	2aca392398	Revert "libata: quirk read log on no-name M.2 SSD" This reverts commit `35f0b6a779`. We now conditionalize issuing of READ LOG PAGE on the TRUSTED COMPUTING SUPPORTED bit in the identity data and this shouldn't be necessary. Signed-off-by: Tejun Heo <tj@kernel.org>	2017-08-29 08:36:58 -07:00
Christoph Hellwig	e8f11db956	libata: check for trusted computing in IDENTIFY DEVICE data ATA-8 and later mirrors the TRUSTED COMPUTING SUPPORTED bit in word 48 of the IDENTIFY DEVICE data. Check this before issuing a READ LOG PAGE command to avoid issues with buggy devices. The only downside is that we can't support Security Send / Receive for a device with an older revision due to the conflicting use of this field in earlier specifications. tj: The reason we need this is because some devices which don't support READ LOG PAGE lock up after getting issued that command. Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-08-29 08:33:24 -07:00
Jens Axboe	2b76da9563	Merge branch 'nvme-4.14' of git://git.infradead.org/nvme into for-4.14/block-postmerge Pull NVMe changes from Christoph: "Below is the current set of NVMe updates for Linux 4.14, now against your postmerge branch, and with three more patches. The biggest bit comes from Sagi and refactors the RDMA driver to prepare for more code sharing in the setup and teardown path. But we have various features and bug fixes from a lot of people as well."	2017-08-29 09:09:11 -06:00
David Hildenbrand	1935222dc2	KVM: s390: we are always in czam mode Independent of the underlying hardware, kvm will now always handle SIGP SET ARCHITECTURE as if czam were enabled. Therefore, let's not only forward that bit but always set it. While at it, add a comment regarding STHYI. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170829143108.14703-1-david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2017-08-29 16:50:20 +02:00
Li Bin	b2f7605076	perf symbols: Fix plt entry calculation for ARM and AARCH64 On x86, the plt header size is as same as the plt entry size, and can be identified from shdr's sh_entsize of the plt. But we can't assume that the sh_entsize of the plt shdr is always the plt entry size in all architecture, and the plt header size may be not as same as the plt entry size in some architecure. On ARM, the plt header size is 20 bytes and the plt entry size is 12 bytes (don't consider the FOUR_WORD_PLT case) that refer to the binutils implementation. The plt section is as follows: Disassembly of section .plt: 000004a0 <__cxa_finalize@plt-0x14>: 4a0: e52de004 push {lr} ; (str lr, [sp, #-4]!) 4a4: e59fe004 ldr lr, [pc, #4] ; 4b0 <_init+0x1c> 4a8: e08fe00e add lr, pc, lr 4ac: e5bef008 ldr pc, [lr, #8]! 4b0: 00008424 .word 0x00008424 000004b4 <__cxa_finalize@plt>: 4b4: e28fc600 add ip, pc, #0, 12 4b8: e28cca08 add ip, ip, #8, 20 ; 0x8000 4bc: e5bcf424 ldr pc, [ip, #1060]! ; 0x424 000004c0 <printf@plt>: 4c0: e28fc600 add ip, pc, #0, 12 4c4: e28cca08 add ip, ip, #8, 20 ; 0x8000 4c8: e5bcf41c ldr pc, [ip, #1052]! ; 0x41c On AARCH64, the plt header size is 32 bytes and the plt entry size is 16 bytes. The plt section is as follows: Disassembly of section .plt: 0000000000000560 <__cxa_finalize@plt-0x20>: 560: a9bf7bf0 stp x16, x30, [sp,#-16]! 564: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8> 568: f944be11 ldr x17, [x16,#2424] 56c: 9125e210 add x16, x16, #0x978 570: d61f0220 br x17 574: d503201f nop 578: d503201f nop 57c: d503201f nop 0000000000000580 <__cxa_finalize@plt>: 580: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8> 584: f944c211 ldr x17, [x16,#2432] 588: 91260210 add x16, x16, #0x980 58c: d61f0220 br x17 0000000000000590 <__gmon_start__@plt>: 590: 90000090 adrp x16, 10000 <__FRAME_END__+0xf8a8> 594: f944c611 ldr x17, [x16,#2440] 598: 91262210 add x16, x16, #0x988 59c: d61f0220 br x17 NOTES: In addition to ARM and AARCH64, other architectures, such as s390/alpha/mips/parisc/poperpc/sh/sparc/xtensa also need to consider this issue. Signed-off-by: Li Bin <huawei.libin@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexis Berlemont <alexis.berlemont@gmail.com> Cc: David Tolnay <dtolnay@gmail.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: zhangmengting@huawei.com Link: http://lkml.kernel.org/r/1496622849-21877-1-git-send-email-huawei.libin@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-08-29 11:41:27 -03:00
Jens Axboe	015a2f823e	Merge branch 'stable/for-jens-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus Pull xen-blkback fix from Konrad: "[...] A bug-fix when shutting down xen block backend driver with multiple queues and the driver not clearing all of them."	2017-08-29 08:32:58 -06:00
Christian Borntraeger	fa41ba0d08	s390/mm: avoid empty zero pages for KVM guests to avoid postcopy hangs Right now there is a potential hang situation for postcopy migrations, if the guest is enabling storage keys on the target system during the postcopy process. For storage key virtualization, we have to forbid the empty zero page as the storage key is a property of the physical page frame. As we enable storage key handling lazily we then drop all mappings for empty zero pages for lazy refaulting later on. This does not work with the postcopy migration, which relies on the empty zero page never triggering a fault again in the future. The reason is that postcopy migration will simply read a page on the target system if that page is a known zero page to fault in an empty zero page. At the same time postcopy remembers that this page was already transferred - so any future userfault on that page will NOT be retransmitted again to avoid races. If now the guest enters the storage key mode while in postcopy, we will break this assumption of postcopy. The solution is to disable the empty zero page for KVM guests early on and not during storage key enablement. With this change, the postcopy migration process is guaranteed to start after no zero pages are left. As guest pages are very likely not empty zero pages anyway the memory overhead is also pretty small. While at it this also adds proper page table locking to the zero page removal. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-08-29 16:31:27 +02:00
Jan Höppner	28b841b3a7	s390/dasd: Add discard support for FBA devices The z/VM hypervisor provides virtual disks (VDISK) which are backed by main memory of the hypervisor. Those devices are seen as DASD FBA disks within the Linux guest. Whenever data is written to such a device, memory is allocated on-the-fly by z/VM accordingly. This memory, however, is not being freed if data on the device is deleted by the guest OS. In order to make memory usable after deletion again, add discard support to the FBA discipline. While at it, update comments regarding the DASD_FEATURE_* flags. Reviewed-by: Stefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: Jan Höppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-08-29 16:31:26 +02:00
Bhumika Goyal	8b94dd9e0d	s390/zcrypt: make CPRBX const Make this const as it is only used in a copy operation. Signed-off-by: Bhumika Goyal <bhumirks@gmail.com> Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-08-29 16:31:25 +02:00
Martin Schwidefsky	d66bf801e0	s390/uaccess: avoid mvcos jump label If the kernel is compiled for z10 or later machines the uaccess code inlines the mvcos instruction. The facility bit 27 which indicates the availability of MVCOS has to be set. The have_mvcos jump label will always be true. Make the generation of the have_mvcos jump label conditional on !CONFIG_HAVE_MARCH_Z10_FEATURES. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-08-29 16:29:10 +02:00

... 39 40 41 42 43 ...

704772 Commits