linux

Author	SHA1	Message	Date
Linus Torvalds	541efb7632	xen: features and fixes for 4.9-rc0 - Switch to new CPU hotplug mechanism. - Support driver_override in pciback. - Require vector callback for HVM guests (the alternate mechanism via the platform device has been broken for ages). -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJX9lkNAAoJEFxbo/MsZsTRnd8IAKnCH9pd2c1GgAPse2s8yBUL jh/nTQh+niVvpFA9elpfz+TrAIu4P0KLcnx6jhZ0Uv+Cmeaz5Ps+IaqyXBmqmeCm hjrnDo6wEVB/1LMtzibNk0hQcIN73MUEIfUESjl1iiIw3lPDPMIihMbpCAzVzaRf M8sInTTwcx0A9njUijEwT1wKV45hM7bpnAufChkxk3V3G2+JxBDYAQJCfW0u1DjR WFpbGKyNetXSVSf6QVZhW+lTnqTAUk0a5IqOg6UbzzbsHM7KgzwxB0FXYMRsL8jV 3VNiRJovNy+0F3T1VewPXWFlWs+QFK1GH0Hbncc5kUATNBm/VOjNt8H0dwUlfLM= =n1rz -----END PGP SIGNATURE----- Merge tag 'for-linus-4.9-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "xen features and fixes for 4.9: - switch to new CPU hotplug mechanism - support driver_override in pciback - require vector callback for HVM guests (the alternate mechanism via the platform device has been broken for ages)" * tag 'for-linus-4.9-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/x86: Update topology map for PV VCPUs xen/x86: Initialize per_cpu(xen_vcpu, 0) a little earlier xen/pciback: support driver_override xen/pciback: avoid multiple entries in slot list xen/pciback: simplify pcistub device handling xen: Remove event channel notification through Xen PCI platform device xen/events: Convert to hotplug state machine xen/x86: Convert to hotplug state machine x86/xen: add missing \n at end of printk warning message xen/grant-table: Use kmalloc_array() in arch_gnttab_valloc() xen: Make VPMU init message look less scary xen: rename xen_pmu_init() in sys-hypervisor.c hotplug: Prevent alloc/free of irq descriptors during cpu up/down (again) xen/x86: Move irq allocation from Xen smp_op.cpu_up()	2016-10-06 11:19:10 -07:00
Juergen Gross	b057878b2a	xen/pciback: support driver_override Support the driver_override scheme introduced with commit `782a985d7a` ("PCI: Introduce new device binding path using pci_dev.driver_override") As pcistub_probe() is called for all devices (it has to check for a match based on the slot address rather than device type) it has to check for driver_override set to "pciback" itself. Up to now for assigning a pci device to pciback you need something like: echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind echo 0000:07:10.0 > /sys/bus/pci/drivers/pciback/new_slot echo 0000:07:10.0 > /sys/bus/pci/drivers_probe while with the patch you can use the same mechanism as for similar drivers like pci-stub and vfio-pci: echo pciback > /sys/bus/pci/devices/0000\:07\:10.0/driver_override echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind echo 0000:07:10.0 > /sys/bus/pci/drivers_probe So e.g. libvirt doesn't need special handling for pciback. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-09-30 11:49:02 +01:00
Juergen Gross	9f8bee9c98	xen/pciback: avoid multiple entries in slot list The Xen pciback driver has a list of all pci devices it is ready to seize. There is no check whether a to be added entry already exists. While this might be no problem in the common case it might confuse those which consume the list via sysfs. Modify the handling of this list by not adding an entry which already exists. As this will be needed later split out the list handling into a separate function. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-09-30 11:48:55 +01:00
Juergen Gross	1af916b701	xen/pciback: simplify pcistub device handling The Xen pciback driver maintains a list of all its seized devices. There are two functions searching the list for a specific device with basically the same semantics just returning different structures in case of a match. Split out the search function. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-09-30 11:48:33 +01:00
KarimAllah Ahmed	72a9b18629	xen: Remove event channel notification through Xen PCI platform device Ever since commit `254d1a3f02` ("xen/pv-on-hvm kexec: shutdown watches from old kernel") using the INTx interrupt from Xen PCI platform device for event channel notification would just lockup the guest during bootup. postcore_initcall now calls xs_reset_watches which will eventually try to read a value from XenStore and will get stuck on read_reply at XenBus forever since the platform driver is not probed yet and its INTx interrupt handler is not registered yet. That means that the guest can not be notified at this moment of any pending event channels and none of the per-event handlers will ever be invoked (including the XenStore one) and the reply will never be picked up by the kernel. The exact stack where things get stuck during xenbus_init: -xenbus_init -xs_init -xs_reset_watches -xenbus_scanf -xenbus_read -xs_single -xs_single -xs_talkv Vector callbacks have always been the favourite event notification mechanism since their introduction in commit `38e20b07ef` ("x86/xen: event channels delivery on HVM.") and the vector callback feature has always been advertised for quite some time by Xen that's why INTx was broken for several years now without impacting anyone. Luckily this also means that event channel notification through INTx is basically dead-code which can be safely removed without impacting anybody since it has been effectively disabled for more than 4 years with nobody complaining about it (at least as far as I'm aware of). This commit removes event channel notification through Xen PCI platform device. Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Julien Grall <julien.grall@citrix.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: xen-devel@lists.xenproject.org Cc: linux-kernel@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-09-30 11:44:34 +01:00
Sebastian Andrzej Siewior	c8761e2016	xen/events: Convert to hotplug state machine Install the callbacks via the state machine. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-09-30 11:23:04 +01:00
Juergen Gross	5b00b504b1	xen: rename xen_pmu_init() in sys-hypervisor.c There are two functions with name xen_pmu_init() in the kernel. Rename the one in drivers/xen/sys-hypervisor.c to avoid shadowing the one in arch/x86/xen/pmu.c To avoid the same problem in future rename some more functions. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-08-24 18:45:25 +01:00
Jan Beulich	9a035a40f7	xenbus: don't look up transaction IDs for ordinary writes This should really only be done for XS_TRANSACTION_END messages, or else at least some of the xenstore-* tools don't work anymore. Fixes: `0beef634b8` ("xenbus: don't BUG() on user mode induced condition") Reported-by: Richard Schütz <rschuetz@uni-koblenz.de> Cc: <stable@vger.kernel.org> Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Richard Schütz <rschuetz@uni-koblenz.de> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-08-24 18:16:18 +01:00
Krzysztof Kozlowski	00085f1efa	dma-mapping: use unsigned long for dma_attrs The dma-mapping core and the implementations do not change the DMA attributes passed by pointer. Thus the pointer can point to const data. However the attributes do not have to be a bitfield. Instead unsigned long will do fine: 1. This is just simpler. Both in terms of reading the code and setting attributes. Instead of initializing local attributes on the stack and passing pointer to it to dma_set_attr(), just set the bits. 2. It brings safeness and checking for const correctness because the attributes are passed by value. Semantic patches for this change (at least most of them): virtual patch virtual context @r@ identifier f, attrs; @@ f(..., - struct dma_attrs attrs + unsigned long attrs , ...) { ... } @@ identifier r.f; @@ f(..., - NULL + 0 ) and // Options: --all-includes virtual patch virtual context @r@ identifier f, attrs; type t; @@ t f(..., struct dma_attrs attrs); @@ identifier r.f; @@ f(..., - NULL + 0 ) Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Acked-by: Mark Salter <msalter@redhat.com> [c6x] Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris] Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm] Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp] Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core] Acked-by: David Vrabel <david.vrabel@citrix.com> [xen] Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb] Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32] Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc] Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-08-04 08:50:07 -04:00
Linus Torvalds	08fd8c1768	xen: features and fixes for 4.8-rc0 - ACPI support for guests on ARM platforms. - Generic steal time support for arm and x86. - Support cases where kernel cpu is not Xen VCPU number (e.g., if in-guest kexec is used). - Use the system workqueue instead of a custom workqueue in various places. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJXmLlrAAoJEFxbo/MsZsTRvRQH/1wOMF8BmlbZfR7H3qwDfjst ApNifCiZE08xDtWBlwUaBFAQxyflQS9BBiNZDVK0sysIdXeOdpWV7V0ZjRoLL+xr czsaaGXDcmXxJxApoMDVuT7FeP6rEk6LVAYRoHpVjJjMZGW3BbX1vZaMW4DXl2WM 9YNaF2Lj+rpc1f8iG31nUxwkpmcXFog6ct4tu7HiyCFT3hDkHt/a4ghuBdQItCkd vqBa1pTpcGtQBhSmWzlylN/PV2+NKcRd+kGiwd09/O/rNzogTMCTTWeHKAtMpPYb Cu6oSqJtlK5o0vtr0qyLSWEGIoyjE2gE92s0wN3iCzFY1PldqdsxUO622nIj+6o= =G6q3 -----END PGP SIGNATURE----- Merge tag 'for-linus-4.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "Features and fixes for 4.8-rc0: - ACPI support for guests on ARM platforms. - Generic steal time support for arm and x86. - Support cases where kernel cpu is not Xen VCPU number (e.g., if in-guest kexec is used). - Use the system workqueue instead of a custom workqueue in various places" * tag 'for-linus-4.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (47 commits) xen: add static initialization of steal_clock op to xen_time_ops xen/pvhvm: run xen_vcpu_setup() for the boot CPU xen/evtchn: use xen_vcpu_id mapping xen/events: fifo: use xen_vcpu_id mapping xen/events: use xen_vcpu_id mapping in events_base x86/xen: use xen_vcpu_id mapping when pointing vcpu_info to shared_info x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op xen: introduce xen_vcpu_id mapping x86/acpi: store ACPI ids from MADT for future usage x86/xen: update cpuid.h from Xen-4.7 xen/evtchn: add IOCTL_EVTCHN_RESTRICT xen-blkback: really don't leak mode property xen-blkback: constify instance of "struct attribute_group" xen-blkfront: prefer xenbus_scanf() over xenbus_gather() xen-blkback: prefer xenbus_scanf() over xenbus_gather() xen: support runqueue steal time on xen arm/xen: add support for vm_assist hypercall xen: update xen headers xen-pciback: drop superfluous variables xen-pciback: short-circuit read path used for merging write values ...	2016-07-27 11:35:37 -07:00
Vlastimil Babka	8ea1d2a198	mm, frontswap: convert frontswap_enabled to static key I have noticed that frontswap.h first declares "frontswap_enabled" as extern bool variable, and then overrides it with "#define frontswap_enabled (1)" for CONFIG_FRONTSWAP=Y or (0) when disabled. The bool variable isn't actually instantiated anywhere. This all looks like an unfinished attempt to make frontswap_enabled reflect whether a backend is instantiated. But in the current state, all frontswap hooks call unconditionally into frontswap.c just to check if frontswap_ops is non-NULL. This should at least be checked inline, but we can further eliminate the overhead when CONFIG_FRONTSWAP is enabled and no backend registered, using a static key that is initially disabled, and gets enabled only upon first backend registration. Thus, checks for "frontswap_enabled" are replaced with "frontswap_enabled()" wrapping the static key check. There are two exceptions: - xen's selfballoon_process() was testing frontswap_enabled in code guarded by #ifdef CONFIG_FRONTSWAP, which was effectively always true when reachable. The patch just removes this check. Using frontswap_enabled() does not sound correct here, as this can be true even without xen's own backend being registered. - in SYSCALL_DEFINE2(swapon), change the check to IS_ENABLED(CONFIG_FRONTSWAP) as it seems the bitmap allocation cannot currently be postponed until a backend is registered. This means that frontswap will still have some memory overhead by being configured, but without a backend. After the patch, we can expect that some functions in frontswap.c are called only when frontswap_ops is non-NULL. Change the checks there to VM_BUG_ONs. While at it, convert other BUG_ONs to VM_BUG_ONs as frontswap has been stable for some time. [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1463152235-9717-1-git-send-email-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-07-26 16:19:19 -07:00
Juergen Gross	d34c30cc1f	xen: add static initialization of steal_clock op to xen_time_ops pv_time_ops might be overwritten with xen_time_ops after the steal_clock operation has been initialized already. To prevent calling a now uninitialized function pointer add the steal_clock static initialization to xen_time_ops. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-26 14:07:06 +01:00
Vitaly Kuznetsov	cbbb468239	xen/evtchn: use xen_vcpu_id mapping Use the newly introduced xen_vcpu_id mapping to get Xen's idea of vCPU id for CPU0. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-25 13:34:18 +01:00
Vitaly Kuznetsov	be78da1cf4	xen/events: fifo: use xen_vcpu_id mapping EVTCHNOP_init_control has vCPU id as a parameter and Xen's idea of vCPU id should be used. Use the newly introduced xen_vcpu_id mapping to convert it from Linux's id. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-25 13:34:12 +01:00
Vitaly Kuznetsov	8058c0b897	xen/events: use xen_vcpu_id mapping in events_base EVTCHNOP_bind_ipi and EVTCHNOP_bind_virq pass vCPU id as a parameter and Xen's idea of vCPU id should be used. Use the newly introduced xen_vcpu_id mapping to convert it from Linux's id. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-25 13:34:06 +01:00
Vitaly Kuznetsov	ad5475f9fa	x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op HYPERVISOR_vcpu_op() passes Linux's idea of vCPU id as a parameter while Xen's idea is expected. In some cases these ideas diverge so we need to do remapping. Convert all callers of HYPERVISOR_vcpu_op() to use xen_vcpu_nr(). Leave xen_fill_possible_map() and xen_filter_cpu_maps() intact as they're only being called by PV guests before perpu areas are initialized. While the issue could be solved by switching to early_percpu for xen_vcpu_id I think it's not worth it: PV guests will probably never get to the point where their idea of vCPU id diverges from Xen's. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-25 13:32:34 +01:00
David Vrabel	fbc872c38c	xen/evtchn: add IOCTL_EVTCHN_RESTRICT IOCTL_EVTCHN_RESTRICT limits the file descriptor to being able to bind to interdomain event channels from a specific domain. Event channels that are already bound continue to work for sending and receiving notifications. This is useful as part of deprivileging a user space PV backend or device model (QEMU). e.g., Once the device model as bound to the ioreq server event channels it can restrict the file handle so an exploited DM cannot use it to create or bind to arbitrary event channels. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2016-07-25 10:59:31 +01:00
Jan Beulich	6f2d9d9921	xen/acpi: allow xen-acpi-processor driver to load on Xen 4.7 As of Xen 4.7 PV CPUID doesn't expose either of CPUID[1].ECX[7] and CPUID[0x80000007].EDX[7] anymore, causing the driver to fail to load on both Intel and AMD systems. Doing any kind of hardware capability checks in the driver as a prerequisite was wrong anyway: With the hypervisor being in charge, all such checking should be done by it. If ACPI data gets uploaded despite some missing capability, the hypervisor is free to ignore part or all of that data. Ditch the entire check_prereq() function, and do the only valid check (xen_initial_domain()) in the caller in its place. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-08 14:53:13 +01:00
Jan Beulich	e5a79475a7	xenbus: simplify xenbus_dev_request_and_reply() No need to retain a local copy of the full request message, only the type is really needed. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-08 11:50:29 +01:00
Jan Beulich	7469be95a4	xenbus: don't bail early from xenbus_dev_request_and_reply() xenbus_dev_request_and_reply() needs to track whether a transaction is open. For XS_TRANSACTION_START messages it calls transaction_start() and for XS_TRANSACTION_END messages it calls transaction_end(). If sending an XS_TRANSACTION_START message fails or responds with an an error, the transaction is not open and transaction_end() must be called. If sending an XS_TRANSACTION_END message fails, the transaction is still open, but if an error response is returned the transaction is closed. Commit `027bd7e899` ("xen/xenbus: Avoid synchronous wait on XenBus stalling shutdown/restart") introduced a regression where failed XS_TRANSACTION_START messages were leaving the transaction open. This can cause problems with suspend (and migration) as all transactions must be closed before suspending. It appears that the problematic change was added accidentally, so just remove it. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-08 11:14:26 +01:00
Jan Beulich	0beef634b8	xenbus: don't BUG() on user mode induced condition Inability to locate a user mode specified transaction ID should not lead to a kernel crash. For other than XS_TRANSACTION_START also don't issue anything to xenbus if the specified ID doesn't match that of any active transaction. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-07 12:19:52 +01:00
Juergen Gross	6ba286ad84	xen: support runqueue steal time on xen Up to now reading the stolen time of a remote cpu was not possible in a performant way under Xen. This made support of runqueue steal time via paravirt_steal_rq_enabled impossible. With the addition of an appropriate hypervisor interface this is now possible, so add the support. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-06 10:42:19 +01:00
Jan Beulich	1ad6344acf	xen-pciback: drop superfluous variables req_start is simply an alias of the "offset" function parameter, and req_end is being used just once in each function. (And both variables were loop invariant anyway, so should at least have got initialized outside the loop.) Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-06 10:35:38 +01:00
Jan Beulich	ee87d6d0d3	xen-pciback: short-circuit read path used for merging write values There's no point calling xen_pcibk_config_read() here - all it'll do is return whatever conf_space_read() returns for the field which was found here (and which would be found there again). Also there's no point clearing tmp_val before the call. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-06 10:35:38 +01:00
Jan Beulich	585203609c	xen-pciback: use const and unsigned in bar_init() Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-06 10:35:38 +01:00
Jan Beulich	c8670c22e0	xen-pciback: simplify determination of 64-bit memory resource Other than for raw BAR values, flags are properly separated in the internal representation. Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-06 10:35:37 +01:00
Jan Beulich	6ad2655d87	xen-pciback: fold read_dev_bar() into its now single caller Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-06 10:35:37 +01:00
Jan Beulich	664093bb6b	xen-pciback: drop rom_init() It is now identical to bar_init(). Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-06 10:35:37 +01:00
Jan Beulich	6c6e4caa20	xen-pciback: drop unused function parameter of read_dev_bar() Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-06 10:35:37 +01:00
Bhaktipriya Shridhar	5ee405d9d2	xen: xenbus: Remove create_workqueue System workqueues have been able to handle high level of concurrency for a long time now and there's no reason to use dedicated workqueues just to gain concurrency. Replace dedicated xenbus_frontend_wq with the use of system_wq. Unlike a dedicated per-cpu workqueue created with create_workqueue(), system_wq allows multiple work items to overlap executions even on the same CPU; however, a per-cpu workqueue doesn't have any CPU locality or global ordering guarantees unless the target CPU is explicitly specified and the increase of local concurrency shouldn't make any difference. In this case, there is only a single work item, increase of concurrency level by switching to system_wq should not make any difference. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-06 10:34:49 +01:00
Bhaktipriya Shridhar	429eafe609	xen: xen-pciback: Remove create_workqueue System workqueues have been able to handle high level of concurrency for a long time now and there's no reason to use dedicated workqueues just to gain concurrency. Replace dedicated xen_pcibk_wq with the use of system_wq. Unlike a dedicated per-cpu workqueue created with create_workqueue(), system_wq allows multiple work items to overlap executions even on the same CPU; however, a per-cpu workqueue doesn't have any CPU locality or global ordering guarantees unless the target CPU is explicitly specified and thus the increase of local concurrency shouldn't make any difference. Since the work items could be pending, flush_work() has been used in xen_pcibk_disconnect(). xen_pcibk_xenbus_remove() calls free_pdev() which in turn calls xen_pcibk_disconnect() for every pdev to ensure that there is no pending task while disconnecting the driver. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-06 10:34:48 +01:00
Juergen Gross	ecb23dc6f2	xen: add steal_clock support on x86 The pv_time_ops structure contains a function pointer for the "steal_clock" functionality used only by KVM and Xen on ARM. Xen on x86 uses its own mechanism to account for the "stolen" time a thread wasn't able to run due to hypervisor scheduling. Add support in Xen arch independent time handling for this feature by moving it out of the arm arch into drivers/xen and remove the x86 Xen hack. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-06 10:34:48 +01:00
Muhammad Falak R Wani	c7ebf9d9c6	xen: use vma_pages(). Replace explicit computation of vma page count by a call to vma_pages(). Signed-off-by: Muhammad Falak R Wani <falakreyaz@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-07-06 10:34:47 +01:00
Shannon Zhao	be1aaf4e40	ARM64: XEN: Add a function to initialize Xen specific UEFI runtime services When running on Xen hypervisor, runtime services are supported through hypercall. Add a Xen specific function to initialize runtime services. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by: Julien Grall <julien.grall@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com>	2016-07-06 10:34:46 +01:00
Shannon Zhao	a62ed50030	XEN: EFI: Move x86 specific codes to architecture directory Move x86 specific codes to architecture directory and export those EFI runtime service functions. This will be useful for initializing runtime service on ARM later. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>	2016-07-06 10:34:46 +01:00
Shannon Zhao	5789afeb0e	Xen: ARM: Add support for mapping AMBA device mmio Add a bus_notifier for AMBA bus device in order to map the device mmio regions when DOM0 booting with ACPI. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Tested-by: Julien Grall <julien.grall@arm.com>	2016-07-06 10:34:43 +01:00
Shannon Zhao	4ba04bec37	Xen: ARM: Add support for mapping platform device mmio Add a bus_notifier for platform bus device in order to map the device mmio regions when DOM0 booting with ACPI. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by: Julien Grall <julien.grall@arm.com>	2016-07-06 10:34:43 +01:00
Shannon Zhao	975fac3c4f	Xen: xlate: Use page_to_xen_pfn instead of page_to_pfn Make xen_xlate_map_ballooned_pages work with 64K pages. In that case Kernel pages are 64K in size but Xen pages remain 4K in size. Xen pfns refer to 4K pages. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Tested-by: Julien Grall <julien.grall@arm.com>	2016-07-06 10:34:42 +01:00
Shannon Zhao	243848fc01	xen/grant-table: Move xlated_setup_gnttab_pages to common place Move xlated_setup_gnttab_pages to common place, so it can be reused by ARM to setup grant table. Rename it to xen_xlate_map_ballooned_pages. Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Tested-by: Julien Grall <julien.grall@arm.com>	2016-07-06 10:34:42 +01:00
Jan Beulich	d2bd05d88d	xen-pciback: return proper values during BAR sizing Reads following writes with all address bits set to 1 should return all changeable address bits as one, not the BAR size (nor, as was the case for the upper half of 64-bit BARs, the high half of the region's end address). Presumably this didn't cause any problems so far because consumers use the value to calculate the size (usually via val & -val), and do nothing else with it. But also consider the exception here: Unimplemented BARs should always return all zeroes. And finally, the check for whether to return the sizing address on read for the ROM BAR should ignore all non-address bits, not just the ROM Enable one. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-06-24 10:53:03 +01:00
Andrey Grodzovsky	02ef871eca	xen/pciback: Fix conf_space read/write overlap check. Current overlap check is evaluating to false a case where a filter field is fully contained (proper subset) of a r/w request. This change applies classical overlap check instead to include all the scenarios. More specifically, for (Hilscher GmbH CIFX 50E-DP(M/S)) device driver the logic is such that the entire confspace is read and written in 4 byte chunks. In this case as an example, CACHE_LINE_SIZE, LATENCY_TIMER and PCI_BIST are arriving together in one call to xen_pcibk_config_write() with offset == 0xc and size == 4. With the exsisting overlap check the LATENCY_TIMER field (offset == 0xd, length == 1) is fully contained in the write request and hence is excluded from write, which is incorrect. Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Jan Beulich <JBeulich@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-06-23 11:36:15 +01:00
Ross Lagerwall	842775f150	xen/balloon: Fix declared-but-not-defined warning Fix a declared-but-not-defined warning when building with XEN_BALLOON_MEMORY_HOTPLUG=n. This fixes a regression introduced by commit `dfd74a1edf` ("xen/balloon: Fix crash when ballooning on x86 32 bit PAE"). Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Acked-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-06-23 11:36:15 +01:00
Linus Torvalds	9ba55cf7cf	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "Here are the outstanding target pending updates for v4.7-rc1. The highlights this round include: - Allow external PR/ALUA metadata path be defined at runtime via top level configfs attribute (Lee) - Fix target session shutdown bug for ib_srpt multi-channel (hch) - Make TFO close_session() and shutdown_session() optional (hch) - Drop se_sess->sess_kref + convert tcm_qla2xxx to internal kref (hch) - Add tcm_qla2xxx endpoint attribute for basic FC jammer (Laurence) - Refactor iscsi-target RX/TX PDU encode/decode into common code (Varun) - Extend iscsit_transport with xmit_pdu, release_cmd, get_rx_pdu, validate_parameters, and get_r2t_ttt for generic ISO offload (Varun) - Initial merge of cxgb iscsi-segment offload target driver (Varun) The bulk of the changes are Chelsio's new driver, along with a number of iscsi-target common code improvements made by Varun + Co along the way" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (29 commits) iscsi-target: Fix early sk_data_ready LOGIN_FLAGS_READY race cxgbit: Use type ISCSI_CXGBIT + cxgbit tpg_np attribute iscsi-target: Convert transport drivers to signal rdma_shutdown iscsi-target: Make iscsi_tpg_np driver show/store use generic code tcm_qla2xxx Add SCSI command jammer/discard capability iscsi-target: graceful disconnect on invalid mapping to iovec target: need_to_release is always false, remove redundant check and kfree target: remove sess_kref and ->shutdown_session iscsi-target: remove usage of ->shutdown_session tcm_qla2xxx: introduce a private sess_kref target: make close_session optional target: make ->shutdown_session optional target: remove acl_stop target: consolidate and fix session shutdown cxgbit: add files for cxgbit.ko iscsi-target: export symbols iscsi-target: call complete on conn_logout_comp iscsi-target: clear tx_thread_active iscsi-target: add new offload transport type iscsi-target: use conn_transport->transport_type in text rsp ...	2016-05-28 12:04:17 -07:00
Linus Torvalds	29567292c0	xen: bug fixes for 4.7-rc0 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJXRGqXAAoJEFxbo/MsZsTRSr8H/AwybQygAqBPoxsC2VkdSD0u csHnXuG/37ABep55A0ZgNIwq/JNK1m7K0+Xq6S6pO6UTH8gl32Fx2uPiHoXEdzmK KfwQPx8ETaVtOSrzPJdPdspXTNngEsvopTPGPRKG6VNE3JwtTiwkHcXiez6nJ8XB 2YqGjYOdKNsu/sAUo5aCrc7eCcdCu/hc6pWFNtfSBRSLhR1Eofl81B6B0BCvi5D0 5/TOX+YnZ4omNeYlxghFvZKa/HyfE/0CONbyijjVoLcd8i2EptNosdCc9WgRTOAZ nwnsABYg0izPZYNgr5HJUwfNLnVymtg8gg7b2WpKkM0Xvd7uPjrYA1vpHh1yA48= =OfVJ -----END PGP SIGNATURE----- Merge tag 'for-linus-4.7-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel. * tag 'for-linus-4.7-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: use same main loop for counting and remapping pages xen/events: Don't move disabled irqs xen/x86: actually allocate legacy interrupts on PV guests Xen: don't warn about 2-byte wchar_t in efi xen/gntdev: reduce copy batch size to 16 xen/x86: don't lose event interrupts	2016-05-24 10:22:34 -07:00
Ross Lagerwall	f0f393877c	xen/events: Don't move disabled irqs Commit `ff1e22e7a6` ("xen/events: Mask a moving irq") open-coded irq_move_irq() but left out checking if the IRQ is disabled. This broke resuming from suspend since it tries to move a (disabled) irq without holding the IRQ's desc->lock. Fix it by adding in a check for disabled IRQs. The resulting stacktrace was: kernel BUG at /build/linux-UbQGH5/linux-4.4.0/kernel/irq/migration.c:31! invalid opcode: 0000 [#1] SMP Modules linked in: xenfs xen_privcmd ... CPU: 0 PID: 9 Comm: migration/0 Not tainted 4.4.0-22-generic #39-Ubuntu Hardware name: Xen HVM domU, BIOS 4.6.1-xs125180 05/04/2016 task: ffff88003d75ee00 ti: ffff88003d7bc000 task.ti: ffff88003d7bc000 RIP: 0010:[<ffffffff810e26e2>] [<ffffffff810e26e2>] irq_move_masked_irq+0xd2/0xe0 RSP: 0018:ffff88003d7bfc50 EFLAGS: 00010046 RAX: 0000000000000000 RBX: ffff88003d40ba00 RCX: 0000000000000001 RDX: 0000000000000001 RSI: 0000000000000100 RDI: ffff88003d40bad8 RBP: ffff88003d7bfc68 R08: 0000000000000000 R09: ffff88003d000000 R10: 0000000000000000 R11: 000000000000023c R12: ffff88003d40bad0 R13: ffffffff81f3a4a0 R14: 0000000000000010 R15: 00000000ffffffff FS: 0000000000000000(0000) GS:ffff88003da00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd4264de624 CR3: 0000000037922000 CR4: 00000000003406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Stack: ffff88003d40ba38 0000000000000024 0000000000000000 ffff88003d7bfca0 ffffffff814c8d92 00000010813ef89d 00000000805ea732 0000000000000009 0000000000000024 ffff88003cc39b80 ffff88003d7bfce0 ffffffff814c8f66 Call Trace: [<ffffffff814c8d92>] eoi_pirq+0xb2/0xf0 [<ffffffff814c8f66>] __startup_pirq+0xe6/0x150 [<ffffffff814ca659>] xen_irq_resume+0x319/0x360 [<ffffffff814c7e75>] xen_suspend+0xb5/0x180 [<ffffffff81120155>] multi_cpu_stop+0xb5/0xe0 [<ffffffff811200a0>] ? cpu_stop_queue_work+0x80/0x80 [<ffffffff811203d0>] cpu_stopper_thread+0xb0/0x140 [<ffffffff810a94e6>] ? finish_task_switch+0x76/0x220 [<ffffffff810ca731>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20 [<ffffffff810a3935>] smpboot_thread_fn+0x105/0x160 [<ffffffff810a3830>] ? sort_range+0x30/0x30 [<ffffffff810a0588>] kthread+0xd8/0xf0 [<ffffffff810a04b0>] ? kthread_create_on_node+0x1e0/0x1e0 [<ffffffff8182568f>] ret_from_fork+0x3f/0x70 [<ffffffff810a04b0>] ? kthread_create_on_node+0x1e0/0x1e0 Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-05-24 12:58:54 +01:00
Arnd Bergmann	971a69db7d	Xen: don't warn about 2-byte wchar_t in efi The XEN UEFI code has become available on the ARM architecture recently, but now causes a link-time warning: ld: warning: drivers/xen/efi.o uses 2-byte wchar_t yet the output is to use 4-byte wchar_t; use of wchar_t values across objects may fail This seems harmless, because the efi code only uses 2-byte characters when interacting with EFI, so we don't pass on those strings to elsewhere in the system, and we just need to silence the warning. It is not clear to me whether we actually need to build the file with the -fshort-wchar flag, but if we do, then we should also pass --no-wchar-size-warning to the linker, to avoid the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Fixes: 37060935dc04 ("ARM64: XEN: Add a function to initialize Xen specific UEFI runtime services")	2016-05-24 12:58:18 +01:00
David Vrabel	36ae220aa6	xen/gntdev: reduce copy batch size to 16 IOCTL_GNTDEV_GRANT_COPY batches copy operations to reduce the number of hypercalls. The stack is used to avoid a memory allocation in a hot path. However, a batch size of 24 requires more than 1024 bytes of stack which in some configurations causes a compiler warning. xen/gntdev.c: In function ‘gntdev_ioctl_grant_copy’: xen/gntdev.c:949:1: warning: the frame size of 1248 bytes is larger than 1024 bytes [-Wframe-larger-than=] This is a harmless warning as there is still plenty of stack spare, but people keep trying to "fix" it. Reduce the batch size to 16 to reduce stack usage to less than 1024 bytes. This should have minimal impact on performance. Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-05-24 12:58:17 +01:00
Christoph Hellwig	36ec2ddc0d	target: make close_session optional Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2016-05-10 01:19:26 -07:00
Christoph Hellwig	22d11759a4	target: make ->shutdown_session optional Turns out the template and thus many drivers got the return value wrong: 0 means the fabrics driver needs to put a session reference, which no driver except for the iSCSI target drivers did. Fortunately none of these drivers supports explicit Node ACLs, so the bug was harmless. Even without that only qla2xxx and iscsi every did real work in shutdown_session, so get rid of the boilerplate code in all other drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2016-05-10 01:19:18 -07:00
Ingo Molnar	35dc9ec107	Merge branch 'linus' into efi/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-05-07 07:00:07 +02:00
Jan Beulich	27e0e63853	xen/evtchn: fix ring resize when binding new events The copying of ring data was wrong for two cases: For a full ring nothing got copied at all (as in that case the canonicalized producer and consumer indexes are identical). And in case one or both of the canonicalized (after the resize) indexes would point into the second half of the buffer, the copied data ended up in the wrong (free) part of the new buffer. In both cases uninitialized data would get passed back to the caller. Fix this by simply copying the old ring contents twice: Once to the low half of the new buffer, and a second time to the high half. This addresses the inability to boot a HVM guest with 64 or more vCPUs. This regression was caused by `8620015499` (xen/evtchn: dynamically grow pending event channel ring). Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: <stable@vger.kernel.org> # 4.4+ Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-05-04 16:37:01 +01:00
Ingo Molnar	0ec7ae928a	efi: Remove unnecessary (and buggy) .memmap initialization from the Xen EFI driver So the following commit: `884f4f66ff` ("efi: Remove global 'memmap' EFI memory map") ... triggered the following build warning on x86 64-bit allyesconfig: drivers/xen/efi.c:290:47: warning: missing braces around initializer [-Wmissing-braces] It's this initialization in drivers/xen/efi.c: static const struct efi efi_xen __initconst = { ... .memmap = NULL, /* Not used under Xen. */ ... which was forgotten about, as .memmap now is an embedded struct: struct efi_memory_map memmap; We can remove this initialization - it's an EFI core internal data structure plus it's not used in the Xen driver anyway. Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: ard.biesheuvel@linaro.org Cc: bp@alien8.de Cc: linux-tip-commits@vger.kernel.org Cc: tony.luck@intel.com Link: http://lkml.kernel.org/r/20160429083128.GA4925@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-04-29 11:06:15 +02:00
Ross Lagerwall	dfd74a1edf	xen/balloon: Fix crash when ballooning on x86 32 bit PAE Commit `55b3da98a4` (xen/balloon: find non-conflicting regions to place hotplugged memory) caused a regression in 4.4. When ballooning on an x86 32 bit PAE system with close to 64 GiB of memory, the address returned by allocate_resource may be above 64 GiB. When using CONFIG_SPARSEMEM, this setup is limited to using physical addresses < 64 GiB. When adding memory at this address, it runs off the end of the mem_section array and causes a crash. Instead, fail the ballooning request. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: <stable@vger.kernel.org> # 4.4+ Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-04-06 11:29:18 +01:00
Boris Ostrovsky	ff1e22e7a6	xen/events: Mask a moving irq Moving an unmasked irq may result in irq handler being invoked on both source and target CPUs. With 2-level this can happen as follows: On source CPU: evtchn_2l_handle_events() -> generic_handle_irq() -> handle_edge_irq() -> eoi_pirq(): irq_move_irq(data); /*** WE ARE HERE **/ if (VALID_EVTCHN(evtchn)) clear_evtchn(evtchn); If at this moment target processor is handling an unrelated event in evtchn_2l_handle_events()'s loop it may pick up our event since target's cpu_evtchn_mask claims that this event belongs to it and* the event is unmasked and still pending. At the same time, source CPU will continue executing its own handle_edge_irq(). With FIFO interrupt the scenario is similar: irq_move_irq() may result in a EVTCHNOP_unmask hypercall which, in turn, may make the event pending on the target CPU. We can avoid this situation by moving and clearing the event while keeping event masked. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-04-04 11:18:00 +01:00
Linus Torvalds	55fc733c7e	xen: features and fixes for 4.6-rc0 - Make earlyprintk=xen work for HVM guests. - Remove module support for things never built as modules. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJW8YhlAAoJEFxbo/MsZsTRiCgH/3GnL93s/BsRTrA/DYykYqA4 rPPkDd6WMlEbrko6FY7o4IxUAZ50LWU8wjmDNVTT+R/VKlWCVh2+ksxjbH8e0nzr 0QohAYTdG1VGmhkrwTjKj0wC3Nr8vaDqoBXAFRz4DiQdbobn12wzXjaVrbl8RRNc oHDqR+DfZj6ontqx8oFkkTk7CFKIGEOtm/uBhCAV2fNkQSUCzuQyKLlarxM6jq/2 EnLR1bhnyoy5zNFzXxmaJgZOit8QFMKvPdDRknlp9yDHYJ5zolJkv+6FPiAG6SZs A8yAUPQoAD+ekAiV2DIhb8qoNNNdZXdRCFBpoudwX3GyyU4O2P5Intl0xEg17Nk= =L8S4 -----END PGP SIGNATURE----- Merge tag 'for-linus-4.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "Features and fixes for 4.6: - Make earlyprintk=xen work for HVM guests - Remove module support for things never built as modules" * tag 'for-linus-4.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: drivers/xen: make platform-pci.c explicitly non-modular drivers/xen: make sys-hypervisor.c explicitly non-modular drivers/xen: make xenbus_dev_[front/back]end explicitly non-modular drivers/xen: make [xen-]ballon explicitly non-modular xen: audit usages of module.h ; remove unnecessary instances xen/x86: Drop mode-selecting ifdefs in startup_xen() xen/x86: Zero out .bss for PV guests hvc_xen: make early_printk work with HVM guests hvc_xen: fix xenboot for DomUs hvc_xen: add earlycon support	2016-03-22 12:55:17 -07:00
Linus Torvalds	5266e5b12c	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "The highlights this round include: - Add target_alloc_session() w/ callback helper for doing se_session allocation + tag + se_node_acl lookup. (HCH + nab) - Tree-wide fabric driver conversion to use target_alloc_session() - Convert sbp-target to use percpu_ida tag pre-allocation, and TARGET_SCF_ACK_KREF I/O krefs (Chris Boot + nab) - Convert usb-gadget to use percpu_ida tag pre-allocation, and TARGET_SCF_ACK_KREF I/O krefs (Andrzej Pietrasiewicz + nab) - Convert xen-scsiback to use percpu_ida tag pre-allocation, and TARGET_SCF_ACK_KREF I/O krefs (Juergen Gross + nab) - Convert tcm_fc to use TARGET_SCF_ACK_KREF I/O + TMR krefs - Convert ib_srpt to use percpu_ida tag pre-allocation - Add DebugFS node for qla2xxx target sess list (Quinn) - Rework iser-target connection termination (Jenny + Sagi) - Convert iser-target to new CQ API (HCH) - Add pass-through WRITE_SAME support for IBLOCK (Mike Christie) - Introduce data_bitmap for asynchronous access of data area (Sheng Yang + Andy) - Fix target_release_cmd_kref shutdown comp leak (Himanshu Madhani) Also, there is a separate PULL request coming for cxgb4 NIC driver prerequisites for supporting hw iscsi segmentation offload (ISO), that will be the base for a number of v4.7 developments involving iscsi-target hw offloads" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (36 commits) target: Fix target_release_cmd_kref shutdown comp leak target: Avoid DataIN transfers for non-GOOD SAM status target/user: Report capability of handling out-of-order completions to userspace target/user: Fix size_t format-spec build warning target/user: Don't free expired command when time out target/user: Introduce data_bitmap, replace data_length/data_head/data_tail target/user: Free data ring in unified function target/user: Use iovec[] to describe continuous area target: Remove enum transport_lunflags_table target/iblock: pass WRITE_SAME to device if possible iser-target: Kill the ->isert_cmd back pointer in struct iser_tx_desc iser-target: Kill struct isert_rdma_wr iser-target: Convert to new CQ API iser-target: Split and properly type the login buffer iser-target: Remove ISER_RECV_DATA_SEG_LEN iser-target: Remove impossible condition from isert_wait_conn iser-target: Remove redundant wait in release_conn iser-target: Rework connection termination iser-target: Separate flows for np listeners and connections cma events iser-target: Add new state ISER_CONN_BOUND to isert_conn ...	2016-03-22 12:41:14 -07:00
Paul Gortmaker	e01dc539df	drivers/xen: make platform-pci.c explicitly non-modular The Kconfig currently controlling compilation of this code is: arch/x86/xen/Kconfig:config XEN_PVHVM arch/x86/xen/Kconfig: def_bool y ...meaning that it currently is not being built as a module by anyone. Lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code. We also delete the MODULE_LICENSE tag etc. since all that information was (or is now) contained at the top of the file in the comments. In removing "module" from the init fcn name, we observe a namespace collision with the probe function, so we use "probe" in the name of the probe function, and "init" in the registration fcn, as per standard convention, as suggested by Stefano. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-03-21 15:14:04 +00:00
Paul Gortmaker	46894f17af	drivers/xen: make sys-hypervisor.c explicitly non-modular The Kconfig currently controlling compilation of this code is: config XEN_SYS_HYPERVISOR bool "Create xen entries under /sys/hypervisor" ...meaning that it currently is not being built as a module by anyone. Lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. However one could argue that fs_initcall() might make more sense here. This change means that the one line function xen_properties_destroy() has only one user left, and since that is inside an #ifdef, we just manually inline it there vs. adding more ifdeffery around the function to avoid compile warnings about "defined but not used". In order to be consistent we also manually inline the other _destroy functions that are also just one line sysfs functions calls with only one call site remaing, even though they wouldn't need #ifdeffery. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-03-21 15:14:03 +00:00
Paul Gortmaker	ab1241a1fc	drivers/xen: make xenbus_dev_[front/back]end explicitly non-modular The Makefile / Kconfig currently controlling compilation here is: obj-y += xenbus_dev_frontend.o [...] obj-$(CONFIG_XEN_BACKEND) += xenbus_dev_backend.o ...with: drivers/xen/Kconfig:config XEN_BACKEND drivers/xen/Kconfig: bool "Backend driver support" ...meaning that they currently are not being built as modules by anyone. Lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. Since module_init translates to device_initcall in the non-modular case, the init ordering remains unchanged with this commit. We also delete the MODULE_LICENSE tag since all that information is already contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-03-21 15:13:55 +00:00
Paul Gortmaker	106eaa8e6e	drivers/xen: make [xen-]ballon explicitly non-modular The Makefile / Kconfig currently controlling compilation here is: obj-y += grant-table.o features.o balloon.o manage.o preempt.o time.o [...] obj-$(CONFIG_XEN_BALLOON) += xen-balloon.o ...with: drivers/xen/Kconfig:config XEN_BALLOON drivers/xen/Kconfig: bool "Xen memory balloon driver" ...meaning that they currently are not being built as modules by anyone. Lets remove the modular code that is essentially orphaned, so that when reading the driver there is no doubt it is builtin-only. In doing so we uncover two implict includes that were obtained by module.h having such a wide include scope itself: In file included from drivers/xen/xen-balloon.c:41:0: include/xen/balloon.h:26:51: warning: ‘struct page’ declared inside parameter list [enabled by default] int alloc_xenballooned_pages(int nr_pages, struct page **pages); ^ include/xen/balloon.h: In function ‘register_xen_selfballooning’: include/xen/balloon.h:35:10: error: ‘ENOSYS’ undeclared (first use in this function) return -ENOSYS; ^ This is fixed by adding mm-types.h and errno.h to the list. We also delete the MODULE_LICENSE tags since all that information is already contained at the top of the file in the comments. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-03-21 15:13:44 +00:00
Paul Gortmaker	59aa56bf2a	xen: audit usages of module.h ; remove unnecessary instances Code that uses no modular facilities whatsoever should not be sourcing module.h at all, since that header drags in a bunch of other headers with it. Similarly, code that is not explicitly using modular facilities like module_init() but only is declaring module_param setup variables should be using moduleparam.h and not the larger module.h file for that. In making this change, we also uncover an implicit use of BUG() in inline fcns within arch/arm/include/asm/xen/hypercall.h so we explicitly source <linux/bug.h> for that file now. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-03-21 15:13:32 +00:00
Vitaly Kuznetsov	703fc13a3f	xen_balloon: support memory auto onlining policy Add support for the newly added kernel memory auto onlining policy to Xen ballon driver. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Suggested-by: Daniel Kiper <daniel.kiper@oracle.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> Acked-by: David Vrabel <david.vrabel@citrix.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Daniel Kiper <daniel.kiper@oracle.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Kay Sievers <kay@vrfy.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-15 16:55:16 -07:00
Vitaly Kuznetsov	31bc3858ea	memory-hotplug: add automatic onlining policy for the newly added memory Currently, all newly added memory blocks remain in 'offline' state unless someone onlines them, some linux distributions carry special udev rules like: SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online" to make this happen automatically. This is not a great solution for virtual machines where memory hotplug is being used to address high memory pressure situations as such onlining is slow and a userspace process doing this (udev) has a chance of being killed by the OOM killer as it will probably require to allocate some memory. Introduce default policy for the newly added memory blocks in /sys/devices/system/memory/auto_online_blocks file with two possible values: "offline" which preserves the current behavior and "online" which causes all newly added memory blocks to go online as soon as they're added. The default is "offline". Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Daniel Kiper <daniel.kiper@oracle.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: David Vrabel <david.vrabel@citrix.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: Kay Sievers <kay@vrfy.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-15 16:55:16 -07:00
Peter Zijlstra	25528213fe	tags: Fix DEFINE_PER_CPU expansions $ make tags GEN tags ctags: Warning: drivers/acpi/processor_idle.c:64: null expansion of name pattern "\1" ctags: Warning: drivers/xen/events/events_2l.c:41: null expansion of name pattern "\1" ctags: Warning: kernel/locking/lockdep.c:151: null expansion of name pattern "\1" ctags: Warning: kernel/rcu/rcutorture.c:133: null expansion of name pattern "\1" ctags: Warning: kernel/rcu/rcutorture.c:135: null expansion of name pattern "\1" ctags: Warning: kernel/workqueue.c:323: null expansion of name pattern "\1" ctags: Warning: net/ipv4/syncookies.c:53: null expansion of name pattern "\1" ctags: Warning: net/ipv6/syncookies.c:44: null expansion of name pattern "\1" ctags: Warning: net/rds/page.c:45: null expansion of name pattern "\1" Which are all the result of the DEFINE_PER_CPU pattern: scripts/tags.sh:200: '/\<DEFINE_PER_CPU([^,], $[[:alnum:]_]$/\1/v/' scripts/tags.sh:201: '/\<DEFINE_PER_CPU_SHARED_ALIGNED([^,], $[[:alnum:]_]$/\1/v/' The below cures them. All except the workqueue one are within reasonable distance of the 80 char limit. TJ do you have any preference on how to fix the wq one, or shall we just not care its too long? Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Tejun Heo <tj@kernel.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-03-15 16:55:16 -07:00
Nicholas Bellinger	fa22e7b774	xen-scsiback: Convert to TARGET_SCF_ACK_KREF I/O krefs This patch converts xen-scsiback to modern TARGET_SCF_ACK_KREF usage for scsiback_cmd_done() callback path. It also also converts TMR -> scsiback_device_action() to use modern target_submit_tmr() code. Acked-by: Juergen Gross <jgross@suse.com> Tested-by: Juergen Gross <jgross@suse.com> Cc: Hannes Reinecke <hare@suse.de> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2016-03-10 21:48:20 -08:00
Nicholas Bellinger	2dbcdf33db	xen-scsiback: Convert to percpu_ida tag allocation This patch converts xen-scsiback to use percpu_ida tag pre-allocation for struct vscsibk_pend descriptor, in order to avoid fast-path struct vscsibk_pend memory allocations. Note by default this is currently hardcoded to 128. (Add wrapper for handling pending_req tag failure - Juergen) (Drop left-over se_cmd memset in scsiback_cmd_exec - Juergen) Acked-by: Juergen Gross <jgross@suse.com> Tested-by: Juergen Gross <jgross@suse.com> Cc: Hannes Reinecke <hare@suse.de> Cc: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2016-03-10 21:48:18 -08:00
Christoph Hellwig	fb444abe61	target: Convert demo-mode only drivers to target_alloc_session This patch converts existing loopback, usb-gadget, and xen-scsiback demo-mode only fabric drivers to use the new target_alloc_session API caller. This includes adding a new alloc_session callback for fabric driver internal nexus pointer assignments. (Fixes for early for-next nexus breakage - Dan Carpenter) Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Acked-by: Juergen Gross <jgross@suse.com> Tested-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Tested-by: Chris Boot <bootc@bootc.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2016-03-10 21:44:28 -08:00
Ingo Molnar	bc94b99636	Linux 4.5-rc6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJW0yM6AAoJEHm+PkMAQRiGeUwIAJRTHFPJTFpJcJjeZEV4/EL1 7Pl0WSHs/CWBkXIevAg2HgkECSQ9NI9FAUFvoGxCldDpFAnL1U2QV8+Ur2qhiXMG 5v0jILJuiw57qT/NfhEudZolerlRoHILmB3JRTb+DUV4GHZuWpTkJfUSI9j5aTEl w83XUgtK4bKeIyFbHdWQk6xqfzfFBSuEITuSXreOMwkFfMmeScE0WXOPLBZWyhPa v0rARJLYgM+vmRAnJjnG8unH+SgnqiNcn2oOFpevKwmpVcOjcEmeuxh/HdeZf7HM /R8F86OwdmXsO+z8dQxfcucLg+I9YmKfFr8b6hopu1sRztss2+Uk6H1j2J7IFIg= =tvkh -----END PGP SIGNATURE----- Merge tag 'v4.5-rc6' into core/resources, to resolve conflict Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-03-04 12:12:08 +01:00
Linus Torvalds	692b8c663c	Xen bug fixes for 4.5-rc5 - Two scsiback fixes (resource leak and spurious warning). - Fix DMA mapping of compound pages on arm/arm64. - Fix some pciback regressions in MSI-X handling. - Fix a pcifront crash due to some uninitialize state. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWyvatAAoJEFxbo/MsZsTRBFcH+wWnv0/N+gKib3cKCI4lwmTg n8iVgf8dNWwD36M2s/OlzCAglAIt8Xr6ySNvPqTerpm7lT9yXlIVQxGXTbIGuTAA h8Kt8WiC0BNLHHlLxBuCz62KR47DvMhsr84lFURE8FmpUiulFjXmRcbrZkHIMYRS l/X+xJWO1vxwrSYho0P9n3ksTWHm488DTPvZz3ICNI2G2sndDfbT3gv3tMDaQhcX ZaQR93vtIoldqk29Ga59vaVtksbgxHZIbasY9PQ8rqOxHJpDQbPzpjocoLxAzf50 cioQVyKQ7i9vUvZ+B3TTAOhxisA2hDwNhLGQzmjgxe2TXeKdo3yjYwO6m1dDBzY= =VY/S -----END PGP SIGNATURE----- Merge tag 'for-linus-4.5-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: - Two scsiback fixes (resource leak and spurious warning). - Fix DMA mapping of compound pages on arm/arm64. - Fix some pciback regressions in MSI-X handling. - Fix a pcifront crash due to some uninitialize state. * tag 'for-linus-4.5-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pcifront: Fix mysterious crashes when NUMA locality information was extracted. xen/pcifront: Report the errors better. xen/pciback: Save the number of MSI-X entries to be copied later. xen/pciback: Check PF instead of VF for PCI_COMMAND_MEMORY xen: fix potential integer overflow in queue_reply xen/arm: correctly handle DMA mapping of compound pages xen/scsiback: avoid warnings when adding multiple LUNs to a domain xen/scsiback: correct frontend counting	2016-02-22 13:57:01 -08:00
Konrad Rzeszutek Wilk	d159457b84	xen/pciback: Save the number of MSI-X entries to be copied later. Commit `8135cf8b09` (xen/pciback: Save xen_pci_op commands before processing it) broke enabling MSI-X because it would never copy the resulting vectors into the response. The number of vectors requested was being overwritten by the return value (typically zero for success). Save the number of vectors before processing the op, so the correct number of vectors are copied afterwards. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-02-15 14:21:10 +00:00
Konrad Rzeszutek Wilk	8d47065f7d	xen/pciback: Check PF instead of VF for PCI_COMMAND_MEMORY Commit `408fb0e5aa` (xen/pciback: Don't allow MSI-X ops if PCI_COMMAND_MEMORY is not set) prevented enabling MSI-X on passed-through virtual functions, because it checked the VF for PCI_COMMAND_MEMORY but this is not a valid bit for VFs. Instead, check the physical function for PCI_COMMAND_MEMORY. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-02-15 14:00:34 +00:00
Insu Yun	85c0a87cd1	xen: fix potential integer overflow in queue_reply When len is greater than UINT_MAX - sizeof(*rb), in next allocation, it can overflow integer range and allocates small size of heap. After that, memcpy will overflow the allocated heap. Therefore, it needs to check the size of given length. Signed-off-by: Insu Yun <wuninsu@gmail.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-02-15 13:56:57 +00:00
Juergen Gross	c9e2f531be	xen/scsiback: avoid warnings when adding multiple LUNs to a domain When adding more than one LUN to a frontend a warning for a failed assignment is issued in dom0 for each already existing LUN. Avoid this warning by checking for a LUN already existing when existence is allowed (scsiback_do_add_lun() called with try == 1). As the LUN existence check is needed now for a third time, factor it out into a function. This in turn leads to a more or less complete rewrite of scsiback_del_translation_entry() which will now return a proper error code in case of failure. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-02-08 16:51:36 +00:00
Juergen Gross	f285aa8db7	xen/scsiback: correct frontend counting When adding a new frontend to xen-scsiback don't decrement the number of active frontends in case of no error. Doing so results in a failure when trying to remove the xen-pvscsi nexus even if no domain is using it. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2016-02-08 16:44:40 +00:00
Toshi Kani	782b86641e	xen, mm: Set IORESOURCE_SYSTEM_RAM to System RAM Set IORESOURCE_SYSTEM_RAM in struct resource.flags of "System RAM" entries. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: David Vrabel <david.vrabel@citrix.com> # xen Cc: Andrew Banman <abanman@sgi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Gu Zheng <guz.fnst@cn.fujitsu.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-arch@vger.kernel.org Cc: linux-mm <linux-mm@kvack.org> Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1453841853-11383-9-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-01-30 09:49:58 +01:00
Linus Torvalds	2973737012	Merge branch 'stable/for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm Pull cleancache cleanups from Konrad Rzeszutek Wilk: "Simple cleanups" * 'stable/for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/mm: include/linux/cleancache.h: Clean up code cleancache: constify cleancache_ops structure	2016-01-29 15:13:48 -08:00
Julia Lawall	b3c6de492b	cleancache: constify cleancache_ops structure The cleancache_ops structure is never modified, so declare it as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2016-01-27 09:09:57 -05:00
Linus Torvalds	a200dcb346	virtio: barrier rework+fixes This adds a new kind of barrier, and reworks virtio and xen to use it. Plus some fixes here and there. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWlU2kAAoJECgfDbjSjVRpZ6IH/Ra19ecG8sCQo9zskr4zo22Z DZXC3u0sJDBYjjBAiw3IY1FKh7wx2Fr1RhUOj1bteBgcFCMCV1zInP5ITiCyzd1H YYh1w9C2tZaj2T4t9L4hIrAdtIF8fGS+oI2IojXPjOuDLEt6pfFBEjHp/sfl3UJq ZmZvw4OXviSNej7jBw8Xni3Uv18yfmLGXvMdkvMSPC1/XL29voGDqTVwhqJwxLVz k/ZLcKFOzIs9N7Nja0Jl1EiZtC2Y9cpItqweicNAzszlpkSL44vQxmCSefB+WyQ4 gt0O3+AxYkLfrxzCBhUA4IpRex3/XPW1b+1e/V1XjfR2n/FlyLe+AIa8uPJElFc= =ukaV -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio barrier rework+fixes from Michael Tsirkin: "This adds a new kind of barrier, and reworks virtio and xen to use it. Plus some fixes here and there" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (44 commits) checkpatch: add virt barriers checkpatch: check for __smp outside barrier.h checkpatch.pl: add missing memory barriers virtio: make find_vqs() checkpatch.pl-friendly virtio_balloon: fix race between migration and ballooning virtio_balloon: fix race by fill and leak s390: more efficient smp barriers s390: use generic memory barriers xen/events: use virt_xxx barriers xen/io: use virt_xxx barriers xenbus: use virt_xxx barriers virtio_ring: use virt_store_mb sh: move xchg_cmpxchg to a header by itself sh: support 1 and 2 byte xchg virtio_ring: update weak barriers to use virt_xxx Revert "virtio_ring: Update weak barriers to use dma_wmb/rmb" asm-generic: implement virt_xxx memory barriers x86: define __smp_xxx xtensa: define __smp_xxx tile: define __smp_xxx ...	2016-01-18 16:44:24 -08:00
Michael S. Tsirkin	234927540e	xen/events: use virt_xxx barriers drivers/xen/events/events_fifo.c uses rmb() to communicate with the other side. For guests compiled with CONFIG_SMP, smp_rmb would be sufficient, so rmb() here is only needed if a non-SMP guest runs on an SMP host. Switch to the virt_rmb barrier which serves this exact purpose. Pull in asm/barrier.h here to make sure the file is self-contained. Suggested-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2016-01-12 20:47:04 +02:00
Michael S. Tsirkin	5bb0c9be2a	xenbus: use virt_xxx barriers drivers/xen/xenbus/xenbus_comms.c uses full memory barriers to communicate with the other side. For guests compiled with CONFIG_SMP, smp_wmb and smp_mb would be sufficient, so mb() and wmb() here are only needed if a non-SMP guest runs on an SMP host. Switch to virt_xxx barriers which serve this exact purpose. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>	2016-01-12 20:47:03 +02:00
David Vrabel	a4cdb556ca	xen/gntdev: add ioctl for grant copy Add IOCTL_GNTDEV_GRANT_COPY to allow applications to copy between user space buffers and grant references. This interface is similar to the GNTTABOP_copy hypercall ABI except the local buffers are provided using a virtual address (instead of a GFN and offset). To avoid userspace from having to page align its buffers the driver will use two or more ops if required. If the ioctl returns 0, the application must check the status of each segment with the segments status field. If the ioctl returns a -ve error code (EINVAL or EFAULT), the status of individual ops is undefined. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2016-01-07 13:21:53 +00:00
Julia Lawall	b9c0a92a9a	xen/gntdev: constify mmu_notifier_ops structures This mmu_notifier_ops structure is never modified, so declare it as const, like the other mmu_notifier_ops structures. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-12-21 14:41:01 +00:00
Julia Lawall	86fc213673	xen/grant-table: constify gnttab_ops structure The gnttab_ops structure is never modified, so declare it as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-12-21 14:41:01 +00:00
Stefano Stabellini	2dd887e321	xen/time: use READ_ONCE Use READ_ONCE through the code, rather than explicit barriers. Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-12-21 14:41:00 +00:00
Stefano Stabellini	cfafae9403	xen: rename dom0_op to platform_op The dom0_op hypercall has been renamed to platform_op since Xen 3.2, which is ancient, and modern upstream Linux kernels cannot run as dom0 and it anymore anyway. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2015-12-21 14:40:55 +00:00
Stefano Stabellini	4ccefbe597	xen: move xen_setup_runstate_info and get_runstate_snapshot to drivers/xen/time.c Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2015-12-21 14:40:52 +00:00
Linus Torvalds	3273cba195	xen: bug fixes for 4.4-rc5 - XSA-155 security fixes to backend drivers. - XSA-157 security fixes to pciback. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWdDrXAAoJEFxbo/MsZsTR3N0H/0Lvz6MWBARCje7livbz7nqE PS0Bea+2yAfNhCDDiDlpV0lor8qlyfWDF6lGhLjItldAzahag3ZDKDf1Z/lcQvhf 3MwFOcOVZE8lLtvLT6LGnPuehi1Mfdi1Qk1/zQhPhsq6+FLPLT2y+whmBihp8mMh C12f7KRg5r3U7eZXNB6MEtGA0RFrOp0lBdvsiZx3qyVLpezj9mIe0NueQqwY3QCS xQ0fILp/x2EnZNZuzgghFTPRxMAx5ReOezgn9Rzvq4aThD+irz1y6ghkYN4rG2s2 tyYOTqBnjJEJEQ+wmYMhnfCwVvDffztG+uI9hqN31QFJiNB0xsjSWFCkDAWchiU= =Argz -----END PGP SIGNATURE----- Merge tag 'for-linus-4.4-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: - XSA-155 security fixes to backend drivers. - XSA-157 security fixes to pciback. * tag 'for-linus-4.4-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen-pciback: fix up cleanup path when alloc fails xen/pciback: Don't allow MSI-X ops if PCI_COMMAND_MEMORY is not set. xen/pciback: For XEN_PCI_OP_disable_msi[\|x] only disable if device has MSI(X) enabled. xen/pciback: Do not install an IRQ handler for MSI interrupts. xen/pciback: Return error on XEN_PCI_OP_enable_msix when device has MSI or MSI-X enabled xen/pciback: Return error on XEN_PCI_OP_enable_msi when device has MSI or MSI-X enabled xen/pciback: Save xen_pci_op commands before processing it xen-scsiback: safely copy requests xen-blkback: read from indirect descriptors only once xen-blkback: only read request operation from shared ring once xen-netback: use RING_COPY_REQUEST() throughout xen-netback: don't use last request to determine minimum Tx credit xen: Add RING_COPY_REQUEST() xen/x86/pvh: Use HVM's flush_tlb_others op xen: Resume PMU from non-atomic context xen/events/fifo: Consume unprocessed events when a CPU dies	2015-12-18 12:24:52 -08:00
Doug Goldstein	584a561a6f	xen-pciback: fix up cleanup path when alloc fails When allocating a pciback device fails, clear the private field. This could lead to an use-after free, however the 'really_probe' takes care of setting dev_set_drvdata(dev, NULL) in its failure path (which we would exercise if the ->probe function failed), so we we are OK. However lets be defensive as the code can change. Going forward we should clean up the pci_set_drvdata(dev, NULL) in the various code-base. That will be for another day. Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reported-by: Jonathan Creekmore <jonathan.creekmore@gmail.com> Signed-off-by: Doug Goldstein <cardoe@cardoe.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2015-12-18 11:38:48 -05:00
Konrad Rzeszutek Wilk	408fb0e5aa	xen/pciback: Don't allow MSI-X ops if PCI_COMMAND_MEMORY is not set. commit `f598282f51` ("PCI: Fix the NIU MSI-X problem in a better way") teaches us that dealing with MSI-X can be troublesome. Further checks in the MSI-X architecture shows that if the PCI_COMMAND_MEMORY bit is turned of in the PCI_COMMAND we may not be able to access the BAR (since they are memory regions). Since the MSI-X tables are located in there.. that can lead to us causing PCIe errors. Inhibit us performing any operation on the MSI-X unless the MEMORY bit is set. Note that Xen hypervisor with: "x86/MSI-X: access MSI-X table only after having enabled MSI-X" will return: xen_pciback: 0000:0a:00.1: error -6 enabling MSI-X for guest 3! When the generic MSI code tries to setup the PIRQ without MEMORY bit set. Which means with later versions of Xen (4.6) this patch is not neccessary. This is part of XSA-157 CC: stable@vger.kernel.org Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2015-12-18 10:48:39 -05:00
Konrad Rzeszutek Wilk	7cfb905b96	xen/pciback: For XEN_PCI_OP_disable_msi[\|x] only disable if device has MSI(X) enabled. Otherwise just continue on, returning the same values as previously (return of 0, and op->result has the PIRQ value). This does not change the behavior of XEN_PCI_OP_disable_msi[\|x]. The pci_disable_msi or pci_disable_msix have the checks for msi_enabled or msix_enabled so they will error out immediately. However the guest can still call these operations and cause us to disable the 'ack_intr'. That means the backend IRQ handler for the legacy interrupt will not respond to interrupts anymore. This will lead to (if the device is causing an interrupt storm) for the Linux generic code to disable the interrupt line. Naturally this will only happen if the device in question is plugged in on the motherboard on shared level interrupt GSI. This is part of XSA-157 CC: stable@vger.kernel.org Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2015-12-18 10:48:37 -05:00
Konrad Rzeszutek Wilk	a396f3a210	xen/pciback: Do not install an IRQ handler for MSI interrupts. Otherwise an guest can subvert the generic MSI code to trigger an BUG_ON condition during MSI interrupt freeing: for (i = 0; i < entry->nvec_used; i++) BUG_ON(irq_has_action(entry->irq + i)); Xen PCI backed installs an IRQ handler (request_irq) for the dev->irq whenever the guest writes PCI_COMMAND_MEMORY (or PCI_COMMAND_IO) to the PCI_COMMAND register. This is done in case the device has legacy interrupts the GSI line is shared by the backend devices. To subvert the backend the guest needs to make the backend to change the dev->irq from the GSI to the MSI interrupt line, make the backend allocate an interrupt handler, and then command the backend to free the MSI interrupt and hit the BUG_ON. Since the backend only calls 'request_irq' when the guest writes to the PCI_COMMAND register the guest needs to call XEN_PCI_OP_enable_msi before any other operation. This will cause the generic MSI code to setup an MSI entry and populate dev->irq with the new PIRQ value. Then the guest can write to PCI_COMMAND PCI_COMMAND_MEMORY and cause the backend to setup an IRQ handler for dev->irq (which instead of the GSI value has the MSI pirq). See 'xen_pcibk_control_isr'. Then the guest disables the MSI: XEN_PCI_OP_disable_msi which ends up triggering the BUG_ON condition in 'free_msi_irqs' as there is an IRQ handler for the entry->irq (dev->irq). Note that this cannot be done using MSI-X as the generic code does not over-write dev->irq with the MSI-X PIRQ values. The patch inhibits setting up the IRQ handler if MSI or MSI-X (for symmetry reasons) code had been called successfully. P.S. Xen PCIBack when it sets up the device for the guest consumption ends up writting 0 to the PCI_COMMAND (see xen_pcibk_reset_device). XSA-120 addendum patch removed that - however when upstreaming said addendum we found that it caused issues with qemu upstream. That has now been fixed in qemu upstream. This is part of XSA-157 CC: stable@vger.kernel.org Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2015-12-18 10:48:34 -05:00
Konrad Rzeszutek Wilk	5e0ce1455c	xen/pciback: Return error on XEN_PCI_OP_enable_msix when device has MSI or MSI-X enabled The guest sequence of: a) XEN_PCI_OP_enable_msix b) XEN_PCI_OP_enable_msix results in hitting an NULL pointer due to using freed pointers. The device passed in the guest MUST have MSI-X capability. The a) constructs and SysFS representation of MSI and MSI groups. The b) adds a second set of them but adding in to SysFS fails (duplicate entry). 'populate_msi_sysfs' frees the newly allocated msi_irq_groups (note that in a) pdev->msi_irq_groups is still set) and also free's ALL of the MSI-X entries of the device (the ones allocated in step a) and b)). The unwind code: 'free_msi_irqs' deletes all the entries and tries to delete the pdev->msi_irq_groups (which hasn't been set to NULL). However the pointers in the SysFS are already freed and we hit an NULL pointer further on when 'strlen' is attempted on a freed pointer. The patch adds a simple check in the XEN_PCI_OP_enable_msix to guard against that. The check for msi_enabled is not stricly neccessary. This is part of XSA-157 CC: stable@vger.kernel.org Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2015-12-18 10:48:29 -05:00
Konrad Rzeszutek Wilk	56441f3c8e	xen/pciback: Return error on XEN_PCI_OP_enable_msi when device has MSI or MSI-X enabled The guest sequence of: a) XEN_PCI_OP_enable_msi b) XEN_PCI_OP_enable_msi c) XEN_PCI_OP_disable_msi results in hitting an BUG_ON condition in the msi.c code. The MSI code uses an dev->msi_list to which it adds MSI entries. Under the above conditions an BUG_ON() can be hit. The device passed in the guest MUST have MSI capability. The a) adds the entry to the dev->msi_list and sets msi_enabled. The b) adds a second entry but adding in to SysFS fails (duplicate entry) and deletes all of the entries from msi_list and returns (with msi_enabled is still set). c) pci_disable_msi passes the msi_enabled checks and hits: BUG_ON(list_empty(dev_to_msi_list(&dev->dev))); and blows up. The patch adds a simple check in the XEN_PCI_OP_enable_msi to guard against that. The check for msix_enabled is not stricly neccessary. This is part of XSA-157. CC: stable@vger.kernel.org Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2015-12-18 10:48:19 -05:00
Konrad Rzeszutek Wilk	8135cf8b09	xen/pciback: Save xen_pci_op commands before processing it Double fetch vulnerabilities that happen when a variable is fetched twice from shared memory but a security check is only performed the first time. The xen_pcibk_do_op function performs a switch statements on the op->cmd value which is stored in shared memory. Interestingly this can result in a double fetch vulnerability depending on the performed compiler optimization. This patch fixes it by saving the xen_pci_op command before processing it. We also use 'barrier' to make sure that the compiler does not perform any optimization. This is part of XSA155. CC: stable@vger.kernel.org Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jan Beulich <JBeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2015-12-18 10:00:47 -05:00
David Vrabel	be69746ec1	xen-scsiback: safely copy requests The copy of the ring request was lacking a following barrier(), potentially allowing the compiler to optimize the copy away. Use RING_COPY_REQUEST() to ensure the request is copied to local memory. This is part of XSA155. CC: stable@vger.kernel.org Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2015-12-18 10:00:41 -05:00
Ross Lagerwall	3de88d622f	xen/events/fifo: Consume unprocessed events when a CPU dies When a CPU is offlined, there may be unprocessed events on a port for that CPU. If the port is subsequently reused on a different CPU, it could be in an unexpected state with the link bit set, resulting in interrupts being missed. Fix this by consuming any unprocessed events for a particular CPU when that CPU dies. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: <stable@vger.kernel.org> # 3.14+ Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-12-02 13:23:25 +00:00
Linus Torvalds	4fe5e199eb	xen: bug fixes for 4.4-rc2 - Fix gntdev and numa balancing. - Fix x86 boot crash due to unallocated legacy irq descs. - Fix overflow in evtchn device when > 1024 event channels. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWV1coAAoJEFxbo/MsZsTROo8H/1D69XtlQmLrAKWq4JafZrXM rYQoiRxW/yDNoA3whtOcK4TLf/JpA+B1VAoekXqSEG5Mv9YbIH1Su/y4KwF7WaeX xSL812ODeN8iYk8A52Zccw0gdl/emzLesPLuq5UrdDhehYp8vQGtk/CdvZIiQAAc of5Ds9ozIuKTcwDkxOZdUrSG0DvCuvhHBz4xrmuKkbs8CAornfQGBUPKb+vkS05b 2IVzFhCtM2Bhsb8Ji4TfNjsH90T9tghb/QG73APniRMx+hn7CUHkifZ074tnGATp LdXCJ8D5C8WZx0QCklzcBZUpXbwWv9AWyZR8gZqhGUCMh9XGgByC3lqsMGFgwiM= =5872 -----END PGP SIGNATURE----- Merge tag 'for-linus-4.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: - Fix gntdev and numa balancing. - Fix x86 boot crash due to unallocated legacy irq descs. - Fix overflow in evtchn device when > 1024 event channels. * tag 'for-linus-4.4-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/evtchn: dynamically grow pending event channel ring xen/events: Always allocate legacy interrupts on PV guests xen/gntdev: Grant maps should not be subject to NUMA balancing	2015-11-26 11:42:25 -08:00
David Vrabel	8620015499	xen/evtchn: dynamically grow pending event channel ring If more than 1024 event channels are bound to a evtchn device then it possible (even with well behaved applications) for the ring to overflow and events to be lost (reported as an -EFBIG error). Dynamically increase the size of the ring so there is always enough space for all bound events. Well behaved applicables that only unmask events after draining them from the ring can thus no longer lose events. However, an application could unmask an event before draining it, allowing multiple entries per port to accumulate in the ring, and a overflow could still occur. So the overflow detection and reporting is retained. The ring size is initially only 64 entries so the common use case of an application only binding a few events will use less memory than before. The ring size may grow to 512 KiB (enough for all 2^17 possible channels). This order 7 kmalloc() may fail due to memory fragmentation, so we fall back to trying vmalloc(). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>	2015-11-26 18:49:54 +00:00
Boris Ostrovsky	b4ff8389ed	xen/events: Always allocate legacy interrupts on PV guests After commit `8c058b0b9c` ("x86/irq: Probe for PIC presence before allocating descs for legacy IRQs") early_irq_init() will no longer preallocate descriptors for legacy interrupts if PIC does not exist, which is the case for Xen PV guests. Therefore we may need to allocate those descriptors ourselves. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-11-26 18:05:01 +00:00
Boris Ostrovsky	9c17d96500	xen/gntdev: Grant maps should not be subject to NUMA balancing Doing so will cause the grant to be unmapped and then, during fault handling, the fault to be mistakenly treated as NUMA hint fault. In addition, even if those maps could partcipate in NUMA balancing, it wouldn't provide any benefit since we are unable to determine physical page's node (even if/when VNUMA is implemented). Marking grant maps' VMAs as VM_IO will exclude them from being part of NUMA balancing. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-11-26 17:47:35 +00:00
Linus Torvalds	9aa3d651a9	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "This series contains HCH's changes to absorb configfs attribute ->show() + ->store() function pointer usage from it's original tree-wide consumers, into common configfs code. It includes usb-gadget, target w/ drivers, netconsole and ocfs2 changes to realize the improved simplicity, that now renders the original include/target/configfs_macros.h CPP magic for fabric drivers and others, unnecessary and obsolete. And with common code in place, new configfs attributes can be added easier than ever before. Note, there are further improvements in-flight from other folks for v4.5 code in configfs land, plus number of target fixes for post -rc1 code" In the meantime, a new user of the now-removed old configfs API came in through the char/misc tree in commit `7bd1d4093c` ("stm class: Introduce an abstraction for System Trace Module devices"). This merge resolution comes from Alexander Shishkin, who updated his stm class tracing abstraction to account for the removal of the old show_attribute and store_attribute methods in commit `517982229f` ("configfs: remove old API") from this pull. As Alexander says about that patch: "There's no need to keep an extra wrapper structure per item and the awkward show_attribute/store_attribute item ops are no longer needed. This patch converts policy code to the new api, all the while making the code quite a bit smaller and easier on the eyes. Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>" That patch was folded into the merge so that the tree should be fully bisectable. * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (23 commits) configfs: remove old API ocfs2/cluster: use per-attribute show and store methods ocfs2/cluster: move locking into attribute store methods netconsole: use per-attribute show and store methods target: use per-attribute show and store methods spear13xx_pcie_gadget: use per-attribute show and store methods dlm: use per-attribute show and store methods usb-gadget/f_serial: use per-attribute show and store methods usb-gadget/f_phonet: use per-attribute show and store methods usb-gadget/f_obex: use per-attribute show and store methods usb-gadget/f_uac2: use per-attribute show and store methods usb-gadget/f_uac1: use per-attribute show and store methods usb-gadget/f_mass_storage: use per-attribute show and store methods usb-gadget/f_sourcesink: use per-attribute show and store methods usb-gadget/f_printer: use per-attribute show and store methods usb-gadget/f_midi: use per-attribute show and store methods usb-gadget/f_loopback: use per-attribute show and store methods usb-gadget/ether: use per-attribute show and store methods usb-gadget/f_acm: use per-attribute show and store methods usb-gadget/f_hid: use per-attribute show and store methods ...	2015-11-13 20:04:17 -08:00
Stefano Stabellini	1c7a62137b	xen, cpu_hotplug: call device_offline instead of cpu_down When offlining a cpu, instead of cpu_down, call device_offline, which also takes care of updating the cpu.dev.offline field. This keeps the sysfs file /sys/devices/system/cpu/cpuN/online, up to date. Also move the call to disable_hotplug_cpu, because it makes more sense to have it there. We don't call device_online at cpu-hotplug time, because that would immediately take the cpu online, while we want to retain the current behaviour: the user needs to explicitly enable the cpu after it has been hotplugged. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> CC: konrad.wilk@oracle.com CC: boris.ostrovsky@oracle.com CC: david.vrabel@citrix.com	2015-10-23 14:20:48 +01:00
Stefano Stabellini	a314e3eb84	xen/arm: Enable cpu_hotplug.c Build cpu_hotplug for ARM and ARM64 guests. Rename arch_(un)register_cpu to xen_(un)register_cpu and provide an empty implementation on ARM and ARM64. On x86 just call arch_(un)register_cpu as we are already doing. Initialize cpu_hotplug on ARM. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2015-10-23 14:20:47 +01:00
Julien Grall	89bf4b4e4a	xenbus: Support multiple grants ring with 64KB The PV ring may use multiple grants and expect them to be mapped contiguously in the virtual memory. Although, the current code is relying on a Linux page will be mapped to a single grant. On build where Linux is using a different page size than the grant (i.e other than 4KB), the grant will always be mapped on the first 4KB of each Linux page which make the final ring not contiguous in the memory. This can be fixed by mapping multiple grant in a same Linux page. Signed-off-by: Julien Grall <julien.grall@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:47 +01:00
Julien Grall	f73314b281	xen/grant-table: Add an helper to iterate over a specific number of grants With the 64KB page granularity support on ARM64, a Linux page may be split accross multiple grant. Currently we have the helper gnttab_foreach_grant_in_grant to break a Linux page based on an offset and a len, but it doesn't fit when we only have a number of grants in hand. Introduce a new helper which take an array of Linux page and a number of grant and will figure out the address of each grant. Signed-off-by: Julien Grall <julien.grall@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:46 +01:00
Julien Grall	9cce2914e2	xen/xenbus: Rename RING_PAGE to RING_GRANT Linux may use a different page size than the size of grant. So make clear that the order is actually in number of grant. Signed-off-by: Julien Grall <julien.grall@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:46 +01:00
Julien Grall	3990dd2703	xen/balloon: Use the correct sizeof when declaring frame_list The type of the item in frame_list is xen_pfn_t which is not an unsigned long on ARM but an uint64_t. With the current computation, the size of frame_list will be 2 * PAGE_SIZE rather than PAGE_SIZE. I bet it's just mistake when the type has been switched from "unsigned long" to "xen_pfn_t" in commit `965c0aaafe` "xen: balloon: use correct type for frame_list". Signed-off-by: Julien Grall <julien.grall@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:44 +01:00
Julien Grall	9435cce879	xen/swiotlb: Add support for 64KB page granularity Swiotlb is used on ARM64 to support DMA on platform where devices are not protected by an SMMU. Furthermore it's only enabled for DOM0. While Xen is always using 4KB page granularity in the stage-2 page table, Linux ARM64 may either use 4KB or 64KB. This means that a Linux page can be spanned accross multiple Xen page. The Swiotlb code has to validate that the buffer used for DMA is physically contiguous in the memory. As a Linux page can't be shared between local memory and foreign page by design (the balloon code always removing entirely a Linux page), the changes in the code are very minimal because we only need to check the first Xen PFN. Note that it may be possible to optimize the function check_page_physically_contiguous to avoid looping over every Xen PFN for local memory. Although I will let this optimization for a follow-up. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:43 +01:00
Julien Grall	291be10fd7	xen/swiotlb: Pass addresses rather than frame numbers to xen_arch_need_swiotlb With 64KB page granularity support, the frame number will be different. It will be easier to modify the behavior in a single place rather than in each caller. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:43 +01:00
Julien Grall	5995a68a62	xen/privcmd: Add support for Linux 64KB page granularity The hypercall interface (as well as the toolstack) is always using 4KB page granularity. When the toolstack is asking for mapping a series of guest PFN in a batch, it expects to have the page map contiguously in its virtual memory. When Linux is using 64KB page granularity, the privcmd driver will have to map multiple Xen PFN in a single Linux page. Note that this solution works on page granularity which is a multiple of 4KB. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:42 +01:00
Julien Grall	5ed5451d99	xen/grant-table: Make it running on 64KB granularity The Xen interface is using 4KB page granularity. This means that each grant is 4KB. The current implementation allocates a Linux page per grant. On Linux using 64KB page granularity, only the first 4KB of the page will be used. We could decrease the memory wasted by sharing the page with multiple grant. It will require some care with the {Set,Clear}ForeignPage macro. Note that no changes has been made in the x86 code because both Linux and Xen will only use 4KB page granularity. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:39 +01:00
Julien Grall	a001c9d95c	xen/events: fifo: Make it running on 64KB granularity Only use the first 4KB of the page to store the events channel info. It means that we will waste 60KB every time we allocate page for: * control block: a page is allocating per CPU * event array: a page is allocating everytime we need to expand it I think we can reduce the memory waste for the 2 areas by: * control block: sharing between multiple vCPUs. Although it will require some bookkeeping in order to not free the page when the CPU goes offline and the other CPUs sharing the page still there * event array: always extend the array event by 64K (i.e 16 4K chunk). That would require more care when we fail to expand the event channel. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:38 +01:00
Julien Grall	30756c6299	xen/balloon: Don't rely on the page granularity is the same for Xen and Linux For ARM64 guests, Linux is able to support either 64K or 4K page granularity. Although, the hypercall interface is always based on 4K page granularity. With 64K page granularity, a single page will be spread over multiple Xen frame. To avoid splitting the page into 4K frame, take advantage of the extent_order field to directly allocate/free chunk of the Linux page size. Note that PVMMU is only used for PV guest (which is x86) and the page granularity is always 4KB. Some BUILD_BUG_ON has been added to ensure that because the code has not been modified. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:38 +01:00
Julien Grall	7d567928db	xen/xenbus: Use Xen page definition All the ring (xenstore, and PV rings) are always based on the page granularity of Xen. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:37 +01:00
Julien Grall	36f8abd36f	xen/biomerge: Don't allow biovec's to be merged when Linux is not using 4KB pages On ARM all dma-capable devices on a same platform may not be protected by an IOMMU. The DMA requests have to use the BFN (i.e MFN on ARM) in order to use correctly the device. While the DOM0 memory is allocated in a 1:1 fashion (PFN == MFN), grant mapping will screw this contiguous mapping. When Linux is using 64KB page granularitary, the page may be split accross multiple non-contiguous MFN (Xen is using 4KB page granularity). Therefore a DMA request will likely fail. Checking that a 64KB page is using contiguous MFN is tedious. For now, always says that biovec are not mergeable. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:36 +01:00
Julien Grall	008c320a96	xen/grant: Introduce helpers to split a page into grant Currently, a grant is always based on the Xen page granularity (i.e 4KB). When Linux is using a different page granularity, a single page will be split between multiple grants. The new helpers will be in charge of splitting the Linux page into grants and call a function given by the caller on each grant. Also provide an helper to count the number of grants within a given contiguous region. Note that the x86/include/asm/xen/page.h is now including xen/interface/grant_table.h rather than xen/grant_table.h. It's necessary because xen/grant_table.h depends on asm/xen/page.h and will break the compilation. Furthermore, only definition in interface/grant_table.h is required. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-10-23 14:20:33 +01:00
David Vrabel	4a69c909de	xen/balloon: pre-allocate p2m entries for ballooned pages Pages returned by alloc_xenballooned_pages() will be used for grant mapping which will call set_phys_to_machine() (in PV guests). Ballooned pages are set as INVALID_P2M_ENTRY in the p2m and thus may be using the (shared) missing tables and a subsequent set_phys_to_machine() will need to allocate new tables. Since the grant mapping may be done from a context that cannot sleep, the p2m entries must already be allocated. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>	2015-10-23 14:20:31 +01:00
David Vrabel	1cf6a6c829	xen/balloon: use hotplugged pages for foreign mappings etc. alloc_xenballooned_pages() is used to get ballooned pages to back foreign mappings etc. Instead of having to balloon out real pages, use (if supported) hotplugged memory. This makes more memory available to the guest and reduces fragmentation in the p2m. This is only enabled if the xen.balloon.hotplug_unpopulated sysctl is set to 1. This sysctl defaults to 0 in case the udev rules to automatically online hotplugged memory do not exist. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> --- v3: - Add xen.balloon.hotplug_unpopulated sysctl to enable use of hotplug for unpopulated pages.	2015-10-23 14:20:05 +01:00
David Vrabel	81b286e0f1	xen/balloon: make alloc_xenballoon_pages() always allocate low pages All users of alloc_xenballoon_pages() wanted low memory pages, so remove the option for high memory. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>	2015-10-23 14:20:05 +01:00
David Vrabel	b2ac6aa8f7	xen/balloon: only hotplug additional memory if required Now that we track the total number of pages (included hotplugged regions), it is easy to determine if more memory needs to be hotplugged. Add a new BP_WAIT state to signal that the balloon process needs to wait until kicked by the memory add notifier (when the new section is onlined by userspace). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> --- v3: - Return BP_WAIT if enough sections are already hotplugged. v2: - New BP_WAIT status after adding new memory sections.	2015-10-23 14:20:04 +01:00
David Vrabel	de5a77d842	xen/balloon: rationalize memory hotplug stats The stats used for memory hotplug make no sense and are fiddled with in odd ways. Remove them and introduce total_pages to track the total number of pages (both populated and unpopulated) including those within hotplugged regions (note that this includes not yet onlined pages). This will be used in a subsequent commit (xen/balloon: only hotplug additional memory if required) when deciding whether additional memory needs to be hotplugged. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>	2015-10-23 14:20:04 +01:00
David Vrabel	55b3da98a4	xen/balloon: find non-conflicting regions to place hotplugged memory Instead of placing hotplugged memory at the end of RAM (which may conflict with PCI devices or reserved regions) use allocate_resource() to get a new, suitably aligned resource that does not conflict. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> --- v3: - Remove stale comment.	2015-10-23 14:20:03 +01:00
Christoph Hellwig	2eafd72939	target: use per-attribute show and store methods This also allows to remove the target-specific old configfs macros, and gets rid of the target_core_fabric_configfs.h header which only had one function declaration left that could be moved to a better place. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org> Acked-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>	2015-10-13 22:17:49 -07:00
Linus Torvalds	33e247c7e5	Merge branch 'akpm' (patches from Andrew) Merge third patch-bomb from Andrew Morton: - even more of the rest of MM - lib/ updates - checkpatch updates - small changes to a few scruffy filesystems - kmod fixes/cleanups - kexec updates - a dma-mapping cleanup series from hch * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (81 commits) dma-mapping: consolidate dma_set_mask dma-mapping: consolidate dma_supported dma-mapping: cosolidate dma_mapping_error dma-mapping: consolidate dma_{alloc,free}_noncoherent dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent} mm: use vma_is_anonymous() in create_huge_pmd() and wp_huge_pmd() mm: make sure all file VMAs have ->vm_ops set mm, mpx: add "vm_flags_t vm_flags" arg to do_mmap_pgoff() mm: mark most vm_operations_struct const namei: fix warning while make xmldocs caused by namei.c ipc: convert invalid scenarios to use WARN_ON zlib_deflate/deftree: remove bi_reverse() lib/decompress_unlzma: Do a NULL check for pointer lib/decompressors: use real out buf size for gunzip with kernel fs/affs: make root lookup from blkdev logical size sysctl: fix int -> unsigned long assignments in INT_MIN case kexec: export KERNEL_IMAGE_SIZE to vmcoreinfo kexec: align crash_notes allocation to make it be inside one physical page kexec: remove unnecessary test in kimage_alloc_crash_control_pages() kexec: split kexec_load syscall from kexec core code ...	2015-09-10 18:19:42 -07:00
Linus Torvalds	06ab838c20	xen: MFN/GFN/BFN terminology changes for 4.3-rc0 - Use the correct GFN/BFN terms more consistently. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJV8VRMAAoJEFxbo/MsZsTRiGQH/i/jrAJUJfrFC2PINaA2gDwe O0dlrkCiSgAYChGmxxxXZQSPM5Po5+EbT/dLjZ/uvSooeorM9RYY/mFo7ut/qLep 4pyQUuwGtebWGBZTrj9sygUVXVhgJnyoZxskNUbhj9zvP7hb9++IiI78mzne6cpj lCh/7Z2dgpfRcKlNRu+qpzP79Uc7OqIfDK+IZLrQKlXa7IQDJTQYoRjbKpfCtmMV BEG3kN9ESx5tLzYiAfxvaxVXl9WQFEoktqe9V8IgOQlVRLgJ2DQWS6vmraGrokWM 3HDOCHtRCXlPhu1Vnrp0R9OgqWbz8FJnmVAndXT8r3Nsjjmd0aLwhJx7YAReO/4= =JDia -----END PGP SIGNATURE----- Merge tag 'for-linus-4.3-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen terminology fixes from David Vrabel: "Use the correct GFN/BFN terms more consistently" * tag 'for-linus-4.3-rc0b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/xenbus: Rename the variable xen_store_mfn to xen_store_gfn xen/privcmd: Further s/MFN/GFN/ clean-up hvc/xen: Further s/MFN/GFN clean-up video/xen-fbfront: Further s/MFN/GFN clean-up xen/tmem: Use xen_page_to_gfn rather than pfn_to_gfn xen: Use correctly the Xen memory terminologies arm/xen: implement correctly pfn_to_mfn xen: Make clear that swiotlb and biomerge are dealing with DMA address	2015-09-10 16:21:11 -07:00
Christoph Hellwig	6894258eda	dma-mapping: consolidate dma_{alloc,free}_{attrs,coherent} Since 2009 we have a nice asm-generic header implementing lots of DMA API functions for architectures using struct dma_map_ops, but unfortunately it's still missing a lot of APIs that all architectures still have to duplicate. This series consolidates the remaining functions, although we still need arch opt outs for two of them as a few architectures have very non-standard implementations. This patch (of 5): The coherent DMA allocator works the same over all architectures supporting dma_map operations. This patch consolidates them and converges the minor differences: - the debug_dma helpers are now called from all architectures, including those that were previously missing them - dma_alloc_from_coherent and dma_release_from_coherent are now always called from the generic alloc/free routines instead of the ops dma-mapping-common.h always includes dma-coherent.h to get the defintions for them, or the stubs if the architecture doesn't support this feature - checks for ->alloc / ->free presence are removed. There is only one magic instead of dma_map_ops without them (mic_dma_ops) and that one is x86 only anyway. Besides that only x86 needs special treatment to replace a default devices if none is passed and tweak the gfp_flags. An optional arch hook is provided for that. [linux@roeck-us.net: fix build] [jcmvbkbc@gmail.com: fix xtensa] Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-10 13:29:01 -07:00
Kirill A. Shutemov	7cbea8dc01	mm: mark most vm_operations_struct const With two exceptions (drm/qxl and drm/radeon) all vm_operations_struct structs should be constant. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-09-10 13:29:01 -07:00
Linus Torvalds	752240e74d	xen: features and fixes for 4.3-rc0 - Convert xen-blkfront to the multiqueue API - [arm] Support binding event channels to different VCPUs. - [x86] Support > 512 GiB in a PV guests (off by default as such a guest cannot be migrated with the current toolstack). - [x86] PMU support for PV dom0 (limited support for using perf with Xen and other guests). -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJV7wIdAAoJEFxbo/MsZsTR0hEH/04HTKLKGnSJpZ5WbMPxqZxE UqGlvhvVWNAmFocZmbPcEi9T1qtcFrX5pM55JQr6UmAp3ovYsT2q1Q1kKaOaawks pSfc/YEH3oQW5VUQ9Lm9Ru5Z8Btox0WrzRREO92OF36UOgUOBOLkGsUfOwDinNIM lSk2djbYwDYAsoeC3PHB32wwMI//Lz6B/9ZVXcyL6ULynt1ULdspETjGnptRPZa7 JTB5L4/soioKOn18HDwwOhKmvaFUPQv9Odnv7dc85XwZreajhM/KMu3qFbMDaF/d WVB1NMeCBdQYgjOrUjrmpyr5uTMySiQEG54cplrEKinfeZgKlEyjKvjcAfJfiac= =Ktjl -----END PGP SIGNATURE----- Merge tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from David Vrabel: "Xen features and fixes for 4.3: - Convert xen-blkfront to the multiqueue API - [arm] Support binding event channels to different VCPUs. - [x86] Support > 512 GiB in a PV guests (off by default as such a guest cannot be migrated with the current toolstack). - [x86] PMU support for PV dom0 (limited support for using perf with Xen and other guests)" * tag 'for-linus-4.3-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (33 commits) xen: switch extra memory accounting to use pfns xen: limit memory to architectural maximum xen: avoid another early crash of memory limited dom0 xen: avoid early crash of memory limited dom0 arm/xen: Remove helpers which are PV specific xen/x86: Don't try to set PCE bit in CR4 xen/PMU: PMU emulation code xen/PMU: Intercept PMU-related MSR and APIC accesses xen/PMU: Describe vendor-specific PMU registers xen/PMU: Initialization code for Xen PMU xen/PMU: Sysfs interface for setting Xen PMU mode xen: xensyms support xen: remove no longer needed p2m.h xen: allow more than 512 GB of RAM for 64 bit pv-domains xen: move p2m list if conflicting with e820 map xen: add explicit memblock_reserve() calls for special pages mm: provide early_memremap_ro to establish read-only mapping xen: check for initrd conflicting with e820 map xen: check pre-allocated page tables for conflict with memory map xen: check for kernel memory conflicting with memory layout ...	2015-09-08 11:46:48 -07:00
Julien Grall	5f51042f87	xen/xenbus: Rename the variable xen_store_mfn to xen_store_gfn The variable xen_store_mfn is effectively storing a GFN and not an MFN. Signed-off-by: Julien Grall <julien.grall@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-09-08 18:03:54 +01:00
Julien Grall	a13d7201d7	xen/privcmd: Further s/MFN/GFN/ clean-up The privcmd code is mixing the usage of GFN and MFN within the same functions which make the code difficult to understand when you only work with auto-translated guests. The privcmd driver is only dealing with GFN so replace all the mention of MFN into GFN. The ioctl structure used to map foreign change has been left unchanged given that the userspace is using it. Nonetheless, add a comment to explain the expected value within the "mfn" field. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-09-08 18:03:54 +01:00
Julien Grall	a76e3cc32d	xen/tmem: Use xen_page_to_gfn rather than pfn_to_gfn All the caller of xen_tmem_{get,put}_page have a struct page * in hand and call pfn_to_gfn for the only benefits of these 2 functions. Rather than passing the pfn in parameter, pass directly the page and use directly xen_page_to_gfn. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-09-08 18:03:52 +01:00
Julien Grall	0df4f266b3	xen: Use correctly the Xen memory terminologies Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN is meant, I suspect this is because the first support for Xen was for PV. This resulted in some misimplementation of helpers on ARM and confused developers about the expected behavior. For instance, with pfn_to_mfn, we expect to get an MFN based on the name. Although, if we look at the implementation on x86, it's returning a GFN. For clarity and avoid new confusion, replace any reference to mfn with gfn in any helpers used by PV drivers. The x86 code will still keep some reference of pfn_to_mfn which may be used by all kind of guests No changes as been made in the hypercall field, even though they may be invalid, in order to keep the same as the defintion in xen repo. Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a name to close to the KVM function gfn_to_page. Take also the opportunity to simplify simple construction such as pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex clean up will come in follow-up patches. [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-09-08 18:03:49 +01:00
Julien Grall	32e09870ee	xen: Make clear that swiotlb and biomerge are dealing with DMA address The swiotlb is required when programming a DMA address on ARM when a device is not protected by an IOMMU. In this case, the DMA address should always be equal to the machine address. For DOM0 memory, Xen ensure it by have an identity mapping between the guest address and host address. However, when mapping a foreign grant reference, the 1:1 model doesn't work. For ARM guest, most of the callers of pfn_to_mfn expects to get a GFN (Guest Frame Number), i.e a PFN (Page Frame Number) from the Linux point of view given that all ARM guest are auto-translated. Even though the name pfn_to_mfn is misleading, we need to ensure that those caller get a GFN and not by mistake a MFN. In pratical, I haven't seen error related to this but we should fix it for the sake of correctness. In order to fix the implementation of pfn_to_mfn on ARM in a follow-up patch, we have to introduce new helpers to return the DMA from a PFN and the invert. On x86, the new helpers will be an alias of pfn_to_mfn and mfn_to_pfn. The helpers will be used in swiotlb and xen_biovec_phys_mergeable. This is necessary in the latter because we have to ensure that the biovec code will not try to merge a biovec using foreign page and another using Linux memory. Lastly, the helper mfn_to_local_pfn has been renamed to bfn_to_local_pfn given that the only usage was in swiotlb. Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-09-08 17:10:52 +01:00
Juergen Gross	626d750866	xen: switch extra memory accounting to use pfns Instead of using physical addresses for accounting of extra memory areas available for ballooning switch to pfns as this is much less error prone regarding partial pages. Reported-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-09-08 16:28:06 +01:00
Linus Torvalds	ae98207309	Power management and ACPI material for v4.3-rc1 - ACPICA update to upstream revision 20150818 including method tracing extensions to allow more in-depth AML debugging in the kernel and a number of assorted fixes and cleanups (Bob Moore, Lv Zheng, Markus Elfring). - ACPI sysfs code updates and a documentation update related to AML method tracing (Lv Zheng). - ACPI EC driver fix related to serialized evaluations of _Qxx methods and ACPI tools updates allowing the EC userspace tool to be built from the kernel source (Lv Zheng). - ACPI processor driver updates preparing it for future introduction of CPPC support and ACPI PCC mailbox driver updates (Ashwin Chaugule). - ACPI interrupts enumeration fix for a regression related to the handling of IRQ attribute conflicts between MADT and the ACPI namespace (Jiang Liu). - Fixes related to ACPI device PM (Mika Westerberg, Srinidhi Kasagar). - ACPI device registration code reorganization to separate the sysfs-related code and bus type operations from the rest (Rafael J Wysocki). - Assorted cleanups in the ACPI core (Jarkko Nikula, Mathias Krause, Andy Shevchenko, Rafael J Wysocki, Nicolas Iooss). - ACPI cpufreq driver and ia64 cpufreq driver fixes and cleanups (Pan Xinhui, Rafael J Wysocki). - cpufreq core cleanups on top of the previous changes allowing it to preseve its sysfs directories over system suspend/resume (Viresh Kumar, Rafael J Wysocki, Sebastian Andrzej Siewior). - cpufreq fixes and cleanups related to governors (Viresh Kumar). - cpufreq updates (core and the cpufreq-dt driver) related to the turbo/boost mode support (Viresh Kumar, Bartlomiej Zolnierkiewicz). - New DT bindings for Operating Performance Points (OPP), support for them in the OPP framework and in the cpufreq-dt driver plus related OPP framework fixes and cleanups (Viresh Kumar). - cpufreq powernv driver updates (Shilpasri G Bhat). - New cpufreq driver for Mediatek MT8173 (Pi-Cheng Chen). - Assorted cpufreq driver (speedstep-lib, sfi, integrator) cleanups and fixes (Abhilash Jindal, Andrzej Hajda, Cristian Ardelean). - intel_pstate driver updates including Skylake-S support, support for enabling HW P-states per CPU and an additional vendor bypass list entry (Kristen Carlson Accardi, Chen Yu, Ethan Zhao). - cpuidle core fixes related to the handling of coupled idle states (Xunlei Pang). - intel_idle driver updates including Skylake Client support and support for freeze-mode-specific idle states (Len Brown). - Driver core updates related to power management (Andy Shevchenko, Rafael J Wysocki). - Generic power domains framework fixes and cleanups (Jon Hunter, Geert Uytterhoeven, Rajendra Nayak, Ulf Hansson). - Device PM QoS framework update to allow the latency tolerance setting to be exposed to user space via sysfs (Mika Westerberg). - devfreq support for PPMUv2 in Exynos5433 and a fix for an incorrect exynos-ppmu DT binding (Chanwoo Choi, Javier Martinez Canillas). - System sleep support updates (Alan Stern, Len Brown, SungEun Kim). - rockchip-io AVS support updates (Heiko Stuebner). - PM core clocks support fixup (Colin Ian King). - Power capping RAPL driver update including support for Skylake H/S and Broadwell-H (Radivoje Jovanovic, Seiichi Ikarashi). - Generic device properties framework fixes related to the handling of static (driver-provided) property sets (Andy Shevchenko). - turbostat and cpupower updates (Len Brown, Shilpasri G Bhat, Shreyas B Prabhu). / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJV5hhGAAoJEILEb/54YlRxs+EQAK51iFk48+IbpHYaZZ50Yo4m ZZc2zBcbwRcBlU9vKERrhG+jieSl8J/JJNxT8vBjKqyvNw038mCjewQh02ol0HuC R7nlDiVJkmZ50sLO4xwE/1UBZr/XqbddwCUnYzvFMkMTA0ePzFtf8BrJ1FXpT8S/ fkwSXQty6hvJDwxkfrbMSaA730wMju9lahx8D6MlmUAedWYZOJDMQKB4WKa/St5X 9uckBPHUBB2KiKlXxdbFPwKLNxHvLROq5SpDLc6cM/7XZB+QfNFy85CUjCUtYo1O 1W8k0qnztvZ6UEv27qz5dejGyAGOarMWGGNsmL9evoeGeHRpQL+dom7HcTnbAfUZ walyhYSm/zKkdy7Vl3xWUUQkMG48+PviMI6K0YhHXb3Rm5wlR/yBNZTwNIty9SX/ fKCHEa8QynWwLxgm53c3xRkiitJxMsHNK03moLD9zQMjshTyTNvpNbZoahyKQzk6 H+9M1DBRHhkkREDWSwGutukxfEMtWe2vcZcyERrFiY7l5k1j58DwDBMPqjPhRv6q P/1NlCzr0XYf83Y86J18LbDuPGDhTjjIEn6CqbtI2mmWqTg3+rF7zvS2ux+FzMnA gisv8l6GT9JiWhxKFqqL/rrVpwtyHebWLYE/RpNUW6fEzLziRNj1qyYO9dqI/GGi I3rfxlXoc/5xJWCgNB8f =fTgI -----END PGP SIGNATURE----- Merge tag 'pm+acpi-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI updates from Rafael Wysocki: "From the number of commits perspective, the biggest items are ACPICA and cpufreq changes with the latter taking the lead (over 50 commits). On the cpufreq front, there are many cleanups and minor fixes in the core and governors, driver updates etc. We also have a new cpufreq driver for Mediatek MT8173 chips. ACPICA mostly updates its debug infrastructure and adds a number of fixes and cleanups for a good measure. The Operating Performance Points (OPP) framework is updated with new DT bindings and support for them among other things. We have a few updates of the generic power domains framework and a reorganization of the ACPI device enumeration code and bus type operations. And a lot of fixes and cleanups all over. Included is one branch from the MFD tree as it contains some PM-related driver core and ACPI PM changes a few other commits are based on. Specifics: - ACPICA update to upstream revision 20150818 including method tracing extensions to allow more in-depth AML debugging in the kernel and a number of assorted fixes and cleanups (Bob Moore, Lv Zheng, Markus Elfring). - ACPI sysfs code updates and a documentation update related to AML method tracing (Lv Zheng). - ACPI EC driver fix related to serialized evaluations of _Qxx methods and ACPI tools updates allowing the EC userspace tool to be built from the kernel source (Lv Zheng). - ACPI processor driver updates preparing it for future introduction of CPPC support and ACPI PCC mailbox driver updates (Ashwin Chaugule). - ACPI interrupts enumeration fix for a regression related to the handling of IRQ attribute conflicts between MADT and the ACPI namespace (Jiang Liu). - Fixes related to ACPI device PM (Mika Westerberg, Srinidhi Kasagar). - ACPI device registration code reorganization to separate the sysfs-related code and bus type operations from the rest (Rafael J Wysocki). - Assorted cleanups in the ACPI core (Jarkko Nikula, Mathias Krause, Andy Shevchenko, Rafael J Wysocki, Nicolas Iooss). - ACPI cpufreq driver and ia64 cpufreq driver fixes and cleanups (Pan Xinhui, Rafael J Wysocki). - cpufreq core cleanups on top of the previous changes allowing it to preseve its sysfs directories over system suspend/resume (Viresh Kumar, Rafael J Wysocki, Sebastian Andrzej Siewior). - cpufreq fixes and cleanups related to governors (Viresh Kumar). - cpufreq updates (core and the cpufreq-dt driver) related to the turbo/boost mode support (Viresh Kumar, Bartlomiej Zolnierkiewicz). - New DT bindings for Operating Performance Points (OPP), support for them in the OPP framework and in the cpufreq-dt driver plus related OPP framework fixes and cleanups (Viresh Kumar). - cpufreq powernv driver updates (Shilpasri G Bhat). - New cpufreq driver for Mediatek MT8173 (Pi-Cheng Chen). - Assorted cpufreq driver (speedstep-lib, sfi, integrator) cleanups and fixes (Abhilash Jindal, Andrzej Hajda, Cristian Ardelean). - intel_pstate driver updates including Skylake-S support, support for enabling HW P-states per CPU and an additional vendor bypass list entry (Kristen Carlson Accardi, Chen Yu, Ethan Zhao). - cpuidle core fixes related to the handling of coupled idle states (Xunlei Pang). - intel_idle driver updates including Skylake Client support and support for freeze-mode-specific idle states (Len Brown). - Driver core updates related to power management (Andy Shevchenko, Rafael J Wysocki). - Generic power domains framework fixes and cleanups (Jon Hunter, Geert Uytterhoeven, Rajendra Nayak, Ulf Hansson). - Device PM QoS framework update to allow the latency tolerance setting to be exposed to user space via sysfs (Mika Westerberg). - devfreq support for PPMUv2 in Exynos5433 and a fix for an incorrect exynos-ppmu DT binding (Chanwoo Choi, Javier Martinez Canillas). - System sleep support updates (Alan Stern, Len Brown, SungEun Kim). - rockchip-io AVS support updates (Heiko Stuebner). - PM core clocks support fixup (Colin Ian King). - Power capping RAPL driver update including support for Skylake H/S and Broadwell-H (Radivoje Jovanovic, Seiichi Ikarashi). - Generic device properties framework fixes related to the handling of static (driver-provided) property sets (Andy Shevchenko). - turbostat and cpupower updates (Len Brown, Shilpasri G Bhat, Shreyas B Prabhu)" * tag 'pm+acpi-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (180 commits) cpufreq: speedstep-lib: Use monotonic clock cpufreq: powernv: Increase the verbosity of OCC console messages cpufreq: sfi: use kmemdup rather than duplicating its implementation cpufreq: drop !cpufreq_driver check from cpufreq_parse_governor() cpufreq: rename cpufreq_real_policy as cpufreq_user_policy cpufreq: remove redundant 'policy' field from user_policy cpufreq: remove redundant 'governor' field from user_policy cpufreq: update user_policy.* on success cpufreq: use memcpy() to copy policy cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event cpufreq: mediatek: Add MT8173 cpufreq driver dt-bindings: mediatek: Add MT8173 CPU DVFS clock bindings PM / Domains: Fix typo in description of genpd_dev_pm_detach() PM / Domains: Remove unusable governor dummies PM / Domains: Make pm_genpd_init() available to modules PM / domains: Align column headers and data in pm_genpd_summary output powercap / RAPL: disable the 2nd power limit properly tools: cpupower: Fix error when running cpupower monitor PM / OPP: Drop unlikely before IS_ERR(_OR_NULL) PM / OPP: Fix static checker warning (broken 64bit big endian systems) ...	2015-09-01 19:45:46 -07:00
Linus Torvalds	43af9872f5	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 apic updates from Thomas Gleixner: "This udpate contains: - rework the irq vector array to store a pointer to the irq descriptor instead of the irq number to avoid a lookup of the irq descriptor in the irq entry path - lguest interrupt handling cleanups - conversion of the local apic timer to the new clockevent callbacks - preparatory changes for the irq argument removal of interrupt flow handlers" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/irq: Do not dereference irq descriptor before checking it tools/lguest: Clean up include dir tools/lguest: Fix redefinition of struct virtio_pci_cfg_cap x86/irq: Store irq descriptor in vector array genirq: Provide irq_desc_has_action x86/irq: Get rid of an indentation level x86/irq: Rename VECTOR_UNDEFINED to VECTOR_UNUSED x86/irq: Replace numeric constant x86/irq: Protect smp_cleanup_move x86/lguest: Do not setup unused irq vectors x86/lguest: Clean up lguest_setup_irq x86/apic: Drop local_irq_save/restore in timer callbacks x86/apic: Migrate apic timer to new set_state interface x86/irq: Use access helper irq_data_get_affinity_mask() x86/irq: Use accessor irq_data_get_irq_handler_data() x86/irq: Use accessor irq_data_get_node()	2015-09-01 15:20:51 -07:00
Rafael J. Wysocki	4ffe18c255	Merge branch 'pm-cpufreq' * pm-cpufreq: (53 commits) cpufreq: speedstep-lib: Use monotonic clock cpufreq: powernv: Increase the verbosity of OCC console messages cpufreq: sfi: use kmemdup rather than duplicating its implementation cpufreq: drop !cpufreq_driver check from cpufreq_parse_governor() cpufreq: rename cpufreq_real_policy as cpufreq_user_policy cpufreq: remove redundant 'policy' field from user_policy cpufreq: remove redundant 'governor' field from user_policy cpufreq: update user_policy.* on success cpufreq: use memcpy() to copy policy cpufreq: remove redundant CPUFREQ_INCOMPATIBLE notifier event cpufreq: mediatek: Add MT8173 cpufreq driver dt-bindings: mediatek: Add MT8173 CPU DVFS clock bindings intel_pstate: append more Oracle OEM table id to vendor bypass list intel_pstate: Add SKY-S support intel_pstate: Fix possible overflow complained by Coverity cpufreq: Correct a freq check in cpufreq_set_policy() cpufreq: Lock CPU online/offline in cpufreq_register_driver() cpufreq: Replace recover_policy with new_policy in cpufreq_online() cpufreq: Separate CPU device registration from CPU online cpufreq: powernv: Restore cpu frequency to policy->cur on unthrottling ...	2015-09-01 15:52:35 +02:00
Linus Torvalds	a1d8561172	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The biggest change in this cycle is the rewrite of the main SMP load balancing metric: the CPU load/utilization. The main goal was to make the metric more precise and more representative - see the changelog of this commit for the gory details: `9d89c257df` ("sched/fair: Rewrite runnable load and utilization average tracking") It is done in a way that significantly reduces complexity of the code: 5 files changed, 249 insertions(+), 494 deletions(-) and the performance testing results are encouraging. Nevertheless we need to keep an eye on potential regressions, since this potentially affects every SMP workload in existence. This work comes from Yuyang Du. Other changes: - SCHED_DL updates. (Andrea Parri) - Simplify architecture callbacks by removing finish_arch_switch(). (Peter Zijlstra et al) - cputime accounting: guarantee stime + utime == rtime. (Peter Zijlstra) - optimize idle CPU wakeups some more - inspired by Facebook server loads. (Mike Galbraith) - stop_machine fixes and updates. (Oleg Nesterov) - Introduce the 'trace_sched_waking' tracepoint. (Peter Zijlstra) - sched/numa tweaks. (Srikar Dronamraju) - misc fixes and small cleanups" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits) sched/deadline: Fix comment in enqueue_task_dl() sched/deadline: Fix comment in push_dl_tasks() sched: Change the sched_class::set_cpus_allowed() calling context sched: Make sched_class::set_cpus_allowed() unconditional sched: Fix a race between __kthread_bind() and sched_setaffinity() sched: Ensure a task has a non-normalized vruntime when returning back to CFS sched/numa: Fix NUMA_DIRECT topology identification tile: Reorganize _switch_to() sched, sparc32: Update scheduler comments in copy_thread() sched: Remove finish_arch_switch() sched, tile: Remove finish_arch_switch sched, sh: Fold finish_arch_switch() into switch_to() sched, score: Remove finish_arch_switch() sched, avr32: Remove finish_arch_switch() sched, MIPS: Get rid of finish_arch_switch() sched, arm: Remove finish_arch_switch() sched/fair: Clean up load average references sched/fair: Provide runnable_load_avg back to cfs_rq sched/fair: Remove task and group entity load when they are dead sched/fair: Init cfs_rq's sched_entity load average ...	2015-08-31 20:26:22 -07:00
Boris Ostrovsky	5f14154882	xen/PMU: Sysfs interface for setting Xen PMU mode Set Xen's PMU mode via /sys/hypervisor/pmu/pmu_mode. Add XENPMU hypercall. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:26 +01:00
Boris Ostrovsky	a11f4f0a4e	xen: xensyms support Export Xen symbols to dom0 via /proc/xen/xensyms (similar to /proc/kallsyms). Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:25 +01:00
Julien Grall	4a5b69464e	xen/events: Support event channel rebind on ARM Currently, the event channel rebind code is gated with the presence of the vector callback. The virtual interrupt controller on ARM has the concept of per-CPU interrupt (PPI) which allow us to support per-VCPU event channel. Therefore there is no need of vector callback for ARM. Xen is already using a free PPI to notify the guest VCPU of an event. Furthermore, the xen code initialization in Linux (see arch/arm/xen/enlighten.c) is requesting correctly a per-CPU IRQ. Introduce new helper xen_support_evtchn_rebind to allow architecture decide whether rebind an event is support or not. It will always return true on ARM and keep the same behavior on x86. This is also allow us to drop the usage of xen_have_vector_callback entirely in the ARM code. Signed-off-by: Julien Grall <julien.grall@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:15 +01:00
Konstantin Khlebnikov	a7da51ae10	xen/preempt: use need_resched() instead of should_resched() This code is used only when CONFIG_PREEMPT=n and only in non-atomic context: xen_in_preemptible_hcall is set only in privcmd_ioctl_hypercall(). Thus preempt_count is zero and should_resched() is equal to need_resched(). Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-20 12:24:14 +01:00
Linus Torvalds	6b476e1140	xen: bug fixes for 4.2-rc6 - Revert a fix from 4.2-rc5 that was causing lots of WARNING spam. - Fix a memory leak affecting backends in HVM guests. - Fix PV domU hang with certain configurations. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJVzHsAAAoJEFxbo/MsZsTR9Y0H/2j1PHt29RPcNdgGQ84AH0Wh tw1emL8rMcdhWQnsO7bNmywNNvRNQnU3ZJ8dzoq+5GPikNsbfQzYc7U2pIL4A+gB AAJsNDNzecuq4srk8vNxcmZ7ySvm9w6dccDUex2ge3sNWaq6gzSQvz6FSWiL0Sxg k3JcnemEg6JrYOTWdxKInAORMcRO6rgx9eIsdPUPOpgC5XLg6/mZOqBAWXIksDvs V9uCMqQicaUgBgKFIOSllqH6fcCNooRu3aDwNNj/2mMcJmEvMeBkHmNlQgEm2j5L ubdDyrC5y48TUPJm8i3+W2/AY+kgWzhThcqyVy6LRAAj5RItJFxMf0nMXzIEqlQ= =UgMy -----END PGP SIGNATURE----- Merge tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: - revert a fix from 4.2-rc5 that was causing lots of WARNING spam. - fix a memory leak affecting backends in HVM guests. - fix PV domU hang with certain configurations. * tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/xenbus: Don't leak memory when unmapping the ring on HVM backend Revert "xen/events/fifo: Handle linked events when closing a port" x86/xen: build "Xen PV" APIC driver for domU as well	2015-08-13 13:36:22 -07:00
Julien Grall	c22fe519e7	xen/xenbus: Don't leak memory when unmapping the ring on HVM backend The commit `ccc9d90a9a` "xenbus_client: Extend interface to support multi-page ring" removes the call to free_xenballooned_pages() in xenbus_unmap_ring_vfree_hvm(), leaking a page for every shared ring. Only with backends running in HVM domains were affected. Signed-off-by: Julien Grall <julien.grall@citrix.com> Cc: <stable@vger.kernel.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-11 11:05:58 +01:00
David Vrabel	ad6cd7bafc	Revert "xen/events/fifo: Handle linked events when closing a port" This reverts commit `fcdf31a7c1`. This was causing a WARNING whenever a PIRQ was closed since shutdown_pirq() is called with irqs disabled. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: <stable@vger.kernel.org>	2015-08-11 11:05:42 +01:00
Linus Torvalds	1ddc6dd855	xen: bug fixes for 4.2-rc5 - Don't lose interrupts when offlining CPUs. - Fix gntdev oops during unmap. - Drop the balloon lock occasionally to allow domain create/destroy. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJVwNFHAAoJEFxbo/MsZsTRZ0oH/1PBpbABz2iK7SZc75paORkm BwVlSTn1Z6ftmRjC8r5BS5KHOrQRsDf8cZnDptCIWpm+uWTXAfeZP5HaH1bA6qMy d+T7d9QMlC5nsCxebfduXYl4AHl7fupblBi3y8CmrJdVW0aASyL7roAtkSS23rXl ND3288juhI6E0Y5kch3b7yip5vjJtKFR7Mw3RkAZO5ihdx30NCYMdqBRBFcKT0Tp s901VXo+87ZSutY12BcToqCwr9E0y2oBdkDIQ5Q7rUlqKM1ifLVOYbx4sxXvNQtk 3WYcAMuBrwfcBP00xYHt18ozEaJxQ2bTOokhZ//2p6LLnhzo0ZiZcxNRX3jda8A= =niy/ -----END PGP SIGNATURE----- Merge tag 'for-linus-4.2-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen bug fixes from David Vrabel: - don't lose interrupts when offlining CPUs - fix gntdev oops during unmap - drop the balloon lock occasionally to allow domain create/destroy * tag 'for-linus-4.2-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/events/fifo: Handle linked events when closing a port xen: release lock occasionally during ballooning xen/gntdevt: Fix race condition in gntdev_release()	2015-08-04 08:49:08 -07:00
Ross Lagerwall	fcdf31a7c1	xen/events/fifo: Handle linked events when closing a port An event channel bound to a CPU that was offlined may still be linked on that CPU's queue. If this event channel is closed and reused, subsequent events will be lost because the event channel is never unlinked and thus cannot be linked onto the correct queue. When a channel is closed and the event is still linked into a queue, ensure that it is unlinked before completing. If the CPU to which the event channel bound is online, spin until the event is handled by that CPU. If that CPU is offline, it can't handle the event, so clear the event queue during the close, dropping the events. This fixes the missing interrupts (and subsequent disk stalls etc.) when offlining a CPU. Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-08-04 15:41:59 +01:00
Konstantin Khlebnikov	0fa2f5cb2b	sched/preempt, xen: Use need_resched() instead of should_resched() This code is used only when CONFIG_PREEMPT=n and only in non-atomic context: xen_in_preemptible_hcall is set only in privcmd_ioctl_hypercall(). Thus preempt_count is zero and should_resched() is equal to need_resched(). Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Graf <agraf@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150715095201.12246.49283.stgit@buzz Signed-off-by: Ingo Molnar <mingo@kernel.org>	2015-08-03 12:21:23 +02:00
Rafael J. Wysocki	b2f8dc4ce6	ACPI / processor: Drop an unused argument of a cleanup routine acpi_processor_unregister_performance() actually doesn't use its first argument, so drop it and update the callers accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>	2015-07-22 22:11:16 +02:00
Juergen Gross	929423fa83	xen: release lock occasionally during ballooning When dom0 is being ballooned balloon_process() will hold the balloon mutex until it is finished. This will block e.g. creation of new domains as the device backends for the new domain need some autoballooned pages for the ring buffers. Avoid this by releasing the balloon mutex from time to time during ballooning. Adjust the comment above balloon_process() regarding multiple instances of balloon_process(). Instead of open coding it, just use cond_resched(). Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2015-07-20 13:37:05 +01:00

1 2 3 4 5 ...

1344 Commits