linux

Author	SHA1	Message	Date
Tahsin Erdogan	b410aff2bd	block: do not allow updates through sysfs until registration completes When a new disk shows up, sysfs queue directory is created before elevator is registered. This allows a user to attempt a scheduler switch even though the initial registration hasn't completed yet. In one scenario, blk_register_queue() calls elv_register_queue() and right before cfq_registered_queue() is called, another process executes elevator_switch() and replaces q->elevator with deadline scheduler. When cfq_registered_queue() executes it interprets e->elevator_data as struct cfq_data even though it is actually struct deadline_data. Grab q->sysfs_lock in blk_register_queue() to synchronize with sysfs callers. Signed-off-by: Tahsin Erdogan <tahsin@google.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-02-15 08:40:04 -07:00
Yinghai Lu	afe3e4d11b	PCI/PME: Restore pcie_pme_driver.remove In addition to making PME non-modular, `d7def20400` ("PCI/PME: Make explicitly non-modular") removed the pcie_pme_driver .remove() method, pcie_pme_remove(). pcie_pme_remove() freed the PME IRQ that was requested in pci_pme_probe(). The fact that we don't free the IRQ after `d7def20400` causes the following crash when removing a PCIe port device via /sys: ------------[ cut here ]------------ kernel BUG at drivers/pci/msi.c:370! invalid opcode: 0000 [#1] SMP Modules linked in: CPU: 1 PID: 14509 Comm: sh Tainted: G W 4.8.0-rc1-yh-00012-gd29438d RIP: 0010:[<ffffffff9758bbf5>] free_msi_irqs+0x65/0x190 ... Call Trace: [<ffffffff9758cda4>] pci_disable_msi+0x34/0x40 [<ffffffff97583817>] cleanup_service_irqs+0x27/0x30 [<ffffffff97583e9a>] pcie_port_device_remove+0x2a/0x40 [<ffffffff97584250>] pcie_portdrv_remove+0x40/0x50 [<ffffffff97576d7b>] pci_device_remove+0x4b/0xc0 [<ffffffff9785ebe6>] __device_release_driver+0xb6/0x150 [<ffffffff9785eca5>] device_release_driver+0x25/0x40 [<ffffffff975702e4>] pci_stop_bus_device+0x74/0xa0 [<ffffffff975704ea>] pci_stop_and_remove_bus_device_locked+0x1a/0x30 [<ffffffff97578810>] remove_store+0x50/0x70 [<ffffffff9785a378>] dev_attr_store+0x18/0x30 [<ffffffff97260b64>] sysfs_kf_write+0x44/0x60 [<ffffffff9725feae>] kernfs_fop_write+0x10e/0x190 [<ffffffff971e13f8>] __vfs_write+0x28/0x110 [<ffffffff970b0fa4>] ? percpu_down_read+0x44/0x80 [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0 [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0 [<ffffffff971e1f04>] vfs_write+0xc4/0x180 [<ffffffff971e3089>] SyS_write+0x49/0xa0 [<ffffffff97001a46>] do_syscall_64+0xa6/0x1b0 [<ffffffff9819201e>] entry_SYSCALL64_slow_path+0x25/0x25 ... RIP [<ffffffff9758bbf5>] free_msi_irqs+0x65/0x190 RSP <ffff89ad3085bc48> ---[ end trace f4505e1dac5b95d3 ]--- Segmentation fault Restore pcie_pme_remove(). [bhelgaas: changelog] Fixes: `d7def20400` ("PCI/PME: Make explicitly non-modular") Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> CC: stable@vger.kernel.org # v4.9+	2017-02-15 09:39:32 -06:00
Uma Shankar	06a20d2d2d	drm/i915: Fix PLL 8x/3 divider for MIPI video mode MIPI Video Mode for high res panels (requiring dual link), need a 8X/3 divider to be programmed as 0x2. Modifying the same in this patch. Signed-off-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Vidya Srinivas <vidya.srinivas@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1486551058-22596-3-git-send-email-vidya.srinivas@intel.com	2017-02-15 17:32:57 +02:00
Masami Hiramatsu	bef5da6059	tracing/probes: Fix a warning message to show correct maximum length Since tracing/*probe_events will accept a probe definition up to 4096 - 2 ('\n' and '\0') bytes, it must show 4094 instead of 4096 in warning message. Note that there is one possible case of exceed 4094. If user prepare 4096 bytes null-terminated string and syscall write it with the count == 4095, then it can be accepted. However, if user puts a '\n' after that, it must rejected. So IMHO, the warning message should indicate shorter one, since it is safer. Link: http://lkml.kernel.org/r/148673290462.2579.7966778294009665632.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2017-02-15 10:32:48 -05:00
Uma Shankar	645a2f6e03	drm/i915: Check for platform specific GPIO config Panel GPIO control should be done based on platform. Add a check to restrict VLV and CHT specific GPIO confirguration, so that they dont apply to other platforms. The VBT spec fails to mention the PMIC backlight control option is valid only for VLV/CHT, and the field may be set to "PMIC" for BXT even if PMIC is not desired or possible. Signed-off-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Vidya Srinivas <vidya.srinivas@intel.com> [Jani: amended commit message a bit and fixed indentation.] Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1486551058-22596-2-git-send-email-vidya.srinivas@intel.com	2017-02-15 17:32:26 +02:00
Matias Bjørling	6732c74010	lightnvm: set default lun range when no luns are specified The create target ioctl takes a lun begin and lun end parameter, which defines the range of luns to initialize a target with. If the user does not set the parameters, it default to only using lun 0. Instead, defaults to use all luns in the OCSSD, as it is the usual behaviour users want. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-02-15 08:27:21 -07:00
Matias Bjørling	0e5ffd1cb5	lightnvm: fix off-by-one error on target initialization If one specifies the end lun id to be the absolute number of luns, without taking zero indexing into account, the lightnvm core will pass the off-by-one end lun id to target creation, which then panics during nvm_ioctl_dev_create. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-02-15 08:27:19 -07:00
Manasi Navare	d7e8ef02a6	drm/i915/dp: Reset the link params on HPD/connected boot/resume The max link parameters should be set/reset only on HPD or connected boot case or on system resume. Add a flag reset_link_params to intel_dp to decide when to reset the max link parameters. This prevents the parameters from getting reset/overwritten through all other connector->funcs->detect() calls. This is important when link training fails and the max link params are modified to the lower fallback values. Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1486515251-23469-1-git-send-email-manasi.d.navare@intel.com	2017-02-15 17:24:10 +02:00
Ralph Sennhauser	04be87a83d	ARM: dts: armada-385-linksys: fix DSA compatible property The switch to the new DSA binding used "marvell,mv88e6095" for the compatible property which doesn't exist, use "marvell,mv88e6085" instead. Fixes: `455b82f03f` ("ARM: dts: armada-385-linksys: Utilize new DSA binding") Signed-off-by: Ralph Sennhauser <ralph.sennhauser@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>	2017-02-15 15:54:13 +01:00
Rob Herring	b85ad49409	of: introduce of_graph_get_remote_node The OF graph API leaves too much of the graph walking to clients when in many cases the driver doesn't care about accessing the port or endpoint nodes. The drivers typically just want the device connected via a particular graph connection. of_graph_get_remote_node provides this functionality. Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Philipp Zabel <p.zabel@pengutronix.de>	2017-02-15 08:53:32 -06:00
Moni Shoua	6df6b4a9ce	IB/cma: Destination and source addr families must match The destination address in a listening rdma_id does not have an address family. Since address family in both sides of a connection must be the same in rdma_bind_addr() we set the address family of the destination to the address family of the source. This patch serves the logic in cma_port_is_unique() which requires to know if destination address that is associated with a rdma_id is any address (cma_zero_addr() and cma_loopback_addr()). This can happen when port reuse is checked for a port number that is being listened to. Fixes: `19b752a19d` ("IB/cma: Allow port reuse for rdma_id") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-15 09:51:33 -05:00
Majd Dibbiny	89052d784b	IB/cma: Add default RoCE TOS to CMA configfs Add new entry to the RDMA-CM configfs that allows users to select default TOS for RDMA-CM QPs. This is useful for users that want to control the TOS for legacy applications without changing their code. Application that sets the TOS explicitly using the rdma_set_option API will continue to work as expected, meaning overriding the configfs value. CC: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-15 09:51:28 -05:00
Parav Pandit	5903960840	IB/core: Remove pointer casting from void to net_device This patch avoids unnecessary type casting from void to net_device. CC: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-15 09:51:28 -05:00
Erez Shitrit	2b0841766a	IB/IPoIB: Add destination address when re-queue packet When sending packet to destination that was not resolved yet via path query, the driver keeps the skb and tries to re-send it again when the path is resolved. But when re-sending via dev_queue_xmit the kernel doesn't call to dev_hard_header, so IPoIB needs to keep 20 bytes in the skb and to put the destination address inside them. In that way the dev_start_xmit will have the correct destination, and the driver won't take the destination from the skb->data, while nothing exists there, which causes to packet be be dropped. The test flow is: 1. Run the SM on remote node, 2. Restart the driver. 4. Ping some destination, 3. Observe that first ICMP request will be dropped. Fixes: `fc791b6335` ("IB/ipoib: move back IB LL address into the hard header") Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Tested-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-15 09:51:28 -05:00
Chris Packham	e7d08bcfe9	ARM: dts: Fix typo in armada-xp-98dx4251 The compatible should be 98dx4251 not 98dx4521. Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>	2017-02-15 15:33:44 +01:00
Eli Cohen	cdbe33d0f8	IB/mlx5: Fix configuration of port capabilities When the "ib_virt" cap is set, configuration of port capabilities need to be done through mlx5_core_modify_hca_vport_context. Since modify_hca_vport_context accepts mask and value, there is no need to read the port capabilities and calculate the new cap values so we avoid the mutex when ib_virt is set. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>	2017-02-15 09:29:37 -05:00
David Hildenbrand	681bcea802	KVM: svm: inititalize hash table structures directly The hashtable and guarding spinlock are global data structures, we can inititalize them statically. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170124212116.4568-1-david@redhat.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 15:26:06 +01:00
Arnaldo Carvalho de Melo	34a0548f01	perf tools: Add missing parse_events_error() prototype As pointed out by clang, we were not providing a prototype for a function before using it: util/parse-events.y:699:6: error: conflicting types for 'parse_events_error' void parse_events_error(YYLTYPE loc, void data, ^ /tmp/build/perf/util/parse-events-bison.c:2224:7: note: previous implicit declaration is here yyerror (&yylloc, _data, scanner, YY_("syntax error")); ^ /tmp/build/perf/util/parse-events-bison.c:65:25: note: expanded from macro 'yyerror' #define yyerror parse_events_error 1 error generated. One line fix it. Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170215130605.GC4020@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-02-15 11:20:49 -03:00
Wei Yongjun	8f0994bb8c	tracing: Fix return value check in trace_benchmark_reg() In case of error, the function kthread_run() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Link: http://lkml.kernel.org/r/20170112135502.28556-1-weiyj.lk@gmail.com Cc: stable@vger.kernel.org Fixes: `81dc9f0e` ("tracing: Add tracepoint benchmark tracepoint") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2017-02-15 09:02:27 -05:00
Arnd Bergmann	eb583cd484	tracing: Use modern function declaration We get a lot of harmless warnings about this header file at W=1 level because of an unusual function declaration: kernel/trace/trace.h:766:1: error: 'inline' is not at beginning of declaration [-Werror=old-style-declaration] This moves the inline statement where it normally belongs, avoiding the warning. Link: http://lkml.kernel.org/r/20170123122521.3389010-1-arnd@arndb.de Fixes: `4046bf023b` ("ftrace: Expose ftrace_hash_empty and ftrace_lookup_ip") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2017-02-15 09:02:27 -05:00
Jason Baron	3821fd35b5	jump_label: Reduce the size of struct static_key The static_key->next field goes mostly unused. The field is used for associating module uses with a static key. Most uses of struct static_key define a static key in the core kernel and make use of it entirely within the core kernel, or define the static key in a module and make use of it only from within that module. In fact, of the ~3,000 static keys defined, I found only about 5 or so that did not fit this pattern. Thus, we can remove the static_key->next field entirely and overload the static_key->entries field. That is, when all the static_key uses are contained within the same module, static_key->entries continues to point to those uses. However, if the static_key uses are not contained within the module where the static_key is defined, then we allocate a struct static_key_mod, store a pointer to the uses within that struct static_key_mod, and have the static key point at the static_key_mod. This does incur some extra memory usage when a static_key is used in a module that does not define it, but since there are only a handful of such cases there is a net savings. In order to identify if the static_key->entries pointer contains a struct static_key_mod or a struct jump_entry pointer, bit 1 of static_key->entries is set to 1 if it points to a struct static_key_mod and is 0 if it points to a struct jump_entry. We were already using bit 0 in a similar way to store the initial value of the static_key. This does mean that allocations of struct static_key_mod and that the struct jump_entry tables need to be at least 4-byte aligned in memory. As far as I can tell all arches meet this criteria. For my .config, the patch increased the text by 778 bytes, but reduced the data + bss size by 14912, for a net savings of 14,134 bytes. text data bss dec hex filename 8092427 5016512 790528 13899467 d416cb vmlinux.pre 8093205 5001600 790528 13885333 d3df95 vmlinux.post Link: http://lkml.kernel.org/r/1486154544-4321-1-git-send-email-jbaron@akamai.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Joe Perches <joe@perches.com> Signed-off-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2017-02-15 09:02:26 -05:00
Masami Hiramatsu	7257634135	tracing/probe: Show subsystem name in messages Show "trace_probe:", "trace_kprobe:" and "trace_uprobe:" headers for each warning/error/info message. This will help people to notice that kprobe/uprobe events caused those messages. Link: http://lkml.kernel.org/r/148646647813.24658.16705315294927615333.stgit@devbox Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2017-02-15 09:02:25 -05:00
Luiz Capitulino	8e0f11429a	tracing/hwlat: Update old comment about migration The ftrace hwlat does support a cpumask. Link: http://lkml.kernel.org/r/20170213122517.6e211955@redhat.com Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2017-02-15 09:02:25 -05:00
Thomas Gleixner	8a58a34ba4	timers: Make flags output in the timer_start tracepoint useful The timer flags in the timer_start trace event contain lots of useful information, but the meaning is not clear in the trace output. Making tools rely on the bit positions is bad as they might change over time. Decode the flags in the print out. Tools can retrieve the bits and their meaning from the trace format file. Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1702101639290.4036@nanos Requested-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2017-02-15 09:02:24 -05:00
Kalle Valo	b065d3f59f	Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git ath.git patches for 4.11. Major changes: ath10k * when trying older firmware versions don't confuse user with error messages ath9k * fix crash in AP mode (regression) * fix relayfs crash (regression) * fix initialisation with AR9340 and AR9550	2017-02-15 16:01:04 +02:00
Steven Rostedt (VMware)	1f9b3546cf	tracing: Have traceprobe_probes_write() not access userspace unnecessarily The code in traceprobe_probes_write() reads up to 4096 bytes from userpace for each line. If userspace passes in several lines to execute, the code will do a large read for each line, even though, it is highly likely that the first read from userspace received all of the lines at once. I changed the logic to do a single read from userspace, and to only read from userspace again if not all of the read from userspace made it in. I tested this by adding printk()s and writing files that would test -1, ==, and +1 the buffer size, to make sure that there's no overflows and that if a single line is written with +1 the buffer size, that it fails properly. Link: http://lkml.kernel.org/r/20170209180458.5c829ab2@gandalf.local.home Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>	2017-02-15 09:00:55 -05:00
Jim Mattson	858e25c06f	kvm: nVMX: Refactor nested_vmx_run() Nested_vmx_run is split into two parts: the part that handles the VMLAUNCH/VMRESUME instruction, and the part that modifies the vcpu state to transition from VMX root mode to VMX non-root mode. The latter will be used when restoring the checkpointed state of a vCPU that was in VMX operation when a snapshot was taken. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 14:56:36 +01:00
Jim Mattson	ca0bde28f2	kvm: nVMX: Split VMCS checks from nested_vmx_run() The checks performed on the contents of the vmcs12 are extracted from nested_vmx_run so that they can be used to validate a vmcs12 that has been restored from a checkpoint. Signed-off-by: Jim Mattson <jmattson@google.com> [Change prepare_vmcs02 and nested_vmx_load_cr3's last argument to u32, to match check_vmentry_postreqs. Update comments for singlestep handling. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 14:56:35 +01:00
Jim Mattson	6beb7bd52e	kvm: nVMX: Refactor nested_get_vmcs12_pages() Perform the checks on vmcs12 state early, but defer the gpa->hpa lookups until after prepare_vmcs02. Later, when we restore the checkpointed state of a vCPU in guest mode, we will not be able to do the gpa->hpa lookups when the restore is done. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 14:56:11 +01:00
Jim Mattson	a8bc284eb7	kvm: nVMX: Refactor handle_vmptrld() Handle_vmptrld is split into two parts: the part that handles the VMPTRLD instruction, and the part that establishes the current VMCS pointer. The latter will be used when restoring the checkpointed state of a vCPU that had a valid VMCS pointer when a snapshot was taken. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 14:54:37 +01:00
Jim Mattson	e29acc55bf	kvm: nVMX: Refactor handle_vmon() Handle_vmon is split into two parts: the part that handles the VMXON instruction, and the part that modifies the vcpu state to transition from legacy mode to VMX operation. The latter will be used when restoring the checkpointed state of a vCPU that was in VMX operation when a snapshot was taken. Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 14:54:37 +01:00
Jim Mattson	cf8b84f48a	kvm: nVMX: Prepare for checkpointing L2 state Split prepare_vmcs12 into two parts: the part that stores the current L2 guest state and the part that sets up the exit information fields. The former will be used when checkpointing the vCPU's VMX state. Modify prepare_vmcs02 so that it can construct a vmcs02 midway through L2 execution, using the checkpointed L2 guest state saved into the cached vmcs12 above. Signed-off-by: Jim Mattson <jmattson@google.com> [Rebasing: add from_vmentry argument to prepare_vmcs02 instead of using vmx->nested.nested_run_pending, because it is no longer 1 at the point prepare_vmcs02 is called. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 14:54:36 +01:00
Paolo Bonzini	b95234c840	kvm: x86: do not use KVM_REQ_EVENT for APICv interrupt injection Since `bf9f6ac8d7` ("KVM: Update Posted-Interrupts Descriptor when vCPU is blocked", 2015-09-18) the posted interrupt descriptor is checked unconditionally for PIR.ON. Therefore we don't need KVM_REQ_EVENT to trigger the scan and, if NMIs or SMIs are not involved, we can avoid the complicated event injection path. Calling kvm_vcpu_kick if PIR.ON=1 is also useless, though it has been there since APICv was introduced. However, without the KVM_REQ_EVENT safety net KVM needs to be much more careful about races between vmx_deliver_posted_interrupt and vcpu_enter_guest. First, the IPI for posted interrupts may be issued between setting vcpu->mode = IN_GUEST_MODE and disabling interrupts. If that happens, kvm_trigger_posted_interrupt returns true, but smp_kvm_posted_intr_ipi doesn't do anything about it. The guest is entered with PIR.ON, but the posted interrupt IPI has not been sent and the interrupt is only delivered to the guest on the next vmentry (if any). To fix this, disable interrupts before setting vcpu->mode. This ensures that the IPI is delayed until the guest enters non-root mode; it is then trapped by the processor causing the interrupt to be injected. Second, the IPI may be issued between kvm_x86_ops->sync_pir_to_irr(vcpu) and vcpu->mode = IN_GUEST_MODE. In this case, kvm_vcpu_kick is called but it (correctly) doesn't do anything because it sees vcpu->mode == OUTSIDE_GUEST_MODE. Again, the guest is entered with PIR.ON but no posted interrupt IPI is pending; this time, the fix for this is to move the RVI update after IN_GUEST_MODE. Both issues were mostly masked by the liberal usage of KVM_REQ_EVENT, though the second could actually happen with VT-d posted interrupts. In both race scenarios KVM_REQ_EVENT would cancel guest entry, resulting in another vmentry which would inject the interrupt. This saves about 300 cycles on the self_ipi_* tests of vmexit.flat. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 14:54:36 +01:00
Paolo Bonzini	76dfafd536	KVM: x86: do not scan IRR twice on APICv vmentry Calls to apic_find_highest_irr are scanning IRR twice, once in vmx_sync_pir_from_irr and once in apic_search_irr. Change sync_pir_from_irr to get the new maximum IRR from kvm_apic_update_irr; now that it does the computation, it can also do the RVI write. In order to avoid complications in svm.c, make the callback optional. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 14:54:35 +01:00
Paolo Bonzini	3d92789f69	KVM: vmx: move sync_pir_to_irr from apic_find_highest_irr to callers Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 14:54:34 +01:00
Paolo Bonzini	810e6defcc	KVM: x86: preparatory changes for APICv cleanups Add return value to __kvm_apic_update_irr/kvm_apic_update_irr. Move vmx_sync_pir_to_irr around. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 14:54:34 +01:00
Paolo Bonzini	0ad3bed6c5	kvm: nVMX: move nested events check to kvm_vcpu_running vcpu_run calls kvm_vcpu_running, not kvm_arch_vcpu_runnable, and the former does not call check_nested_events. Once KVM_REQ_EVENT is removed from the APICv interrupt injection path, however, this would leave no place to trigger a vmexit from L2 to L1, causing a missed interrupt delivery while in guest mode. This is caught by the "ack interrupt on exit" test in vmx.flat. [This does not change the calls to check_nested_events in inject_pending_event. That is material for a separate cleanup.] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 14:54:33 +01:00
Paolo Bonzini	967235d320	KVM: vmx: clear pending interrupts on KVM_SET_LAPIC Pending interrupts might be in the PI descriptor when the LAPIC is restored from an external state; we do not want them to be injected. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 14:54:33 +01:00
Paolo Bonzini	db1c056cee	kvm: vmx: Use the hardware provided GPA instead of page walk As in the SVM patch, the guest physical address is passed by VMX to x86_emulate_instruction already, so mark the GPA as available in vcpu->arch. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-15 14:54:32 +01:00
Maarten Lankhorst	1592364de3	drm: Resurrect atomic rmfb code, v3 This was somehow lost between v3 and the merged version in Maarten's patch merged as: commit `f2d580b9a8` Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Date: Wed May 4 14:38:26 2016 +0200 drm/core: Do not preserve framebuffer on rmfb, v4. This introduces a slight behavioral change to rmfb. Instead of disabling a crtc when the primary plane is disabled, we try to preserve it. Apart from old versions of the vmwgfx xorg driver, there is nothing depending on rmfb disabling a crtc. Since vmwgfx is a legacy driver we can safely only disable the plane with atomic. If this commit is rejected by the driver then we will still fall back to the old behavior and turn off the crtc. v2: - Remove plane->fb assignment, done by drm_atomic_clean_old_fb. - Add WARN_ON when atomic_remove_fb fails. - Always call drm_atomic_state_put. v3: - Use drm_drv_uses_atomic_modeset - Handle the case where the first plane-disable-only commit fails with -EINVAL. Some drivers do not support this, fall back to disabling all crtc's in this case. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/66fc3da5-697b-1613-0a67-a5293209f0dc@linux.intel.com	2017-02-15 15:40:57 +02:00
Arnaldo Carvalho de Melo	b30a7d1fc9	perf pmu: Fix check for unset alias->unit array The alias->unit field is an array, so to check that it is not set we should see if it is an empty string, i.e. alias->unit[0], instead of checking alias->unit != NULL, as this will _always_ evaluate to 'true'. Pointed out by clang. Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/20170214182435.GD4458@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-02-15 10:06:20 -03:00
Mark Rutland	ffe7afd171	arm64/kprobes: consistently handle MRS/MSR with XZR Now that we have XZR-safe helpers for fiddling with registers, use these in the arm64 kprobes code rather than open-coding the logic. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-02-15 12:20:29 +00:00
Mark Rutland	521c646108	arm64: cpufeature: correctly handle MRS to XZR In emulate_mrs() we may erroneously write back to the user SP rather than XZR if we trap an MRS instruction where Xt == 31. Use the new pt_regs_write_reg() helper to handle this correctly. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Fixes: `77c97b4ee2` ("arm64: cpufeature: Expose CPUID registers by emulation") Cc: Andre Przywara <andre.przywara@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-02-15 12:20:29 +00:00
Mark Rutland	8b6e70fccf	arm64: traps: correctly handle MRS/MSR with XZR Currently we hand-roll XZR-safe register handling in user_cache_maint_handler(), though we forget to do the same in ctr_read_handler(), and may erroneously write back to the user SP rather than XZR. Use the new helpers to handle these cases correctly and consistently. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Fixes: `116c81f427` ("arm64: Work around systems with mismatched cache line sizes") Cc: Andre Przywara <andre.przywara@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-02-15 12:20:29 +00:00
Mark Rutland	6c23e2ff70	arm64: ptrace: add XZR-safe regs accessors In A64, XZR and the SP share the same encoding (31), and whether an instruction accesses XZR or SP for a particular register parameter depends on the definition of the instruction. We store the SP in pt_regs::regs[31], and thus when emulating instructions, we must be careful to not erroneously read from or write back to the saved SP. Unfortunately, we often fail to be this careful. In all cases, instructions using a transfer register parameter Xt use this to refer to XZR rather than SP. This patch adds helpers so that we can more easily and consistently handle these cases. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-02-15 12:20:29 +00:00
Arnd Bergmann	f705d95463	arm64: include asm/assembler.h in entry-ftrace.S In a randconfig build I ran into this build error: arch/arm64/kernel/entry-ftrace.S: Assembler messages: arch/arm64/kernel/entry-ftrace.S:101: Error: unknown mnemonic `ldr_l' -- `ldr_l x2,ftrace_trace_function' The macro is defined in asm/assembler.h, so we should include that file. Fixes: `829d2bd133` ("arm64: entry-ftrace.S: avoid open-coded {adr,ldr}_l") Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-02-15 11:34:25 +00:00
Arnd Bergmann	12f043ff2b	arm64: fix warning about swapper_pg_dir overflow With 4 levels of 16KB pages, we get this warning about the fact that we are copying a whole page into an array that is declared as having only two pointers for the top level of the page table: arch/arm64/mm/mmu.c: In function 'paging_init': arch/arm64/mm/mmu.c:528:2: error: 'memcpy' writing 16384 bytes into a region of size 16 overflows the destination [-Werror=stringop-overflow=] This is harmless since we actually reserve a whole page in the definition of the array that comes from, and just the extern declaration is short. The pgdir is initialized to zero either way, so copying the actual entries here seems like the best solution. Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Will Deacon <will.deacon@arm.com>	2017-02-15 11:32:18 +00:00
Paolo Bonzini	ee10689117	Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD This brings in two fixes for potential host crashes, from Ben Herrenschmidt and Nick Piggin.	2017-02-15 12:30:20 +01:00
Rashmica Gupta	10d4cf188a	powerpc/asm: Define STACK_PT_REGS_OFFSET macro in asm-offsets.c There are quite a few entries in asm-offests.c which look like: DEFINE(REG, STACK_FRAME_OVERHEAD+offsetof(struct pt_regs, reg)); So define a macro to do it once. Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> [mpe: Rename to STACK_PT_REGS_OFFSET for excruciating explicitness] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-02-15 21:47:51 +11:00
Sergey Senozhatsky	f222449c9d	timekeeping: Use deferred printk() in debug code We cannot do printk() from tk_debug_account_sleep_time(), because tk_debug_account_sleep_time() is called under tk_core seq lock. The reason why printk() is unsafe there is that console_sem may invoke scheduler (up()->wake_up_process()->activate_task()), which, in turn, can return back to timekeeping code, for instance, via get_time()->ktime_get(), deadlocking the system on tk_core seq lock. [ 48.950592] ====================================================== [ 48.950622] [ INFO: possible circular locking dependency detected ] [ 48.950622] 4.10.0-rc7-next-20170213+ #101 Not tainted [ 48.950622] ------------------------------------------------------- [ 48.950622] kworker/0:0/3 is trying to acquire lock: [ 48.950653] (tk_core){----..}, at: [<c01cc624>] retrigger_next_event+0x4c/0x90 [ 48.950683] but task is already holding lock: [ 48.950683] (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90 [ 48.950714] which lock already depends on the new lock. [ 48.950714] the existing dependency chain (in reverse order) is: [ 48.950714] -> #5 (hrtimer_bases.lock){-.-...}: [ 48.950744] _raw_spin_lock_irqsave+0x50/0x64 [ 48.950775] lock_hrtimer_base+0x28/0x58 [ 48.950775] hrtimer_start_range_ns+0x20/0x5c8 [ 48.950775] __enqueue_rt_entity+0x320/0x360 [ 48.950805] enqueue_rt_entity+0x2c/0x44 [ 48.950805] enqueue_task_rt+0x24/0x94 [ 48.950836] ttwu_do_activate+0x54/0xc0 [ 48.950836] try_to_wake_up+0x248/0x5c8 [ 48.950836] __setup_irq+0x420/0x5f0 [ 48.950836] request_threaded_irq+0xdc/0x184 [ 48.950866] devm_request_threaded_irq+0x58/0xa4 [ 48.950866] omap_i2c_probe+0x530/0x6a0 [ 48.950897] platform_drv_probe+0x50/0xb0 [ 48.950897] driver_probe_device+0x1f8/0x2cc [ 48.950897] __driver_attach+0xc0/0xc4 [ 48.950927] bus_for_each_dev+0x6c/0xa0 [ 48.950927] bus_add_driver+0x100/0x210 [ 48.950927] driver_register+0x78/0xf4 [ 48.950958] do_one_initcall+0x3c/0x16c [ 48.950958] kernel_init_freeable+0x20c/0x2d8 [ 48.950958] kernel_init+0x8/0x110 [ 48.950988] ret_from_fork+0x14/0x24 [ 48.950988] -> #4 (&rt_b->rt_runtime_lock){-.-...}: [ 48.951019] _raw_spin_lock+0x40/0x50 [ 48.951019] rq_offline_rt+0x9c/0x2bc [ 48.951019] set_rq_offline.part.2+0x2c/0x58 [ 48.951049] rq_attach_root+0x134/0x144 [ 48.951049] cpu_attach_domain+0x18c/0x6f4 [ 48.951049] build_sched_domains+0xba4/0xd80 [ 48.951080] sched_init_smp+0x68/0x10c [ 48.951080] kernel_init_freeable+0x160/0x2d8 [ 48.951080] kernel_init+0x8/0x110 [ 48.951080] ret_from_fork+0x14/0x24 [ 48.951110] -> #3 (&rq->lock){-.-.-.}: [ 48.951110] _raw_spin_lock+0x40/0x50 [ 48.951141] task_fork_fair+0x30/0x124 [ 48.951141] sched_fork+0x194/0x2e0 [ 48.951141] copy_process.part.5+0x448/0x1a20 [ 48.951171] _do_fork+0x98/0x7e8 [ 48.951171] kernel_thread+0x2c/0x34 [ 48.951171] rest_init+0x1c/0x18c [ 48.951202] start_kernel+0x35c/0x3d4 [ 48.951202] 0x8000807c [ 48.951202] -> #2 (&p->pi_lock){-.-.-.}: [ 48.951232] _raw_spin_lock_irqsave+0x50/0x64 [ 48.951232] try_to_wake_up+0x30/0x5c8 [ 48.951232] up+0x4c/0x60 [ 48.951263] __up_console_sem+0x2c/0x58 [ 48.951263] console_unlock+0x3b4/0x650 [ 48.951263] vprintk_emit+0x270/0x474 [ 48.951293] vprintk_default+0x20/0x28 [ 48.951293] printk+0x20/0x30 [ 48.951324] kauditd_hold_skb+0x94/0xb8 [ 48.951324] kauditd_thread+0x1a4/0x56c [ 48.951324] kthread+0x104/0x148 [ 48.951354] ret_from_fork+0x14/0x24 [ 48.951354] -> #1 ((console_sem).lock){-.....}: [ 48.951385] _raw_spin_lock_irqsave+0x50/0x64 [ 48.951385] down_trylock+0xc/0x2c [ 48.951385] __down_trylock_console_sem+0x24/0x80 [ 48.951385] console_trylock+0x10/0x8c [ 48.951416] vprintk_emit+0x264/0x474 [ 48.951416] vprintk_default+0x20/0x28 [ 48.951416] printk+0x20/0x30 [ 48.951446] tk_debug_account_sleep_time+0x5c/0x70 [ 48.951446] __timekeeping_inject_sleeptime.constprop.3+0x170/0x1a0 [ 48.951446] timekeeping_resume+0x218/0x23c [ 48.951477] syscore_resume+0x94/0x42c [ 48.951477] suspend_enter+0x554/0x9b4 [ 48.951477] suspend_devices_and_enter+0xd8/0x4b4 [ 48.951507] enter_state+0x934/0xbd4 [ 48.951507] pm_suspend+0x14/0x70 [ 48.951507] state_store+0x68/0xc8 [ 48.951538] kernfs_fop_write+0xf4/0x1f8 [ 48.951538] __vfs_write+0x1c/0x114 [ 48.951538] vfs_write+0xa0/0x168 [ 48.951568] SyS_write+0x3c/0x90 [ 48.951568] __sys_trace_return+0x0/0x10 [ 48.951568] -> #0 (tk_core){----..}: [ 48.951599] lock_acquire+0xe0/0x294 [ 48.951599] ktime_get_update_offsets_now+0x5c/0x1d4 [ 48.951629] retrigger_next_event+0x4c/0x90 [ 48.951629] on_each_cpu+0x40/0x7c [ 48.951629] clock_was_set_work+0x14/0x20 [ 48.951660] process_one_work+0x2b4/0x808 [ 48.951660] worker_thread+0x3c/0x550 [ 48.951660] kthread+0x104/0x148 [ 48.951690] ret_from_fork+0x14/0x24 [ 48.951690] other info that might help us debug this: [ 48.951690] Chain exists of: tk_core --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock [ 48.951721] Possible unsafe locking scenario: [ 48.951721] CPU0 CPU1 [ 48.951721] ---- ---- [ 48.951721] lock(hrtimer_bases.lock); [ 48.951751] lock(&rt_b->rt_runtime_lock); [ 48.951751] lock(hrtimer_bases.lock); [ 48.951751] lock(tk_core); [ 48.951782] * DEADLOCK * [ 48.951782] 3 locks held by kworker/0:0/3: [ 48.951782] #0: ("events"){.+.+.+}, at: [<c0156590>] process_one_work+0x1f8/0x808 [ 48.951812] #1: (hrtimer_work){+.+...}, at: [<c0156590>] process_one_work+0x1f8/0x808 [ 48.951843] #2: (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90 [ 48.951843] stack backtrace: [ 48.951873] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.10.0-rc7-next-20170213+ [ 48.951904] Workqueue: events clock_was_set_work [ 48.951904] [<c0110208>] (unwind_backtrace) from [<c010c224>] (show_stack+0x10/0x14) [ 48.951934] [<c010c224>] (show_stack) from [<c04ca6c0>] (dump_stack+0xac/0xe0) [ 48.951934] [<c04ca6c0>] (dump_stack) from [<c019b5cc>] (print_circular_bug+0x1d0/0x308) [ 48.951965] [<c019b5cc>] (print_circular_bug) from [<c019d2a8>] (validate_chain+0xf50/0x1324) [ 48.951965] [<c019d2a8>] (validate_chain) from [<c019ec18>] (__lock_acquire+0x468/0x7e8) [ 48.951995] [<c019ec18>] (__lock_acquire) from [<c019f634>] (lock_acquire+0xe0/0x294) [ 48.951995] [<c019f634>] (lock_acquire) from [<c01d0ea0>] (ktime_get_update_offsets_now+0x5c/0x1d4) [ 48.952026] [<c01d0ea0>] (ktime_get_update_offsets_now) from [<c01cc624>] (retrigger_next_event+0x4c/0x90) [ 48.952026] [<c01cc624>] (retrigger_next_event) from [<c01e4e24>] (on_each_cpu+0x40/0x7c) [ 48.952056] [<c01e4e24>] (on_each_cpu) from [<c01cafc4>] (clock_was_set_work+0x14/0x20) [ 48.952056] [<c01cafc4>] (clock_was_set_work) from [<c015664c>] (process_one_work+0x2b4/0x808) [ 48.952087] [<c015664c>] (process_one_work) from [<c0157774>] (worker_thread+0x3c/0x550) [ 48.952087] [<c0157774>] (worker_thread) from [<c015d644>] (kthread+0x104/0x148) [ 48.952087] [<c015d644>] (kthread) from [<c0107830>] (ret_from_fork+0x14/0x24) Replace printk() with printk_deferred(), which does not call into the scheduler. Fixes: `0bf43f15db` ("timekeeping: Prints the amounts of time spent during suspend") Reported-and-tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: John Stultz <john.stultz@linaro.org> Cc: "[4.9+]" <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2017-02-15 11:46:08 +01:00

... 122 123 124 125 126 ...

665443 Commits