linux

Author	SHA1	Message	Date
Tejun Heo	81c173cb5e	kernfs: remove KERNFS_REMOVED KERNFS_REMOVED is used to mark half-initialized and dying nodes so that they don't show up in lookups and deny adding new nodes under or renaming it; however, its role overlaps that of deactivation. It's necessary to deny addition of new children while removal is in progress; however, this role considerably intersects with deactivation - KERNFS_REMOVED prevents new children while deactivation prevents new file operations. There's no reason to have them separate making things more complex than necessary. This patch removes KERNFS_REMOVED. * Instead of KERNFS_REMOVED, each node now starts its life deactivated. This means that we now use both atomic_add() and atomic_sub() on KN_DEACTIVATED_BIAS, which is INT_MIN. The compiler generates overflow warnings when negating INT_MIN as the negation can't be represented as a positive number. Nothing is actually broken but let's bump BIAS by one to avoid the warnings for archs which negate the subtrahend. * A new helper kernfs_active() which tests whether kn->active >= 0 is added for convenience and lockdep annotation. All KERNFS_REMOVED tests are replaced with negated kernfs_active() tests. * __kernfs_remove() is updated to deactivate, but not drain, all nodes in the subtree instead of setting KERNFS_REMOVED. This removes deactivation from kernfs_deactivate(), which is now renamed to kernfs_drain(). * Sanity check on KERNFS_REMOVED in kernfs_put() is replaced with checks on the active ref. * Some comment style updates in the affected area. v2: Reordered before removal path restructuring. kernfs_active() dropped and kernfs_get/put_active() used instead. RB_EMPTY_NODE() used in the lookup paths. v3: Reverted most of v2 except for creating a new node with KN_DEACTIVATED_BIAS. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 15:42:41 -08:00
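The bias scheme described in 81c173cb5e can be illustrated with a minimal userspace sketch. It assumes a plain int in place of the kernel's atomic_t, and the names demo_node, demo_init(), demo_active(), demo_activate() and demo_deactivate() are hypothetical; only KN_DEACTIVATED_BIAS and the "active >= 0" test are taken from the commit message.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* INT_MIN + 1 rather than INT_MIN, so that -BIAS is representable as an int. */
#define KN_DEACTIVATED_BIAS (INT_MIN + 1)

/* Hypothetical stand-in for a kernfs node; the kernel uses atomic_t here. */
struct demo_node {
	int active;
};

/* Every node starts its life deactivated. */
static void demo_init(struct demo_node *n)
{
	n->active = KN_DEACTIVATED_BIAS;
}

/* Mirrors the test described for kernfs_active(): active >= 0. */
static bool demo_active(const struct demo_node *n)
{
	return n->active >= 0;
}

/* Subtracting the bias turns a deactivated node into an active one. */
static void demo_activate(struct demo_node *n)
{
	assert(!demo_active(n));
	n->active -= KN_DEACTIVATED_BIAS;
}

/* Adding the bias deactivates the node; references already held still count. */
static void demo_deactivate(struct demo_node *n)
{
	assert(demo_active(n));
	n->active += KN_DEACTIVATED_BIAS;
}
```

In this sketch a node with two outstanding references goes negative as soon as it is deactivated, and returns to exactly KN_DEACTIVATED_BIAS once both references are dropped, which is the condition a drain step would wait for.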
Tejun Heo	182fd64b66	kernfs: remove KERNFS_ACTIVE_REF and add kernfs_lockdep() There currently are two mechanisms gating active ref lockdep annotations - KERNFS_LOCKDEP flag and KERNFS_ACTIVE_REF type mask. The former disables lockdep annotations in kernfs_get/put_active() while the latter disables all of kernfs_deactivate(). While KERNFS_ACTIVE_REF also behaves as an optimization to skip the deactivation step for non-file nodes, the benefit is marginal and it needlessly diverges code paths. Let's drop KERNFS_ACTIVE_REF. While at it, add a test helper kernfs_lockdep() to test KERNFS_LOCKDEP flag so that it's more convenient and the related code can be compiled out when not enabled. v2: Refreshed on top of ("kernfs: make kernfs_deactivate() honor KERNFS_LOCKDEP flag"). As the earlier patch already added KERNFS_LOCKDEP tests to kernfs_deactivate(), those additions are dropped from this patch and the existing ones are simply converted to kernfs_lockdep(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 15:42:40 -08:00
Tejun Heo	988cd7afb3	kernfs: remove kernfs_addrm_cxt kernfs_addrm_cxt and the accompanying kernfs_addrm_start/finish() were added because there were operations which should be performed outside kernfs_mutex after adding and removing kernfs_nodes. The necessary operations were recorded in kernfs_addrm_cxt and performed by kernfs_addrm_finish(); however, after the recent changes which relocated deactivation and unmapping so that they're performed directly during removal, the only operation kernfs_addrm_finish() performs is kernfs_put(), which can be moved inside the removal path too. This patch moves the kernfs_put() of the base ref to __kernfs_remove() and remove kernfs_addrm_cxt and kernfs_addrm_start/finish(). * kernfs_add_one() is updated to grab and release kernfs_mutex itself. sysfs_addrm_start/finish() invocations around it are removed from all users. * __kernfs_remove() puts an unlinked node directly instead of chaining it to kernfs_addrm_cxt. Its callers are updated to grab and release kernfs_mutex instead of calling kernfs_addrm_start/finish() around it. v2: Rebased on top of "kernfs: associate a new kernfs_node with its parent on creation" which dropped @parent from kernfs_add_one(). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 15:42:40 -08:00
Tejun Heo	abd54f028e	kernfs: replace kernfs_node->u.completion with kernfs_root->deactivate_waitq kernfs_node->u.completion is used to notify deactivation completion from kernfs_put_active() to kernfs_deactivate(). We now allow multiple racing removals of the same node and the current removal scheme is no longer correct - kernfs_remove() invocation may return before the node is properly deactivated if it races against another removal. The removal path will be restructured to address the issue. To help such restructure which requires supporting multiple waiters, this patch replaces kernfs_node->u.completion with kernfs_root->deactivate_waitq. This makes deactivation event notifications share a per-root waitqueue_head; however, the wait path is quite cold and this will also allow shaving one pointer off kernfs_node. v2: Refreshed on top of ("kernfs: make kernfs_deactivate() honor KERNFS_LOCKDEP flag"). Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 15:42:40 -08:00
David Fries	ac8f73305e	connector: add portid to unicast in addition to broadcasting This allows replying only to the requestor portid while still supporting broadcasting. Pass 0 to portid for the previous behavior. Signed-off-by: David Fries <David@Fries.net> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 15:40:17 -08:00
Sudeep Dutt	3b1cc9b962	misc: mic: fix possible signed underflow (undefined behavior) in userspace API iovcnt is declared as a signed integer in both the userspace API and as a local variable in mic_virtio.c. The while() loop in mic_virtio.c iterates until the local variable iovcnt reaches the value 0. If userspace passes e.g. INT_MIN as iovcnt field, this loop then appears to depend on an undefined behavior (signed underflow) to complete. The fix is to use unsigned integers in both the userspace API and the local variable. This issue was reported @ https://lkml.org/lkml/2014/1/10/10 Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 15:30:34 -08:00
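The loop shape that 3b1cc9b962 describes can be sketched as follows. This is an illustration only, not the mic_virtio.c code: demo_iovec and demo_total_len() are hypothetical names. With a signed counter, a negative input would have to be decremented past INT_MIN before reaching 0, which is undefined behavior; an unsigned counter cannot be negative and its arithmetic is fully defined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical I/O vector entry, for illustration only. */
struct demo_iovec {
	size_t len;
};

/*
 * Sum the lengths of iovcnt entries. The counter is unsigned, so the
 * "iterate until it reaches 0" loop never relies on signed underflow.
 */
static size_t demo_total_len(const struct demo_iovec *iov, unsigned int iovcnt)
{
	size_t total = 0;

	assert(iov != NULL || iovcnt == 0);
	while (iovcnt) {
		total += iov->len;
		iov++;
		iovcnt--;
	}
	return total;
}
```

A real interface would presumably also bound iovcnt against the number of entries actually supplied; that check is outside what the commit message states.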
K. Y. Srinivasan	8a7206a89f	Drivers: hv: vmbus: Support per-channel driver state As we implement Virtual Receive Side Scaling on the networking side (the VRSS patches are currently under review), it will be useful to have per-channel state that vmbus drivers can manage. Add support for managing per-channel state. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 15:22:40 -08:00
K. Y. Srinivasan	011a7c3cc3	Drivers: hv: vmbus: Cleanup the packet send path The current channel code is using scatterlist abstraction to pass data to the ringbuffer API on the send path. This causes unnecessary translations between virtual and physical addresses. Fix this. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 15:22:40 -08:00
K. Y. Srinivasan	90f3453585	Drivers: hv: vmbus: Extract the mmio information from DSDT On Gen2 firmware, Hyper-V does not emulate the PCI bus. However, the MMIO information is packaged up in DSDT. Extract this information and export it for use by the synthetic framebuffer driver. This is the only driver that needs this currently. In this version of the patch mmio, I have updated the hyperv header file (linux/hyperv.h) with mmio definitions. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 15:21:48 -08:00
Bjarke Istrup Pedersen	5267cf02c7	hv: Add hyperv.h to uapi headers This patch adds the hyperv.h header to the uapi folder, and adds it to the Kbuild file. Doing this enables compiling userspace Hyper-V tools using the installed headers. Version 2: Split UAPI parts into new header, instead of duplicating. Signed-off-by: Bjarke Istrup Pedersen <gurligebis@gentoo.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 15:18:31 -08:00
Sarah Sharp	3d4b81eda2	Revert "usb: xhci: Link TRB must not occur within a USB payload burst" This reverts commit `35773dac5f`. It's a hack that caused regressions in the usb-storage and userspace USB drivers that use usbfs and libusb. Commit 70cabb7d992f "xhci 1.0: Limit arbitrarily-aligned scatter gather." should fix the issues seen with the ax88179_178a driver on xHCI 1.0 hosts, without causing regressions. Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Cc: stable@vger.kernel.org # 3.12	2014-02-07 14:30:03 -08:00
H. Peter Anvin	a3b072cd18	Merge tag 'efi-urgent' into x86/urgent * Avoid WARN_ON() when mapping BGRT on Baytrail (EFI 32-bit). Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-02-07 11:27:30 -08:00
Christoph Hellwig	72a0a36e28	blk-mq: support at_head insertions for blk_execute_rq This is needed for proper SG_IO operation as well as various uses of blk_execute_rq from the SCSI midlayer. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2014-02-07 11:58:54 -07:00
Valentina Manea	b7945b77cd	staging: usbip: convert usbip-host driver to usb_device_driver This driver was previously an interface driver. Since USB/IP exports a whole device, not just an interface, it would make sense to be a device driver. This patch also modifies the way userspace sees and uses a shared device: * the usbip_status file is no longer created for interface 0, but for the whole device (such as /sys/devices/pci0000:00/0000:00:01.2/usb1/1-1/usbip_status). * per interface information, such as interface class or protocol, is no longer sent/received; only device specific information is transmitted. * since the driver was moved one level below in the USB architecture, there is no need to bind/unbind each interface, just the device as a whole. Signed-off-by: Valentina Manea <valentina.manea.m@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 10:54:30 -08:00
Pavel Machek	91eef3e2fe	staging/bluetooth: Add hci_h4p driver Add hci_h4p bluetooth driver to staging tree. This device is used for example on Nokia N900 cell phone. Signed-off-by: Pali Rohár <pali.rohar@gmail.com> Signed-off-by: Pavel Machek <pavel@ucw.cz> Thanks-to: Sebastian Reichel <sre@debian.org> Thanks-to: Joe Perches <joe@perches.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 10:13:53 -08:00
Roger Pau Monne	80bfa2f6e2	xen-blkif: drop struct blkif_request_segment_aligned This was wrongly introduced in commit `402b27f9`, the only difference between blkif_request_segment_aligned and blkif_request_segment is that the former has a named padding, while both share the same memory layout. Also correct a few minor glitches in the description, including for it to no longer assume PAGE_SIZE == 4096. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> [Description fix by Jan Beulich] Signed-off-by: Jan Beulich <jbeulich@suse.com> Reported-by: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Matt Rushton <mrushton@amazon.com> Cc: Matt Wilson <msw@amazon.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-02-07 13:03:53 -05:00
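The claim in 80bfa2f6e2 that the two segment structures share a memory layout, differing only in a named padding field, can be checked with a small sketch. The struct definitions below are assumptions written for illustration (demo_segment, demo_segment_aligned and demo_same_layout() are hypothetical names), and the result depends on the usual ABI rule that a struct is padded to the alignment of its widest member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment descriptor with implicit tail padding. */
struct demo_segment {
	uint32_t gref;
	uint8_t first_sect, last_sect;
};

/* Same fields, with the tail padding spelled out as a named member. */
struct demo_segment_aligned {
	uint32_t gref;
	uint8_t first_sect, last_sect;
	uint16_t _pad;
};

/* Returns 1 when size and field offsets of the two structs agree. */
static int demo_same_layout(void)
{
	assert(sizeof(uint32_t) == 4);
	return sizeof(struct demo_segment) == sizeof(struct demo_segment_aligned) &&
	       offsetof(struct demo_segment, gref) ==
	       offsetof(struct demo_segment_aligned, gref) &&
	       offsetof(struct demo_segment, first_sect) ==
	       offsetof(struct demo_segment_aligned, first_sect) &&
	       offsetof(struct demo_segment, last_sect) ==
	       offsetof(struct demo_segment_aligned, last_sect);
}
```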
K. Y. Srinivasan	e28bab4828	Drivers: hv: vmbus: Specify the target CPU that should receive notification During the initial VMBUS connect phase, starting with WS2012 R2, we should specify the VCPU in the guest that should receive the notification. Fix this issue. This fix is required to properly connect to the host in the kexeced kernel. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: <stable@vger.kernel.org> [3.9+] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-02-07 08:27:34 -08:00
Thomas Gleixner	f1689bb7ab	time: Fixup fallout from recent clockevent/tick changes Make the stub function static inline instead of static and move the clockevents related function into the proper ifdeffed section. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Soren Brinkmann <soren.brinkmann@xilinx.com> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>	2014-02-07 16:00:46 +01:00
Preeti U Murthy	5d1638acb9	tick: Introduce hrtimer based broadcast On some architectures, in certain CPU deep idle states the local timers stop. An external clock device is used to wakeup these CPUs. The kernel support for the wakeup of these CPUs is provided by the tick broadcast framework by using the external clock device as the wakeup source. However not all implementations of architectures provide such an external clock device. This patch includes support in the broadcast framework to handle the wakeup of the CPUs in deep idle states on such systems by queuing a hrtimer on one of the CPUs, which is meant to handle the wakeup of CPUs in deep idle states. This patchset introduces a pseudo clock device which can be registered by the archs as tick_broadcast_device in the absence of a real external clock device. Once registered, the broadcast framework will work as is for these architectures as long as the archs take care of the BROADCAST_ENTER notification failing for one of the CPUs. This CPU is made the stand by CPU to handle wakeup of the CPUs in deep idle and it must not enter deep idle states. The CPU with the earliest wakeup is chosen to be this CPU. Hence this way the stand by CPU dynamically moves around and so does the hrtimer which is queued to trigger at the next earliest wakeup time. This is consistent with the case where an external clock device is present. The smp affinity of this clock device is set to the CPU with the earliest wakeup. This patchset handles the hotplug of the stand by CPU as well by moving the hrtimer on to the CPU handling the CPU_DEAD notification. 
Originally-from: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: deepthi@linux.vnet.ibm.com Cc: paulmck@linux.vnet.ibm.com Cc: fweisbec@gmail.com Cc: paulus@samba.org Cc: srivatsa.bhat@linux.vnet.ibm.com Cc: svaidy@linux.vnet.ibm.com Cc: peterz@infradead.org Cc: benh@kernel.crashing.org Cc: rafael.j.wysocki@intel.com Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20140207080632.17187.80532.stgit@preeti.in.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-07 15:34:29 +01:00
Preeti U Murthy	da7e6f45c3	time: Change the return type of clockevents_notify() to integer The broadcast framework can potentially be made use of by archs which do not have an external clock device as well. Then, it is required that one of the CPUs need to handle the broadcasting of wakeup IPIs to the CPUs in deep idle. As a result its local timers should remain functional all the time. For such a CPU, the BROADCAST_ENTER notification has to fail indicating that its clock device cannot be shutdown. To make way for this support, change the return type of tick_broadcast_oneshot_control() and hence clockevents_notify() to indicate such scenarios. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: deepthi@linux.vnet.ibm.com Cc: paulmck@linux.vnet.ibm.com Cc: fweisbec@gmail.com Cc: paulus@samba.org Cc: srivatsa.bhat@linux.vnet.ibm.com Cc: svaidy@linux.vnet.ibm.com Cc: peterz@infradead.org Cc: benh@kernel.crashing.org Cc: rafael.j.wysocki@intel.com Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20140207080606.17187.78306.stgit@preeti.in.ibm.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-02-07 15:34:29 +01:00
Steven Whitehouse	44aaada9d1	GFS2: Add meta readahead field in directory entries The intent of this new field in the directory entry is to allow a subsequent lookup to know how many blocks, which are contiguous with the inode, contain metadata which relates to the inode. This will then allow the issuing of a single read to read these blocks, rather than reading the inode first, and then issuing a second read for the metadata. This only works under some fairly strict conditions, since we do not have back pointers from inodes to directory entries we must ensure that the blocks referenced in this way will always belong to the inode. This rules out being able to use this system for indirect blocks, as these can change as a result of truncate/rewrite. So the idea here is to restrict this to xattr blocks only for the time being. For most inodes, that means only a single block. Also, when using ACLs and/or SELinux or other LSMs, these will be added at inode creation time so that they will be contiguous with the inode on disk and also will almost always be needed when we read the inode in for permissions checks. Once an xattr block for an inode is allocated, it will never change until the inode is deallocated. This patch adds the new field, a further patch will add the readahead in due course. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2014-02-07 11:23:22 +00:00
Mauro Carvalho Chehab	37e59f876b	[media, edac] Change my email address There are several left overs with my old email address. Remove their occurrences and add myself at CREDITS, to allow people to be able to reach me on my new addresses. Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>	2014-02-07 08:03:07 -02:00
Philipp Zabel	ef70bbe1aa	gpio: make gpiod_direction_output take a logical value The documentation was not clear about whether gpio_direction_output should take a logical value or the physical level on the output line, i.e. whether the ACTIVE_LOW status would be taken into account. This converts gpiod_direction_output to use the logical level and adds a new gpiod_direction_output_raw for the raw value. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2014-02-07 09:47:02 +01:00
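The distinction ef70bbe1aa draws between a logical value and the physical line level can be sketched in a few lines. This is not the gpiolib implementation: demo_gpio, demo_direction_output() and demo_direction_output_raw() are hypothetical stand-ins, and the only behavior assumed is that ACTIVE_LOW inverts a logical value before it reaches the line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical GPIO descriptor: a polarity flag and the driven line level. */
struct demo_gpio {
	bool active_low;
	int level;
};

/* Raw variant: the value is the physical level, polarity is ignored. */
static void demo_direction_output_raw(struct demo_gpio *g, int value)
{
	assert(g != NULL);
	g->level = !!value;
}

/* Logical variant: an active-low line is driven low to assert it. */
static void demo_direction_output(struct demo_gpio *g, int value)
{
	assert(g != NULL);
	if (g->active_low)
		value = !value;
	demo_direction_output_raw(g, value);
}
```

On an active-low line, asserting the logical value 1 leaves the line at level 0, while the raw call with 1 drives it to level 1.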
Eli Cohen	78c0f98cc9	IB/mlx5: Fix binary compatibility with libmlx5 Commit `c1be5232d2` ("Fix micro UAR allocator") broke binary compatibility between libmlx5 and mlx5_ib since it defines a different value for the number of micro UARs per page, leading to wrong calculation in libmlx5. This patch defines struct mlx5_ib_alloc_ucontext_req_v2 as an extension to struct mlx5_ib_alloc_ucontext_req. The extended size is determined in mlx5_ib_alloc_ucontext() and in case of old library we use uuarn 0 which works fine -- this is achieved due to create_user_qp() falling back from high to medium then to low class where low class will return 0. For new libraries we use the more sophisticated allocation algorithm. Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-02-06 23:00:48 -08:00
Jan Moskyto Matejka	ee262ad827	inet: defines IPPROTO_* needed for module alias generation Commit `cfd280c912` ("net: sync some IP headers with glibc") changed a set of define's to an enum (with no explanation why) which introduced a bug in module mip6 where aliases are generated using the IPPROTO_* defines; mip6 doesn't load if require_module called with the aliases from xfrm_get_type(). Reverting this change back to define's to fix the aliases. modinfo mip6 (before this change) alias: xfrm-type-10-IPPROTO_DSTOPTS alias: xfrm-type-10-IPPROTO_ROUTING modinfo mip6 (after this change) alias: xfrm-type-10-43 alias: xfrm-type-10-60 Signed-off-by: Jan Moskyto Matejka <mq@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-06 21:18:06 -08:00
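Why an enum breaks the alias strings in ee262ad827 comes down to preprocessor stringification: a #define is expanded to its number before being turned into a string, whereas an enumerator is not a macro and is stringified by name. The sketch below reproduces that effect with hypothetical DEMO_* names so it does not clash with the system's IPPROTO_* definitions; the alias format is copied from the modinfo output quoted in the commit message.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification, so macro arguments are expanded first. */
#define DEMO_STRINGIFY_1(x) #x
#define DEMO_STRINGIFY(x) DEMO_STRINGIFY_1(x)
#define DEMO_ALIAS(type, proto) \
	"xfrm-type-" DEMO_STRINGIFY(type) "-" DEMO_STRINGIFY(proto)

#define DEMO_AF_INET6 10
#define DEMO_DEF_ROUTING 43          /* preprocessor define */
enum { DEMO_ENUM_ROUTING = 43 };     /* same value, invisible to cpp */

static const char demo_alias_define[] = DEMO_ALIAS(DEMO_AF_INET6, DEMO_DEF_ROUTING);
static const char demo_alias_enum[] = DEMO_ALIAS(DEMO_AF_INET6, DEMO_ENUM_ROUTING);

/* Returns 1 when the alias ends in a digit, i.e. the protocol was expanded. */
static int demo_alias_is_numeric(const char *alias)
{
	size_t n;

	assert(alias != NULL && alias[0] != '\0');
	n = strlen(alias);
	return alias[n - 1] >= '0' && alias[n - 1] <= '9';
}
```

Here demo_alias_define is "xfrm-type-10-43", matching the form the commit restores, while demo_alias_enum keeps the enumerator's name in the string.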
Shaohua Li	579f82901f	swap: add a simple detector for inappropriate swapin readahead This is a patch to improve swap readahead algorithm. It's from Hugh and I slightly changed it. Hugh's original changelog: swapin readahead does a blind readahead, whether or not the swapin is sequential. This may be ok on harddisk, because large reads have relatively small costs, and if the readahead pages are unneeded they can be reclaimed easily - though, what if their allocation forced reclaim of useful pages? But on SSD devices large reads are more expensive than small ones: if the readahead pages are unneeded, reading them in caused significant overhead. This patch adds very simplistic random read detection. Stealing the PageReadahead technique from Konstantin Khlebnikov's patch, avoiding the vma/anon_vma sophistications of Shaohua Li's patch, swapin_nr_pages() simply looks at readahead's current success rate, and narrows or widens its readahead window accordingly. There is little science to its heuristic: it's about as stupid as can be whilst remaining effective. The table below shows elapsed times (in centiseconds) when running a single repetitive swapping load across a 1000MB mapping in 900MB ram with 1GB swap (the harddisk tests had taken painfully too long when I used mem=500M, but SSD shows similar results for that). Vanilla is the 3.6-rc7 kernel on which I started; Shaohua denotes his Sep 3 patch in mmotm and linux-next; HughOld denotes my Oct 1 patch which Shaohua showed to be defective; HughNew this Nov 14 patch, with page_cluster as usual at default of 3 (8-page reads); HughPC4 this same patch with page_cluster 4 (16-page reads); HughPC0 with page_cluster 0 (1-page reads: no readahead). HDD for swapping to harddisk, SSD for swapping to VertexII SSD. Seq for sequential access to the mapping, cycling five times around; Rand for the same number of random touches. Anon for a MAP_PRIVATE anon mapping; Shmem for a MAP_SHARED anon mapping, equivalent to tmpfs. 
One weakness of Shaohua's vma/anon_vma approach was that it did not optimize Shmem: seen below. Konstantin's approach was perhaps mistuned, 50% slower on Seq: did not compete and is not shown below.

HDD         Vanilla  Shaohua  HughOld  HughNew  HughPC4  HughPC0
Seq Anon      73921    76210    75611    76904    78191   121542
Seq Shmem     73601    73176    73855    72947    74543   118322
Rand Anon    895392   831243   871569   845197   846496   841680
Rand Shmem  1058375  1053486   827935   764955   764376   756489

SSD         Vanilla  Shaohua  HughOld  HughNew  HughPC4  HughPC0
Seq Anon      24634    24198    24673    25107    21614    70018
Seq Shmem     24959    24932    25052    25703    22030    69678
Rand Anon     43014    26146    28075    25989    26935    25901
Rand Shmem    45349    45215    28249    24268    24138    24332

These tests are, of course, two extremes of a very simple case: under heavier mixed loads I've not yet observed any consistent improvement or degradation, and wider testing would be welcome. Shaohua Li: Test shows Vanilla is slightly better in sequential workload than Hugh's patch. I observed that with Hugh's patch the readahead size is sometimes shrunk too fast (from 8 to 1 immediately) in sequential workload if there is no hit. And in such a case, continuing to do readahead is actually good. I haven't prepared a sophisticated algorithm for the sequential workload because so far we can't guarantee that sequentially accessed pages are swapped out sequentially. So I slightly changed Hugh's heuristic - don't shrink readahead size too fast. Here is my test result (unit second, 3 runs average):

        Vanilla  Hugh   New
Seq         356   370   360
Random     4525  2447  2444

Attached graph is the swapin/swapout throughput I collected with 'vmstat 2'. The first part is running a random workload (till around 1200 of the x-axis) and the second part is running a sequential workload. swapin and swapout throughput are almost identical in steady state in both workloads. This is the expected behavior, while in Vanilla, swapin is much bigger than swapout, especially in the random workload (because of wrong readahead). 
Original patches by: Shaohua Li and Konstantin Khlebnikov. [fengguang.wu@intel.com: swapin_nr_pages() can be static] Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Shaohua Li <shli@fusionio.com> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-02-06 13:48:51 -08:00
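The heuristic described in 579f82901f (size the window from the recent readahead hit count, widen or narrow accordingly, and do not shrink too fast) can be sketched in userspace as follows. This is a simplified illustration, not the kernel's swapin_nr_pages(): the demo_ra_state struct, the "hits + 2" starting point, the power-of-two rounding and the "at most halve per call" floor are all assumptions chosen to make the idea concrete.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical readahead state, kept in a struct instead of globals. */
struct demo_ra_state {
	unsigned int hits;          /* readahead pages that were later used */
	unsigned int last_pages;    /* window size chosen on the previous call */
	unsigned long prev_offset;  /* swap offset of the previous fault */
};

/* Pick a readahead window of 1..max_pages pages for a fault at offset. */
static unsigned int demo_ra_window(struct demo_ra_state *s,
				   unsigned long offset,
				   unsigned int max_pages)
{
	unsigned int pages, min_pages;

	assert(s != NULL);
	if (max_pages <= 1)
		return 1;

	pages = s->hits + 2;
	s->hits = 0;
	if (pages == 2) {
		/* No hits: keep reading ahead only if access looks adjacent. */
		if (offset != s->prev_offset + 1 && offset != s->prev_offset - 1)
			pages = 1;
		s->prev_offset = offset;
	} else {
		/* Some hits: widen to the next power of two, starting at 4. */
		unsigned int roundup = 4;

		while (roundup < pages)
			roundup <<= 1;
		pages = roundup;
	}
	if (pages > max_pages)
		pages = max_pages;

	/* Do not shrink too fast: never drop below half the last window. */
	min_pages = s->last_pages / 2;
	if (pages < min_pages)
		pages = min_pages;
	s->last_pages = pages;
	return pages;
}
```

Starting from a window of 8 with no hits and scattered offsets, successive calls in this sketch step down 4, 2, 1 rather than jumping straight from 8 to 1, which is the behavior Shaohua Li's adjustment is meant to achieve.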
Rafael J. Wysocki	1f7c164b6f	ACPI / hotplug / PCI: Rework acpiphp_check_host_bridge() Since the only existing caller of acpiphp_check_host_bridge(), which is acpi_pci_root_scan_dependent(), already has a struct acpi_device pointer needed to obtain the ACPIPHP context, it doesn't make sense to execute acpi_bus_get_device() on its handle in acpiphp_handle_to_bridge() just in order to get that pointer back. For this reason, modify acpiphp_check_host_bridge() to take a struct acpi_device pointer as its argument and rearrange the code accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>	2014-02-06 17:31:52 +01:00
Rafael J. Wysocki	1a699476e2	ACPI / hotplug / PCI: Hotplug notifications from acpi_bus_notify() Since acpi_bus_notify() is executed on all notifications for all devices anyway, make it execute acpi_device_hotplug() for all hotplug events instead of installing notify handlers pointing to the same function for all hotplug devices. This change reduces both the size and complexity of ACPI-based device hotplug code. Moreover, since acpi_device_hotplug() only does significant things for devices that have either an ACPI scan handler, or a hotplug context with .eject() defined, and those devices had notify handlers pointing to acpi_hotplug_notify_cb() installed before anyway, this modification shouldn't change functionality. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-02-06 17:31:52 +01:00
Rafael J. Wysocki	5e6f236c26	ACPI / hotplug / PCI: Simplify acpi_install_hotplug_notify_handler() Since acpi_hotplug_notify_cb() does not use its data argument any more, the second argument of acpi_install_hotplug_notify_handler() can be dropped, so do that and update its callers accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-02-06 17:31:51 +01:00
Rafael J. Wysocki	3c2cc7ff9e	ACPI / hotplug / PCI: Consolidate ACPIPHP with ACPI core hotplug The ACPI-based PCI hotplug (ACPIPHP) code currently attaches its hotplug context objects directly to ACPI namespace nodes representing hotplug devices. However, after recent changes causing struct acpi_device to be created for every namespace node representing a device (regardless of its status), that is not necessary any more. Moreover, it's vulnerable to the theoretical issue that the ACPI handle passed in the context between handle_hotplug_event() and hotplug_event_work() may become invalid in the meantime (as a result of a concurrent table unload). In principle, this issue might be addressed by adding a non-empty release handler for ACPIPHP hotplug context objects analogous to acpi_scan_drop_device(), but that would duplicate the code in that function and in acpi_device_del_work_fn(). For this reason, it's better to modify ACPIPHP to attach its device hotplug contexts to struct device objects representing hotplug devices and make it use acpi_hotplug_notify_cb() as its notify handler. At the same time, acpi_device_hotplug() can be modified to dispatch the new .hp.event() callback pointing to acpiphp_hotplug_event() from ACPI device objects associated with PCI devices or use the generic ACPI device hotplug code for device objects with matching scan handlers. This allows the existing code duplication between ACPIPHP and the ACPI core to be reduced too and makes further ACPI-based device hotplug consolidation possible. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-02-06 17:31:37 +01:00
Steven Whitehouse	774016b2d4	GFS2: journal data writepages update GFS2 has carried what is more or less a copy of the write_cache_pages() for some time. It seems that this copy has slipped behind the core code over time. This patch brings it back uptodate, and in addition adds the tracepoint which would otherwise be missing. We could go further, and eliminate some or all of the code duplication here. The issue is that if we do that, then the function we need to split out from the existing write_cache_pages(), which will look a lot like gfs2_jdata_write_pagevec(), would land up putting quite a lot of extra variables on the stack. I know that has been a problem in the past in the writeback code path, which is why I've hesitated to do it here. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2014-02-06 15:47:47 +00:00
James Hogan	00942d1a1b	[media] media: rc: add sysfs scancode filtering interface Add and document a generic sysfs based scancode filtering interface for making use of IR data matching hardware to filter out uninteresting scancodes. Two filters exist, one for normal operation and one for filtering scancodes which are permitted to wake the system from suspend. The following files are added to /sys/class/rc/rc?/: - filter: normal scancode filter value - filter_mask: normal scancode filter mask - wakeup_filter: wakeup scancode filter value - wakeup_filter_mask: wakeup scancode filter mask A new s_filter() driver callback is added which must arrange for the specified filter to be applied at the right time. Drivers can convert the scancode filter into a raw IR data filter, which can be applied immediately or later (for wake up filters). Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Mauro Carvalho Chehab <m.chehab@samsung.com> Cc: linux-media@vger.kernel.org Cc: Rob Landley <rob@landley.net> Cc: linux-doc@vger.kernel.org Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>	2014-02-06 09:18:57 -02:00
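A value/mask filter of the kind 00942d1a1b exposes through filter and filter_mask is commonly evaluated by comparing only the bits selected by the mask. The sketch below assumes that semantics; it is not taken from the rc core, and demo_scancode_filter and demo_filter_match() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filter: expected value plus a mask of the bits that matter. */
struct demo_scancode_filter {
	uint32_t data;
	uint32_t mask;
};

/* A scancode passes when it agrees with data on every bit set in mask. */
static bool demo_filter_match(const struct demo_scancode_filter *f,
			      uint32_t scancode)
{
	assert(f != NULL);
	return ((scancode ^ f->data) & f->mask) == 0;
}
```

Under this assumption a mask of 0 accepts every scancode, and a mask of 0xff00 with data 0x1200 accepts any scancode whose second byte is 0x12.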
Pablo Neira Ayuso	0165d9325d	netfilter: nf_tables: fix racy rule deletion We may lose the race if we flush the rule-set (which happens asynchronously via call_rcu) and we try to remove the table (that userspace assumes to be empty). Fix this by recovering synchronous rule and chain deletion. This was introduced some time ago, when we had no batch support and synchronous rule deletion performance was not good. Now that we have the batch support, we can just postpone the purge of old rule in a second step in the commit phase. All object deletions are synchronous after this patch. As a side effect, we save memory as we don't need rcu_head per rule anymore. Cc: Patrick McHardy <kaber@trash.net> Reported-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-02-06 11:46:06 +01:00
Pawel Moll	781f6d710d	gpio: generic: Add label to platform data When registering more than one platform device, it is useful to set the gpio chip label in the platform data. Signed-off-by: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2014-02-06 10:33:47 +01:00
Lars-Peter Clausen	a3485d0885	gpio: consumer.h: Move forward declarations outside #ifdef Make sure that the forward declared structs in gpio/consumer.h are also visible on the else branch of the CONFIG_GPIOLIB #ifdef. Fixes the following warnings and their associated errors when CONFIG_GPIOLIB is not selected: include/linux/gpio/consumer.h:67:14: warning: 'struct device' declared inside parameter list include/linux/gpio/consumer.h:67:14: warning: its scope is only this definition or declaration, which is probably not what you want [...] Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2014-02-06 10:22:56 +01:00
Patrick McHardy	05513e9e33	netfilter: nf_tables: add reject module for NFPROTO_INET Add a reject module for NFPROTO_INET. It does nothing but dispatch to the AF-specific modules based on the hook family. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-02-06 09:44:18 +01:00
Patrick McHardy	cc4723ca31	netfilter: nft_reject: split up reject module into IPv4 and IPv6 specific parts Currently the nft_reject module depends on symbols from ipv6. This is wrong since no generic module should force IPv6 support to be loaded. Split up the module into AF-specific and a generic part. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-02-06 09:44:10 +01:00
Emmanuel Grumbach	63c361f511	mac80211: propagate STBC / LDPC flags to radiotap These capabilities weren't propagated to the radiotap header. We don't set the VHT_KNOWN / MCS_HAVE flag here because not all the low level drivers will know how to properly flag the frames, hence the low level driver will be in charge of setting IEEE80211_RADIOTAP_MCS_HAVE_FEC, IEEE80211_RADIOTAP_MCS_HAVE_STBC and / or IEEE80211_RADIOTAP_VHT_KNOWN_STBC according to its capabilities. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-02-06 09:34:58 +01:00
Emmanuel Grumbach	1b8d242adb	mac80211: move VHT related RX_FLAG to another variable ieee80211_rx_status.flags is full. Define a new vht_flag variable to be able to set more VHT related flags and make room in flags. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Acked-by: Kalle Valo <kvalo@qca.qualcomm.com> [ath10k] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-02-06 09:34:10 +01:00
Emmanuel Grumbach	0059b2b142	mac80211: remove unused radiotap vendor fields in ieee80211_rx_status The purpose of this housekeeping is to make some room for VHT flags. The radiotap vendor fields weren't in use. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2014-02-06 09:33:46 +01:00
Linus Torvalds	1cd731df09	Merge tag 'stable/for-linus-3.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen fixes from Konrad Rzeszutek Wilk: "Bug-fixes: - Revert "xen/grant-table: Avoid m2p_override during mapping" as it broke Xen ARM build. - Fix CR4 not being set on AP processors in Xen PVH mode" * tag 'stable/for-linus-3.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pvh: set CR4 flags for APs Revert "xen/grant-table: Avoid m2p_override during mapping"	2014-02-05 16:01:11 -08:00
Linus Torvalds	8352650a5c	Merge git://git.infradead.org/users/willy/linux-nvme Pull NVMe driver update from Matthew Wilcox: "Looks like I missed the merge window ... but these are almost all bugfixes anyway (the ones that aren't have been baking for months)" * git://git.infradead.org/users/willy/linux-nvme: NVMe: Namespace use after free on surprise removal NVMe: Correct uses of INIT_WORK NVMe: Include device and queue numbers in interrupt name NVMe: Add a pci_driver shutdown method NVMe: Disable admin queue on init failure NVMe: Dynamically allocate partition numbers NVMe: Async IO queue deletion NVMe: Surprise removal handling NVMe: Abort timed out commands NVMe: Schedule reset for failed controllers NVMe: Device resume error handling NVMe: Cache dev->pci_dev in a local pointer NVMe: Fix lockdep warnings NVMe: compat SG_IO ioctl NVMe: remove deprecated IRQF_DISABLED NVMe: Avoid shift operation when writing cq head doorbell	2014-02-05 15:53:26 -08:00
Patrick McHardy	64d46806b6	netfilter: nf_tables: add AF specific expression support For the reject module, we need to add AF-specific implementations to get rid of incorrect module dependencies. Try to load an AF-specific module first and fall back to generic modules. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-02-06 00:05:36 +01:00
Linus Torvalds	c4ad8f98be	execve: use 'struct filename *' for executable name passing This changes 'do_execve()' to get the executable name as a 'struct filename', and to free it when it is done. This is what the normal users want, and it simplifies and streamlines their error handling. The controlled lifetime of the executable name also fixes a use-after-free problem with the trace_sched_process_exec tracepoint: the lifetime of the passed-in string for kernel users was not at all obvious, and the user-mode helper code used UMH_WAIT_EXEC to serialize the pathname allocation lifetime with the execve() having finished, which in turn meant that the trace point that happened after mm_release() of the old process VM ended up using already free'd memory. To solve the kernel string lifetime issue, this simply introduces "getname_kernel()" that works like the normal user-space getname() function, except with the source coming from kernel memory. As Oleg points out, this also means that we could drop the tcomm[] array from 'struct linux_binprm', since the pathname lifetime now covers setup_new_exec(). That would be a separate cleanup. Reported-by: Igor Zhbanov <i.zhbanov@samsung.com> Tested-by: Steven Rostedt <rostedt@goodmis.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-02-05 12:54:53 -08:00
Pablo Neira Ayuso	e53376bef2	netfilter: nf_conntrack: don't release a conntrack with non-zero refcnt With this patch, the conntrack refcount is initially set to zero and it is bumped once it is added to any of the lists, so we fulfill Eric's golden rule which is that all released objects always have a refcount that equals zero. Andrey Vagin reports that nf_conntrack_free can't be called for a conntrack with non-zero ref-counter, because it can race with nf_conntrack_find_get(). A conntrack slab is created with SLAB_DESTROY_BY_RCU. Non-zero ref-counter says that this conntrack is used. So when we release a conntrack with non-zero counter, we break this assumption. CPU1 CPU2 ____nf_conntrack_find() nf_ct_put() destroy_conntrack() ... init_conntrack __nf_conntrack_alloc (set use = 1) atomic_inc_not_zero(&ct->use) (use = 2) if (!l4proto->new(ct, skb, dataoff, timeouts)) nf_conntrack_free(ct); (use = 2 !!!) ... __nf_conntrack_alloc (set use = 1) if (!nf_ct_key_equal(h, tuple, zone)) nf_ct_put(ct); (use = 0) destroy_conntrack() /* continue to work with CT */ After applying the patch "[PATCH] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get" another bug was triggered in destroy_conntrack(): <4>[67096.759334] ------------[ cut here ]------------ <2>[67096.759353] kernel BUG at net/netfilter/nf_conntrack_core.c:211! ... 
<4>[67096.759837] Pid: 498649, comm: atdd veid: 666 Tainted: G C --------------- 2.6.32-042stab084.18 #1 042stab084_18 /DQ45CB <4>[67096.759932] RIP: 0010:[<ffffffffa03d99ac>] [<ffffffffa03d99ac>] destroy_conntrack+0x15c/0x190 [nf_conntrack] <4>[67096.760255] Call Trace: <4>[67096.760255] [<ffffffff814844a7>] nf_conntrack_destroy+0x17/0x30 <4>[67096.760255] [<ffffffffa03d9bb5>] nf_conntrack_find_get+0x85/0x130 [nf_conntrack] <4>[67096.760255] [<ffffffffa03d9fb2>] nf_conntrack_in+0x352/0xb60 [nf_conntrack] <4>[67096.760255] [<ffffffffa048c771>] ipv4_conntrack_local+0x51/0x60 [nf_conntrack_ipv4] <4>[67096.760255] [<ffffffff81484419>] nf_iterate+0x69/0xb0 <4>[67096.760255] [<ffffffff814b5b00>] ? dst_output+0x0/0x20 <4>[67096.760255] [<ffffffff814845d4>] nf_hook_slow+0x74/0x110 <4>[67096.760255] [<ffffffff814b5b00>] ? dst_output+0x0/0x20 <4>[67096.760255] [<ffffffff814b66d5>] raw_sendmsg+0x775/0x910 <4>[67096.760255] [<ffffffff8104c5a8>] ? flush_tlb_others_ipi+0x128/0x130 <4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20 <4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20 <4>[67096.760255] [<ffffffff814c136a>] inet_sendmsg+0x4a/0xb0 <4>[67096.760255] [<ffffffff81444e93>] ? sock_sendmsg+0x13/0x140 <4>[67096.760255] [<ffffffff81444f97>] sock_sendmsg+0x117/0x140 <4>[67096.760255] [<ffffffff8102e299>] ? native_smp_send_reschedule+0x49/0x60 <4>[67096.760255] [<ffffffff81519beb>] ? _spin_unlock_bh+0x1b/0x20 <4>[67096.760255] [<ffffffff8109d930>] ? autoremove_wake_function+0x0/0x40 <4>[67096.760255] [<ffffffff814960f0>] ? do_ip_setsockopt+0x90/0xd80 <4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20 <4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20 <4>[67096.760255] [<ffffffff814457c9>] sys_sendto+0x139/0x190 <4>[67096.760255] [<ffffffff810efa77>] ? audit_syscall_entry+0x1d7/0x200 <4>[67096.760255] [<ffffffff810ef7c5>] ? 
__audit_syscall_exit+0x265/0x290 <4>[67096.760255] [<ffffffff81474daf>] compat_sys_socketcall+0x13f/0x210 <4>[67096.760255] [<ffffffff8104dea3>] ia32_sysret+0x0/0x5 I have reused the original title for the RFC patch that Andrey posted and most of the original patch description. Cc: Eric Dumazet <edumazet@google.com> Cc: Andrew Vagin <avagin@parallels.com> Cc: Florian Westphal <fw@strlen.de> Reported-by: Andrew Vagin <avagin@parallels.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Andrew Vagin <avagin@parallels.com>	2014-02-05 17:46:06 +01:00
Rafael J. Wysocki	e525506fcb	ACPI / hotplug / PCI: Define hotplug context lock in the core Subsequent changes will require the ACPI core to acquire the lock protecting the ACPIPHP hotplug contexts, so move the definition of the lock to the core and change its name to be more generic. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>	2014-02-05 17:41:26 +01:00
Rafael J. Wysocki	78ea4639a7	ACPI / hotplug: Fix potential race in acpi_bus_notify() There is a slight possibility for the ACPI device object pointed to by adev in acpi_hotplug_notify_cb() to become invalid between the acpi_bus_get_device() that it comes from and the subsequent dereference of that pointer under get_device(). Namely, if acpi_scan_drop_device() runs in parallel with acpi_hotplug_notify_cb(), acpi_device_del_work_fn() queued up by it may delete the device object in question right after a successful execution of acpi_bus_get_device() in acpi_bus_notify(). An analogous problem is present in acpi_bus_notify() where the device pointer coming from acpi_bus_get_device() may become invalid before its subsequent dereference in the "if" block. To prevent that from happening, introduce a new function, acpi_bus_get_acpi_device(), working analogously to acpi_bus_get_device() except that it will grab a reference to the ACPI device object returned by it and it will do that under the ACPICA's namespace mutex. Then, make both acpi_hotplug_notify_cb() and acpi_bus_notify() use acpi_bus_get_acpi_device() instead of acpi_bus_get_device() so as to ensure that the pointers used by them will not become stale at one point. In addition to that, introduce acpi_bus_put_acpi_device() as a wrapper around put_device() to be used along with acpi_bus_get_acpi_device() and make the (new) users of the latter use acpi_bus_put_acpi_device() too. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>	2014-02-05 17:41:18 +01:00
Rafael J. Wysocki	7c2e17714e	ACPICA: Introduce acpi_get_data_full() and rework acpi_get_data() Introduce a new function, acpi_get_data_full(), working in analogy with acpi_get_data() except that it can execute a callback provided as its 4th argument right after acpi_ns_get_attached_data() has returned a success. That will allow Linux to reference count the object pointed to by *data before the namespace mutex is released so as to ensure that it will not be freed going forward until the reference to it acquired by acpi_get_data_full() is dropped. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>	2014-02-05 17:41:16 +01:00
Geert Uytterhoeven	1db73ae39a	of/device: Nullify match table in of_match_device() for CONFIG_OF=n If the of_device_id table inside a device driver is protected by #ifdef CONFIG_OF, the driver still has to provide a dummy declaration of the table, or wrap it inside of_match_ptr(), when calling of_match_device() in the CONFIG_OF=n case, else the driver fails to compile with e.g. drivers/spi/spi-rspi.c: In function 'rspi_probe': drivers/spi/spi-rspi.c:1203:26: error: 'rspi_of_match' undeclared (first use in this function) drivers/spi/spi-rspi.c:1203:26: note: each undeclared identifier is reported only once for each function it appears in Make of_match_device() nullify the table pointer if CONFIG_OF=n to fix this. Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org> Signed-off-by: Rob Herring <robh@kernel.org>	2014-02-05 10:04:37 -06:00
Rob Herring	662372e42e	of: restructure for_each macros to fix compile warnings Commit `00b2c76a6a` "include/linux/of.h: make for_each_child_of_node() reference its args when CONFIG_OF=n" fixed warnings for unused variables, but introduced variable "used uninitialized" warnings. Simply initializing the variables would result in "set but not used" warnings with W=1. Fix both types of warnings by making all the for_each macros unconditional and rely on the dummy static inline functions to initialize and reference any variables. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Grant Likely <grant.likely@linaro.org>	2014-02-05 09:51:54 -06:00

70103 Commits