linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-24 13:11:40 +00:00

Author	SHA1	Message	Date
Paolo Valente	3c395a969a	null_blk: set a separate timer for each command For the Timer IRQ mode (i.e., when command completions are delayed), there is one timer for each CPU. Each of these timers . has a completion queue associated with it, containing all the command completions to be executed when the timer fires; . is set, and a new completion-to-execute is inserted into its completion queue, every time the dispatch code for a new command happens to be executed on the CPU related to the timer. This implies that, if the dispatch of a new command happens to be executed on a CPU whose timer has already been set, but has not yet fired, then the timer is set again, to the completion time of the newly arrived command. When the timer eventually fires, all its queued completions are executed. This way of handling delayed command completions entails the following problem: if more than one command completion is inserted into the queue of a timer before the timer fires, then the expiration time for the timer is moved forward every time each of these completions is enqueued. As a consequence, only the last completion enqueued enjoys a correct execution time, while all previous completions are unjustly delayed until the last completion is executed (and at that time they are executed all together). Specifically, if all the above completions are enqueued almost at the same time, then the problem is negligible. On the opposite end, if every completion is enqueued a while after the previous completion was enqueued (in the extreme case, it is enqueued only right before the timer would have expired), then every enqueued completion, except for the last one, experiences an inflated delay, proportional to the number of completions enqueued after it. In the end, commands, and thus I/O requests, may be completed at an arbitrarily lower rate than the desired one. This commit addresses this issue by replacing per-CPU timers with per-command timers, i.e., by associating an individual timer with each command. Signed-off-by: Paolo Valente <paolo.valente@unimore.it> Signed-off-by: Arianna Avanzini <avanzini@google.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-12-01 10:52:08 -07:00
Ming Lei	a88d32af18	blk-merge: fix computing bio->bi_seg_front_size in case of single segment When bio has only one physical segment, we should set bio's bi_seg_front_size as the real(final) size of the single segment. Fixes: 02e707424c2ea(blk-merge: fix blk_bio_segment_split) Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-30 13:02:36 -07:00
Jan Kara	74cedf9b6c	direct-io: Fix negative return from dio read beyond eof Assume a filesystem with 4KB blocks. When a file has size 1000 bytes and we issue direct IO read at offset 1024, blockdev_direct_IO() reads the tail of the last block and the logic for handling short DIO reads in dio_complete() results in a return value -24 (1000 - 1024) which obviously confuses userspace. Fix the problem by bailing out early once we sample i_size and can reliably check that direct IO read starts beyond i_size. Reported-by: Avi Kivity <avi@scylladb.com> Fixes: `9fe55eea7e` CC: stable@vger.kernel.org CC: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-30 10:15:42 -07:00
Hannes Reinecke	bf4e6b4e75	block: Always check queue limits for cloned requests When a cloned request is retried on other queues it always needs to be checked against the queue limits of that queue. Otherwise the calculations for nr_phys_segments might be wrong, leading to a crash in scsi_init_sgtable(). To clarify this the patch renames blk_rq_check_limits() to blk_cloned_rq_check_limits() and removes the symbol export, as the new function should only be used for cloned requests and never exported. Cc: Mike Snitzer <snitzer@redhat.com> Cc: Ewan Milne <emilne@redhat.com> Cc: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Fixes: `e2a60da74` ("block: Clean up special command handling logic") Cc: stable@vger.kernel.org # 3.7+ Acked-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-29 14:37:27 -07:00
Wenwei Tao	d0a712ceb8	lightnvm: missing nvm_lock acquire To avoid race conditions, traverse dev, media manager, and target lists and also register, unregister entries to/from them, should be always under the nvm_lock control. Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-29 14:34:58 -07:00
Matias Bjørling	08236c6bb2	lightnvm: unconverted ppa returned in get_bb_tbl The get_bb_tbl function takes ppa as a generic address, which is converted to the ppa device address within the device driver. When the update_bbtbl callback is called from get_bb_tbl, the device specific ppa is used, instead of the generic ppa. Make sure to pass the generic ppa. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-29 14:34:58 -07:00
Matias Bjørling	09f2e71609	lightnvm: refactor and change vendor id for qemu The QEMU NVMe implementation uses Intel vendor, Intel device id, and the first vendor specific byte to identify a LightNVM compatible nvme instance. Instead of using the Intel specific, use a preallocated from CNEX Labs instead. This lets us uniquely identify a QEMU lightnvm device without breaking other vendor specific work in the qemu device driver. Reported-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-29 14:34:58 -07:00
Wenwei Tao	d160147b5c	lightnvm: do device max sectors boundary check first do device max_phys_sect boundary check first, otherwise we will allocate dma_pools for devices whose max sectors are beyond lightnvm support and register them. Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-29 14:34:58 -07:00
Sudip Mukherjee	76e25081b6	lightnvm: fix ioctl memory leaks If copy_to_user() fails we returned error but we missed releasing devices. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-29 14:34:57 -07:00
Wenwei Tao	8261bd48c6	lightnvm: free memory when gennvm register fails free allocated nvm block and gennvm lun structures when gennvm register fails, otherwise it will cause memory leak. Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-29 14:34:57 -07:00
Keith Busch	c4699e70d1	lightnvm: Simplify config when disabled We shouldn't compile an object file to get empty implementations; conforms to linux coding style on conditional compilation. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Matias Bjørling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-29 14:34:57 -07:00
Eric Sandeen	77032ca66f	Return EBUSY from BLKRRPART for mounted whole-dev fs Today, blockdev --rereadpt /dev/sda will fail with EBUSY if any partition of sda is mounted (and will fail with EINVAL if pointed at a partition). But it will pass if the entire block device is formatted with a filesystem and mounted. I don't think this makes sense; partitioning should surely not ever change out from under a mounted device. So check for bdev->bd_super, and fail that with -EBUSY as well. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-25 20:49:24 -07:00
Linus Torvalds	78c4a49a69	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "A couple of fixes for sendfile lockups caught by Dmitry + a fix for ancient sysvfs symlink breakage" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: Avoid softlockups with sendfile(2) vfs: Make sendfile(2) killable even better fix sysvfs symlinks	2015-11-25 15:11:08 -08:00
Linus Torvalds	9b81d512a4	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull more block layer fixes from Jens Axboe: "I wasn't going to send off a new pull before next week, but the blk flush fix from Jan from the other day introduced a regression. It's rare enough not to have hit during testing, since it requires both a device that rejects the first flush, and bad timing while it does that. But since someone did hit it, let's get the revert into 4.4-rc3 so we don't have a released rc with that known issue. Apart from that revert, three other fixes: - From Christoph, a fix for a missing unmap in NVMe request preparation. - An NVMe fix from Nishanth that fixes data corruption on powerpc. - Also from Christoph, fix a list_del() attempt on blk-mq that didn't have a matching list_add() at timer start" * 'for-linus' of git://git.kernel.dk/linux-block: Revert "blk-flush: Queue through IO scheduler when flush not required" block: fix blk_abort_request for blk-mq drivers nvme: add missing unmaps in nvme_queue_rq NVMe: default to 4k device page size	2015-11-25 11:08:35 -08:00
Jens Axboe	dcd8376c36	Revert "blk-flush: Queue through IO scheduler when flush not required" This reverts commit `1b2ff19e6a`. Jan writes: -- Thanks for report! After some investigation I found out we allocate elevator specific data in __get_request() only for non-flush requests. And this is actually required since the flush machinery uses the space in struct request for something else. Doh. So my patch is just wrong and not easy to fix since at the time __get_request() is called we are not sure whether the flush machinery will be used in the end. Jens, please revert `1b2ff19e6a`. Thanks! I'm somewhat surprised that you can reliably hit the race where flushing gets disabled for the device just while the request is in flight. But I guess during boot it makes some sense. -- So let's just revert it, we can fix the queue run manually after the fact. This race is rare enough that it didn't trigger in testing, it requires the specific disable-while-in-flight scenario to trigger.	2015-11-25 10:12:54 -07:00
Linus Torvalds	4cf193b4b2	Bug fixes for all architectures. Nothing really stands out. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJWVcuuAAoJEL/70l94x66DmUEIAKnU6SCojoFOxWY0/EH/PBue m53mjiRiHp+YH/74dW0XF843+IKLfbLiADRaWHTqc9VW0ifnXmRjOv/bYpC7I/+R 8XKHaJZQfpb6yvICEqWvMItBpddoakbhv8DJOf4bUfipNY0zx5F2STFfx0KICtbc mHTB4y5bFgIz8mJBLX+Dmh/UyXL0kbjSnksu0WA80Szr0pq2Sr4Csrx8PqGAEfIJ e5DUW0h3UXY77J5fQbpgJs93hzp1YwkuRKEeYpB8POx4fmvssHoybmOk46sP0Ipb IYxrJ+CUQ4o6Vpp3LTMjzMfJ4Y/NaOHCvYxL0okhxtUuq+UbUZjM5ziclfQ/32M= =cbxQ -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Paolo Bonzini: "Bug fixes for all architectures. Nothing really stands out" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (21 commits) KVM: nVMX: remove incorrect vpid check in nested invvpid emulation arm64: kvm: report original PAR_EL1 upon panic arm64: kvm: avoid %p in __kvm_hyp_panic KVM: arm/arm64: vgic: Trust the LR state for HW IRQs KVM: arm/arm64: arch_timer: Preserve physical dist. active state on LR.active KVM: arm/arm64: Fix preemptible timer active state crazyness arm64: KVM: Add workaround for Cortex-A57 erratum 834220 arm64: KVM: Fix AArch32 to AArch64 register mapping ARM/arm64: KVM: test properly for a PTE's uncachedness KVM: s390: fix wrong lookup of VCPUs by array index KVM: s390: avoid memory overwrites on emergency signal injection KVM: Provide function for VCPU lookup by id KVM: s390: fix pfmf intercept handler KVM: s390: enable SIMD only when no VCPUs were created KVM: x86: request interrupt window when IRQ chip is split KVM: x86: set KVM_REQ_EVENT on local interrupt request from user space KVM: x86: split kvm_vcpu_ready_for_interrupt_injection out of dm_request_for_irq_injection KVM: x86: fix interrupt window handling in split IRQ chip case MIPS: KVM: Uninit VCPU in vcpu_create error path MIPS: KVM: Fix CACHE immediate offset sign extension ...	2015-11-25 09:01:49 -08:00
Haozhong Zhang	b2467e744f	KVM: nVMX: remove incorrect vpid check in nested invvpid emulation This patch removes the vpid check when emulating nested invvpid instruction of type all-contexts invalidation. The existing code is incorrect because: (1) According to Intel SDM Vol 3, Section "INVVPID - Invalidate Translations Based on VPID", invvpid instruction does not check vpid in the invvpid descriptor when its type is all-contexts invalidation. (2) According to the same document, invvpid of type all-contexts invalidation does not require there is an active VMCS, so/and get_vmcs12() in the existing code may result in a NULL-pointer dereference. In practice, it can crash both KVM itself and L1 hypervisors that use invvpid (e.g. Xen). Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-11-25 15:52:55 +01:00
Christoph Hellwig	55ce0da1da	block: fix blk_abort_request for blk-mq drivers We only added the request to the request list for the !blk-mq case, so we should only delete it in that case as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-24 15:24:10 -07:00
Christoph Hellwig	bf508e910b	nvme: add missing unmaps in nvme_queue_rq When we fail various metadata related operations in nvme_queue_rq we need to unmap the data SGL. Cc: stable@vger.kernel.org Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-24 15:24:05 -07:00
Nishanth Aravamudan	c5c9f25b98	NVMe: default to 4k device page size We received a bug report recently when DDW (64-bit direct DMA on Power) is not enabled for NVMe devices. In that case, we fall back to 32-bit DMA via the IOMMU, which is always done via 4K TCEs (Translation Control Entries). The NVMe device driver, though, assumes that the DMA alignment for the PRP entries will match the device's page size, and that the DMA aligment matches the kernel's page aligment. On Power, the the IOMMU page size, as mentioned above, can be 4K, while the device can have a page size of 8K, while the kernel has a page size of 64K. This eventually trips the BUG_ON in nvme_setup_prps(), as we have a 'dma_len' that is a multiple of 4K but not 8K (e.g., 0xF000). In this particular case of page sizes, we clearly want to use the IOMMU's page size in the driver. And generally, the NVMe driver in this function should be using the IOMMU's page size for the default device page size, rather than the kernel's page size. There is not currently an API to obtain the IOMMU's page size across all architectures and in the interest of a stop-gap fix to this functional issue, default the NVMe device page size to 4K, with the intent of adding such an API and implementation across all architectures in the next merge window. With the functionally equivalent v3 of this patch, our hardware test exerciser survives when using 32-bit DMA; without the patch, the kernel will BUG within a few minutes. Signed-off-by: Nishanth Aravamudan <nacc at linux.vnet.ibm.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-24 15:05:51 -07:00
Linus Torvalds	6ffeba9607	Two fixes for 4.4-rc1's DM ioctl changes that introduced the potential for infinite recursion on ioctl (with DM multipath). And four stable fixes: - A DM thin-provisioning fix to restore 'error_if_no_space' setting when a thin-pool is made writable again (after having been out of space). - A DM thin-provisioning fix to properly advertise discard support for thin volumes that are stacked on a thin-pool whose underlying data device doesn't support discards. - A DM ioctl fix to allow ctrl-c to break out of an ioctl retry loop when DM multipath is configured to 'queue_if_no_path'. - A DM crypt fix for a possible hang on dm-crypt device removal. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWVMgcAAoJEMUj8QotnQNawbMIAK9SqJoLWNVmxMFbUMr8LkWM 8TPfdt/TiwXE2FmiBs3Yhj0GuQsdcihGvfjrS7sIbP8XNpsjtiKVHIWwjAk0UFKm dS3R6yYZWySeaG1tLmQcuvmGf5JUQJ7hufyqAgkVdQ9bqeORJ4dJFx24Rjciz73s UbyhxOeu1bJC3ECtiiUcPpUrlwhsOir4i1HWUQcQjo2r4y0vYUY3lndDdI6w9tis K7M/V3YNkw5WMupg+VPE9hJdp97zHKHNNMQgb8Q5Z0Y/PnrZ7uVEr9yyrgpKBE+P u3GUUIqH+N7CZxJDld0LTJ+GcK+gIgnzJqAtB3EWSeONHkBAuAR+Pvqh/qqYhsI= =8WtB -----END PGP SIGNATURE----- Merge tag 'dm-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: "Two fixes for 4.4-rc1's DM ioctl changes that introduced the potential for infinite recursion on ioctl (with DM multipath). And four stable fixes: - A DM thin-provisioning fix to restore 'error_if_no_space' setting when a thin-pool is made writable again (after having been out of space). - A DM thin-provisioning fix to properly advertise discard support for thin volumes that are stacked on a thin-pool whose underlying data device doesn't support discards. - A DM ioctl fix to allow ctrl-c to break out of an ioctl retry loop when DM multipath is configured to 'queue_if_no_path'. - A DM crypt fix for a possible hang on dm-crypt device removal" * tag 'dm-4.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm thin: fix regression in advertised discard limits dm crypt: fix a possible hang due to race condition on exit dm mpath: fix infinite recursion in ioctl when no paths and !queue_if_no_path dm: do not reuse dm_blk_ioctl block_device input as local variable dm: fix ioctl retry termination with signal dm thin: restore requested 'error_if_no_space' setting on OODS to WRITE transition	2015-11-24 12:53:11 -08:00
Eric Dumazet	81b1a832d7	pidns: fix NULL dereference in __task_pid_nr_ns() I got a crash during a "perf top" session that was caused by a race in __task_pid_nr_ns() : pid_nr_ns() was inlined, but apparently compiler chose to read task->pids[type].pid twice, and the pid->level dereference crashed because we got a NULL pointer at the second read : if (pid && ns->level <= pid->level) { // CRASH Just use RCU API properly to solve this race, and not worry about "perf top" crashing hosts :( get_task_pid() can benefit from same fix. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-24 12:03:55 -08:00
Paolo Bonzini	8bd142c016	KVM/ARM Fixes for v4.4-rc3. Includes some timer fixes, properly unmapping PTEs, an errata fix, and two tweaks to the EL2 panic code. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWVJ7gAAoJEEtpOizt6ddyD5MH/3M/nhtZTnT6v0RPDvHWJo7s 5BQmITJYPHFkTO14OHWTVLXiGgLws8gPZnWHxC4jjHjpuJnL+/MM551FpCOqDDd7 vweYgVlSqD8ANH5nKbv1PPnzjrqhTVN+yi3ZItXy2pxsfvu63FC6Z43B2axelLvw XYmHoMZaeWBBw2gHi3djGfju3Yj/2SOe+ozuvAXpxA5+NhSiPHHnMefGy5k3wKnJ sETwshPdjiMeK4ItfMhveFTDRjl4uh9uQyORfaa5gqG0uePt3EalYynw+gEjZ6RX Bpc3nLwboIfRIa/WwyoHm+nmLIUYjU8dAgLwUOIbdeG0igpdALdvsB0aBHCgngk= =+7ED -----END PGP SIGNATURE----- Merge tag 'kvm-arm-for-v4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master KVM/ARM Fixes for v4.4-rc3. Includes some timer fixes, properly unmapping PTEs, an errata fix, and two tweaks to the EL2 panic code.	2015-11-24 19:34:40 +01:00
Linus Torvalds	4ce01c518e	Merge branch 'for-linus' of git://git.kernel.dk/linux-block Pull block layer fixes from Jens Axboe: "A round of fixes/updates for the current series. This looks a little bigger than it is, but that's mainly because we pushed the lightnvm enabled null_blk change out of the merge window so it could be updated a bit. The rest of the volume is also mostly lightnvm. In particular: - Lightnvm. Various fixes, additions, updates from Matias and Javier, as well as from Wenwei Tao. - NVMe: - Fix for potential arithmetic overflow from Keith. - Also from Keith, ensure that we reap pending completions from a completion queue before deleting it. Fixes kernel crashes when resetting a device with IO pending. - Various little lightnvm related tweaks from Matias. - Fixup flushes to go through the IO scheduler, for the cases where a flush is not required. Fixes a case in CFQ where we would be idling and not see this request, hence not break the idling. From Jan Kara. - Use list_{first,prev,next} in elevator.c for cleaner code. From Gelian Tang. - Fix for a warning trigger on btrfs and raid on single queue blk-mq devices, where we would flush plug callbacks with preemption disabled. From me. - A mac partition validation fix from Kees Cook. - Two merge fixes from Ming, marked stable. A third part is adding a new warning so we'll notice this quicker in the future, if we screw up the accounting. - Cleanup of thread name/creation in mtip32xx from Rasmus Villemoes" * 'for-linus' of git://git.kernel.dk/linux-block: (32 commits) blk-merge: warn if figured out segment number is bigger than nr_phys_segments blk-merge: fix blk_bio_segment_split block: fix segment split blk-mq: fix calling unplug callbacks with preempt disabled mac: validate mac_partition is within sector mtip32xx: use formatting capability of kthread_create_on_node NVMe: reap completion entries when deleting queue lightnvm: add free and bad lun info to show luns lightnvm: keep track of block counts nvme: lightnvm: use admin queues for admin cmds lightnvm: missing free on init error lightnvm: wrong return value and redundant free null_blk: do not del gendisk with lightnvm null_blk: use device addressing mode null_blk: use ppa_cache pool NVMe: Fix possible arithmetic overflow for max segments blk-flush: Queue through IO scheduler when flush not required null_blk: register as a LightNVM device elevator: use list_{first,prev,next}_entry lightnvm: cleanup queue before target removal ...	2015-11-24 10:26:30 -08:00
Mark Rutland	fbb4574ce9	arm64: kvm: report original PAR_EL1 upon panic If we call __kvm_hyp_panic while a guest context is active, we call __restore_sysregs before acquiring the system register values for the panic, in the process throwing away the PAR_EL1 value at the point of the panic. This patch modifies __kvm_hyp_panic to stash the PAR_EL1 value prior to restoring host register values, enabling us to report the original values at the point of the panic. Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-11-24 18:20:58 +01:00
Mark Rutland	1d7a4e313a	arm64: kvm: avoid %p in __kvm_hyp_panic Currently __kvm_hyp_panic uses %p for values which are not pointers, such as the ESR value. This can confusingly lead to "(null)" being printed for the value. Use %x instead, and only use %p for host pointers. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-11-24 18:18:13 +01:00
Christoffer Dall	9f958c11b7	KVM: arm/arm64: vgic: Trust the LR state for HW IRQs We were probing the physial distributor state for the active state of a HW virtual IRQ, because we had seen evidence that the LR state was not cleared when the guest deactivated a virtual interrupted. However, this issue turned out to be a software bug in the GIC, which was solved by: 84aab5e68c2a5e1e18d81ae8308c3ce25d501b29 (KVM: arm/arm64: arch_timer: Preserve physical dist. active state on LR.active, 2015-11-24) Therefore, get rid of the complexities and just look at the LR. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-11-24 18:08:37 +01:00
Christoffer Dall	0e3dfda91d	KVM: arm/arm64: arch_timer: Preserve physical dist. active state on LR.active We were incorrectly removing the active state from the physical distributor on the timer interrupt when the timer output level was deasserted. We shouldn't be doing this without considering the virtual interrupt's active state, because the architecture requires that when an LR has the HW bit set and the pending or active bits set, then the physical interrupt must also have the corresponding bits set. This addresses an issue where we have been observing an inconsistency between the LR state and the physical distributor state where the LR state was active and the physical distributor was not active, which shouldn't happen. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-11-24 18:07:40 +01:00
Christoffer Dall	7e16aa81f9	KVM: arm/arm64: Fix preemptible timer active state crazyness We were setting the physical active state on the GIC distributor in a preemptible section, which could cause us to set the active state on different physical CPU from the one we were actually going to run on, hacoc ensues. Since we are no longer descheduling/scheduling soft timers in the flush/sync timer functions, simply moving the timer flush into a non-preemptible section. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-11-24 18:04:00 +01:00
Marc Zyngier	498cd5c32b	arm64: KVM: Add workaround for Cortex-A57 erratum 834220 Cortex-A57 parts up to r1p2 can misreport Stage 2 translation faults when a Stage 1 permission fault or device alignment fault should have been reported. This patch implements the workaround (which is to validate that the Stage-1 translation actually succeeds) by using code patching. Cc: stable@vger.kernel.org Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-11-24 17:58:14 +01:00
Marc Zyngier	c0f0963464	arm64: KVM: Fix AArch32 to AArch64 register mapping When running a 32bit guest under a 64bit hypervisor, the ARMv8 architecture defines a mapping of the 32bit registers in the 64bit space. This includes banked registers that are being demultiplexed over the 64bit ones. On exceptions caused by an operation involving a 32bit register, the HW exposes the register number in the ESR_EL2 register. It was so far understood that SW had to distinguish between AArch32 and AArch64 accesses (based on the current AArch32 mode and register number). It turns out that I misinterpreted the ARM ARM, and the clue is in D1.20.1: "For some exceptions, the exception syndrome given in the ESR_ELx identifies one or more register numbers from the issued instruction that generated the exception. Where the exception is taken from an Exception level using AArch32 these register numbers give the AArch64 view of the register." Which means that the HW is already giving us the translated version, and that we shouldn't try to interpret it at all (for example, doing an MMIO operation from the IRQ mode using the LR register leads to very unexpected behaviours). The fix is thus not to perform a call to vcpu_reg32() at all from vcpu_reg(), and use whatever register number is supplied directly. The only case we need to find out about the mapping is when we actively generate a register access, which only occurs when injecting a fault in a guest. Cc: stable@vger.kernel.org Reviewed-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-11-24 17:58:08 +01:00
Ard Biesheuvel	e6fab54423	ARM/arm64: KVM: test properly for a PTE's uncachedness The open coded tests for checking whether a PTE maps a page as uncached use a flawed '(pte_val(xxx) & CONST) != CONST' pattern, which is not guaranteed to work since the type of a mapping is not a set of mutually exclusive bits For HYP mappings, the type is an index into the MAIR table (i.e, the index itself does not contain any information whatsoever about the type of the mapping), and for stage-2 mappings it is a bit field where normal memory and device types are defined as follows: #define MT_S2_NORMAL 0xf #define MT_S2_DEVICE_nGnRE 0x1 I.e., masking and comparing with the latter matches on the former, and we have been getting lucky merely because the S2 device mappings also have the PTE_UXN bit set, or we would misidentify memory mappings as device mappings. Since the unmap_range() code path (which contains one instance of the flawed test) is used both for HYP mappings and stage-2 mappings, and considering the difference between the two, it is non-trivial to fix this by rewriting the tests in place, as it would involve passing down the type of mapping through all the functions. However, since HYP mappings and stage-2 mappings both deal with host physical addresses, we can simply check whether the mapping is backed by memory that is managed by the host kernel, and only perform the D-cache maintenance if this is the case. Cc: stable@vger.kernel.org Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Pavel Fedin <p.fedin@samsung.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2015-11-24 17:58:00 +01:00
Ming Lei	12e57f59ca	blk-merge: warn if figured out segment number is bigger than nr_phys_segments We had seen lots of reports of this kind issue, so add one warnning in blk-merge, then it can be triggered easily and avoid to depend on warning/bug from drivers. Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-23 20:16:55 -07:00
Ming Lei	02e707424c	blk-merge: fix blk_bio_segment_split Commit bdced438acd83a(block: setup bi_phys_segments after splitting) introduces function of computing bio->bi_phys_segments during bio splitting. Unfortunately both bio->bi_seg_front_size and bio->bi_seg_back_size arn't computed, so too many physical segments may be obtained for one request since both the two are used to check if one segment across two bios can be possible. This patch fixes the issue by computing the two variables in blk_bio_segment_split(). Fixes: bdced438acd83a(block: setup bi_phys_segments after splitting) Reported-by: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Mark Salter <msalter@redhat.com> Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Tested-by: Mark Salter <msalter@redhat.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-23 20:16:53 -07:00
Ming Lei	578270bfbd	block: fix segment split Inside blk_bio_segment_split(), previous bvec pointer(bvprvp) always points to the iterator local variable, which is obviously wrong, so fix it by pointing to the local variable of 'bvprv'. Fixes: 5014c311baa2b(block: fix bogus compiler warnings in blk-merge.c) Cc: stable@kernel.org #4.3 Reported-by: Michael Ellerman <mpe@ellerman.id.au> Reported-by: Mark Salter <msalter@redhat.com> Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Tested-by: Mark Salter <msalter@redhat.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2015-11-23 20:16:51 -07:00
Jan Kara	c2489e07c0	vfs: Avoid softlockups with sendfile(2) The following test program from Dmitry can cause softlockups or RCU stalls as it copies 1GB from tmpfs into eventfd and we don't have any scheduling point at that path in sendfile(2) implementation: int r1 = eventfd(0, 0); int r2 = memfd_create("", 0); unsigned long n = 1<<30; fallocate(r2, 0, 0, n); sendfile(r1, r2, 0, n); Add cond_resched() into __splice_from_pipe() to fix the problem. CC: Dmitry Vyukov <dvyukov@google.com> CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-11-23 21:15:30 -05:00
Jan Kara	c725bfce79	vfs: Make sendfile(2) killable even better Commit `296291cdd1` (mm: make sendfile(2) killable) fixed an issue where sendfile(2) was doing a lot of tiny writes into a filesystem and thus was unkillable for a long time. However sendfile(2) can be (mis)used to issue lots of writes into arbitrary file descriptor such as evenfd or similar special file descriptors which never hit the standard filesystem write path and thus are still unkillable. E.g. the following example from Dmitry burns CPU for ~16s on my test system without possibility to be killed: int r1 = eventfd(0, 0); int r2 = memfd_create("", 0); unsigned long n = 1<<30; fallocate(r2, 0, 0, n); sendfile(r1, r2, 0, n); There are actually quite a few tests for pending signals in sendfile code however we data to write is always available none of them seems to trigger. So fix the problem by adding a test for pending signal into splice_from_pipe_next() also before the loop waiting for pipe buffers to be available. This should fix all the lockup issues with sendfile of the do-ton-of-tiny-writes nature. CC: stable@vger.kernel.org Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-11-23 21:15:30 -05:00
Al Viro	0ebf7f10d6	fix sysvfs symlinks The thing got broken back in 2002 - sysvfs does not have inline symlinks; even short ones have bodies stored in the first block of file. sysv_symlink() handles that correctly; unfortunately, attempting to look an existing symlink up will end up confusing them for inline symlinks, and interpret the block number containing the body as the body itself. Nobody has noticed until now, which says something about the level of testing sysvfs gets ;-/ Cc: stable@vger.kernel.org # all of them, not that anyone cared Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2015-11-23 21:11:08 -05:00
Linus Torvalds	a2931547ee	linux-kselftest-4.4-rc3 This update consists of one minor documentation fix and a fix to an existing test. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJWU1euAAoJEAsCRMQNDUMcC78P/3mIOPtVMRMHR0YwGA/MCavO +JhVbEJCsrVtg5aPRod1Psz3QU3ubqr37yAeDe7vCniJK1zDx0QBGXATv91dVGLz Fqjm6DZ1zJXrsSgoFhWZXtjicEI2khdMlzDsRD0vXNSDJATpWHRVa9eLMeeZnIVA DXMH/RRlo7b4lK8/Kf2YV190mqemMsJRF2PfUAiZ1ZqBd8hCnqsk0hYdkJNaIDfJ PydtUCDLbXuvjg3AfGaBndifudzRFzb/lYyQ9K3KPHj2cE5TMHCPn2jTZwJ5V3cZ IX+LtYtxEZu+gCz/3l9kN9QDzy0EVeozvPGgg8gY/YLmKinQVENBuVXV4+vR696y h/LtJm7NdVyy4fopI6YBTEvaq7TKeNQWKjnQ7p5clqMCchY1/9aSgbAVIMgw5OFb DPNnclcfWmVEMpzbmeyMTmfAbcqmttmQXAaklXH6WrcQ/C9KEWfMzexvY4ho/eur daIl7A3MyB83Z5bjUsryhVeNunPecklshE1wMwrmutnDIH8Wj+eJM6yHBJf/cgbO AnhKRcsqzkti0QXdlzEMRWfDWAfkzCXSbdjcORnRFV4Dw2X7RgizFXtfI6xccVxS AO4dtkNKbXUOt184XZlwrES+IXhtnlqBTO1HX/clQ2F7FVeT6Sq1eYuAlVugDH8H 65mZzXyxAAfcjctk4U/r =8sYr -----END PGP SIGNATURE----- Merge tag 'linux-kselftest-4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fixes from Shuah Khan: "This update consists of one minor documentation fix and a fix to an existing test" * tag 'linux-kselftest-4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/seccomp: Get page size from sysconf tools:testing/selftests: fix typo in futex/README	2015-11-23 13:19:27 -08:00
Mike Snitzer	0fcb04d593	dm thin: fix regression in advertised discard limits When establishing a thin device's discard limits we cannot rely on the underlying thin-pool device's discard capabilities (which are inherited from the thin-pool's underlying data device) given that DM thin devices must provide discard support even when the thin-pool's underlying data device doesn't support discards. Users were exposed to this thin device discard limits regression if their thin-pool's underlying data device does _not_ support discards. This regression caused all upper-layers that called the blkdev_issue_discard() interface to not be able to issue discards to thin devices (because discard_granularity was 0). This regression wasn't caught earlier because the device-mapper-test-suite's extensive 'thin-provisioning' discard tests are only ever performed against thin-pool's with data devices that support discards. Fix is to have thin_io_hints() test the pool's 'discard_enabled' feature rather than inferring whether or not a thin device's discard support should be enabled by looking at the thin-pool's discard_granularity. Fixes: `216076705` ("dm thin: disable discard support for thin devices if pool's is disabled") Reported-by: Mike Gerber <mike@sprachgewalt.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Cc: stable@vger.kernel.org # 4.1+	2015-11-23 14:54:46 -05:00
Linus Torvalds	1ec218373b	Linux 4.4-rc2	2015-11-22 16:45:59 -08:00
Linus Torvalds	104e2a6f8b	Merge branch 'akpm' (patches from Andrew) Merge slub bulk allocator updates from Andrew Morton: "This missed the merge window because I was waiting for some repairs to come in. Nothing actually uses the bulk allocator yet and the changes to other code paths are pretty small. And the net guys are waiting for this so they can start merging the client code" More comments from Jesper Dangaard Brouer: "The kmem_cache_alloc_bulk() call, in mm/slub.c, were included in previous kernel. The present version contains a bug. Vladimir Davydov noticed it contained a bug, when kernel is compiled with CONFIG_MEMCG_KMEM (see commit `03ec0ed57f`: "slub: fix kmem cgroup bug in kmem_cache_alloc_bulk"). Plus the mem cgroup counterpart in kmem_cache_free_bulk() were missing (see commit `033745189b` "slub: add missing kmem cgroup support to kmem_cache_free_bulk"). I don't consider the fix stable-material because there are no in-tree users of the API. But with known bugs (for memcg) I cannot start using the API in the net-tree" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: slab/slub: adjust kmem_cache_alloc_bulk API slub: add missing kmem cgroup support to kmem_cache_free_bulk slub: fix kmem cgroup bug in kmem_cache_alloc_bulk slub: optimize bulk slowpath free by detached freelist slub: support for bulk free with SLUB freelists	2015-11-22 15:21:40 -08:00
Linus Torvalds	dcfeda9d5f	TTY/Serial fixes for 4.4-rc2 Here are a few small tty/serial driver fixes for 4.4-rc2 that resolve some reported problems. All have been in linux-next, full details are in the shortlog below. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlZSCjgACgkQMUfUDdst+yl4kQCgyYYsaVVUcG2i3HQUpio4CAJg EFQAn03Z0OD/EGNHKw7FtsICSgAhSatG =JRcn -----END PGP SIGNATURE----- Merge tag 'tty-4.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are a few small tty/serial driver fixes for 4.4-rc2 that resolve some reported problems. All have been in linux-next, full details are in the shortlog below" * tag 'tty-4.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: export fsl8250_handle_irq serial: 8250_mid: Add missing dependency tty: audit: Fix audit source serial: etraxfs-uart: Fix crash serial: fsl_lpuart: Fix earlycon support bcm63xx_uart: Use the device name when registering an interrupt tty: Fix direct use of tty buffer work tty: Fix tty_send_xchar() lock order inversion	2015-11-22 15:10:57 -08:00
Linus Torvalds	7f21739301	Staging/IIO fixes for 4.4-rc2 Here are some staging and iio driver fixes for 4.4-rc2. All of these are in response to issues that have been reported and have been in linux-next for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlZSCfgACgkQMUfUDdst+yn0cwCeI7I/i1MUaiequc4gSOoElqfa RCYAnj+PTk+BcbACs0mgZKbYuEEd1eVT =nUNA -----END PGP SIGNATURE----- Merge tag 'staging-4.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging/IIO fixes from Greg KH: "Here are some staging and iio driver fixes for 4.4-rc2. All of these are in response to issues that have been reported and have been in linux-next for a while" * tag 'staging-4.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: Revert "Staging: wilc1000: coreconfigurator: Drop unneeded wrapper functions" iio: adc: xilinx: Fix VREFN scale iio: si7020: Swap data byte order iio: adc: vf610_adc: Fix division by zero error iio:ad7793: Fix ad7785 product ID iio: ad5064: Fix ad5629/ad5669 shift iio:ad5064: Make sure ad5064_i2c_write() returns 0 on success iio: lpc32xx_adc: fix warnings caused by enabling unprepared clock staging: iio: select IRQ_WORK for IIO_DUMMY_EVGEN vf610_adc: Fix internal temperature calculation	2015-11-22 13:26:24 -08:00
Linus Torvalds	6d2d91b3e4	USB fixes for 4.4-rc2 Here are a number of USB fixes and new device ids for 4.4-rc2. All have been in linux-next and the details are in the shortlog. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlZSDW4ACgkQMUfUDdst+ymrlwCgha5PobWYrhVnhC/w5ODZxRaF oAQAn2tOK94L9sADvjbQlFUy+/Zaxxbj =x9f4 -----END PGP SIGNATURE----- Merge tag 'usb-4.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a number of USB fixes and new device ids for 4.4-rc2. All have been in linux-next and the details are in the shortlog" * tag 'usb-4.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits) usblp: do not set TASK_INTERRUPTIBLE before lock USB: MAINTAINERS: cxacru usb: kconfig: fix warning of select USB_OTG USB: option: add XS Stick W100-2 from 4G Systems xhci: Fix a race in usb2 LPM resume, blocking U3 for usb2 devices usb: xhci: fix checking ep busy for CFC xhci: Workaround to get Intel xHCI reset working more reliably usb: chipidea: imx: fix a possible NULL dereference usb: chipidea: usbmisc_imx: fix a possible NULL dereference usb: chipidea: otg: gadget module load and unload support usb: chipidea: debug: disable usb irq while role switch ARM: dts: imx27.dtsi: change the clock information for usb usb: chipidea: imx: refine clock operations to adapt for all platforms usb: gadget: atmel_usba_udc: Expose correct device speed usb: musb: enable usb_dma parameter usb: phy: phy-mxs-usb: fix a possible NULL dereference usb: dwc3: gadget: let us set lower max_speed usb: musb: fix tx fifo flush handling usb: gadget: f_loopback: fix the warning during the enumeration usb: dwc2: host: Fix remote wakeup when not in DWC2_L2 ...	2015-11-22 13:15:05 -08:00
Linus Torvalds	0ec7dc8d19	Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus Pull MIPS fixes from Ralf Baechle: - Fix a flood of annoying build warnings - A number of fixes for Atheros 79xx platforms * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: ath79: Add a machine entry for booting OF machines MIPS: ath79: Fix the size of the MISC INTC registers in ar9132.dtsi MIPS: ath79: Fix the DDR control initialization on ar71xx and ar934x MIPS: Fix flood of warnings about comparsion being always true.	2015-11-22 12:59:46 -08:00
Linus Torvalds	94521b2fd2	Merge branch 'parisc-4.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc update from Helge Deller: "This patchset adds Huge Page and HUGETLBFS support for parisc" Honestly, the hugepage support should have gone through in the merge window, and is not really an rc-time fix. But it only touches arch/parisc, and I cannot find it in myself to care. If one of the three parisc users notices a breakage, I will point at Helge and make rude farting noises. * 'parisc-4.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Map kernel text and data on huge pages parisc: Add Huge Page and HUGETLBFS support parisc: Use long branch to do_syscall_trace_exit parisc: Increase initial kernel mapping to 32MB on 64bit kernel parisc: Initialize the fault vector earlier in the boot process. parisc: Add defines for Huge page support parisc: Drop unused MADV_xxxK_PAGES flags from asm/mman.h parisc: Drop definition of start_thread_som for HP-UX SOM binaries parisc: Fix wrong comment regarding first pmd entry flags	2015-11-22 12:50:58 -08:00
Linus Torvalds	727cde6c3a	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf tool fixes from Thomas Gleixner: "A couple of fixes for perf tools: - Build system updates - Plug a memory leak in an error path of perf probe - Tear down probes correctly when adding fails - Fixes to the perf symbol handling - Fix ordering of event processing in buildid-list - Fix per DSO filtering in the histogram browser" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf probe: Clear probe_trace_event when add_probe_trace_event() fails perf probe: Fix memory leaking on failure by clearing all probe_trace_events perf inject: Also re-pipe lost_samples event perf buildid-list: Requires ordered events perf symbols: Fix dso lookup by long name and missing buildids perf symbols: Allow forcing reading of non-root owned files by root perf hists browser: The dso can be obtained from popup_action->ms.map->dso perf hists browser: Fix 'd' hotkey action to filter by DSO perf symbols: Rebuild rbtree when adjusting symbols for kcore tools: Add a "make all" rule tools: Actually install tmon in the install rule	2015-11-22 12:37:20 -08:00
Linus Torvalds	069ec22915	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "This update contains: - MPX updates for handling 32bit processes - A fix for a long standing bug in 32bit signal frame handling related to FPU/XSAVE state - Handle get_xsave_addr() correctly in KVM - Fix SMAP check under paravirtualization - Add a comment to the static function trace entry to avoid further confusion about the difference to dynamic tracing" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Fix SMAP check in PVOPS environments x86/ftrace: Add comment on static function tracing x86/fpu: Fix get_xsave_addr() behavior under virtualization x86/fpu: Fix 32-bit signal frame handling x86/mpx: Fix 32-bit address space calculation x86/mpx: Do proper get_user() when running 32-bit binaries on 64-bit kernels	2015-11-22 12:00:12 -08:00
Jesper Dangaard Brouer	865762a811	slab/slub: adjust kmem_cache_alloc_bulk API Adjust kmem_cache_alloc_bulk API before we have any real users. Adjust API to return type 'int' instead of previously type 'bool'. This is done to allow future extension of the bulk alloc API. A future extension could be to allow SLUB to stop at a page boundary, when specified by a flag, and then return the number of objects. The advantage of this approach, would make it easier to make bulk alloc run without local IRQs disabled. With an approach of cmpxchg "stealing" the entire c->freelist or page->freelist. To avoid overshooting we would stop processing at a slab-page boundary. Else we always end up returning some objects at the cost of another cmpxchg. To keep compatible with future users of this API linking against an older kernel when using the new flag, we need to return the number of allocated objects with this API change. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-11-22 11:58:44 -08:00

1 2 3 4 5 ...

560994 Commits