linux

Author	SHA1	Message	Date
Mike Snitzer	63f6e6fd05	dm mpath: remove unused param from multipath_init_per_bio_data() 'struct dm_bio_details *' isn't ever needed. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-20 10:51:12 -05:00
Mike Snitzer	978e51ba38	dm: optimize bio-based NVMe IO submission Upper level bio-based drivers that stack immediately on top of NVMe can leverage direct_make_request(). In addition DM's NVMe bio-based will initially only ever have one NVMe device that it submits IO to at a time. There is no splitting needed. Enhance DM core so that DM_TYPE_NVME_BIO_BASED's IO submission takes advantage of both of these characteristics. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-20 10:51:11 -05:00
Mike Snitzer	22c11858e8	dm: introduce DM_TYPE_NVME_BIO_BASED If dm_table_determine_type() establishes DM_TYPE_NVME_BIO_BASED then all devices in the DM table do not support partial completions. Also, the table has a single immutable target that doesn't require DM core to split bios. This will enable adding NVMe optimizations to bio-based DM. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-20 10:51:10 -05:00
Mike Snitzer	f3986374f9	dm: simplify start of block stats accounting for bio-based No apparent need to generic_start_io_acct() until before the IO is ready for submission. start_io_acct() is the proper place to do this accounting -- it is also where DM accounts for pending IO and, if enabled, starts dm-stats accounting. Replace start_io_acct()'s part_round_stats() with generic_start_io_acct(). This eliminates needing to take part_stat_lock() multiple times when starting an IO on bio-based devices. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-17 12:05:32 -05:00
Mike Snitzer	bc02cdbe53	dm: remove redundant mapped_device member from clone_info structure 'struct dm_io' already has the same pointer. So update all accesses from ci->md to ci->io->md. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-16 20:43:15 -05:00
Mike Snitzer	dde1e1ec4c	dm: remove now unused bio-based io_pool and _io_cache Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-16 20:43:14 -05:00
Mike Snitzer	64f52b0e31	dm: improve performance by moving dm_io structure to per-bio-data Eliminates need for a separate mempool to allocate 'struct dm_io' objects from. As such, it saves an extra mempool allocation for each original bio that DM core is issued. This complicates the per-bio-data accessor functions by needing to conditionally add extra padding to get to a target's per-bio-data. But in the end this provides a decent performance improvement for all bio-based DM devices. On an NVMe-loop based testbed to a ramdisk (~3100 MB/s): bio-based DM linear performance improved by 2% (went from 2665 to 2777 MB/s). Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-16 20:43:13 -05:00
Mike Snitzer	745dc570b2	dm: rename 'bio' member of dm_io structure to 'orig_bio' Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-16 20:43:12 -05:00
Mike Snitzer	2abf1fc91d	dm: remove stale comment blocks These CRUD comments have worn out their welcome. The code is what it is, over time it'll hopefully get better. But these comments serve no purpose whatsoever. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-16 20:43:11 -05:00
Mike Snitzer	ad3793fc39	dm: set QUEUE_FLAG_DAX accordingly in dm_table_set_restrictions() Rather than having DAX support be unique by setting it based on table type in dm_setup_md_queue(). Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 12:33:32 -05:00
Mike Snitzer	3d7f45625a	dm: fix __send_changing_extent_only() to send first bio and chain remainder __send_changing_extent_only() must follow the same pattern that was established with commit "dm: ensure bio submission follows a depth-first tree walk". That is: submit first bio up to split boundary and then split the remainder to further submissions. Suggested-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 12:16:01 -05:00
Mike Snitzer	0776aa0e30	dm: ensure bio-based DM's bioset and io_pool support targets' maximum IOs alloc_multiple_bios() assumes it can allocate the requested number of bios but until now there was no guarantee that the mempools would be accommodating. Suggested-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 12:16:00 -05:00
Mike Snitzer	4a3f54d94d	dm: remove BIOSET_NEED_RESCUER based dm_offload infrastructure Now that all of DM has been revised and/or verified to no longer require the use of BIOSET_NEED_RESCUER the dm_offload code may be removed. Suggested-by: NeilBrown <neilb@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 12:15:59 -05:00
Mike Snitzer	318716ddea	dm: safely allocate multiple bioset bios DM targets can request multiple bios be sent to them by DM core (see: num_{flush,discard,write_same,write_zeroes}_bios). But until now these bios were allocated in an unsafe manner that could potentially exhaust the DM device's bioset -- in the face of multiple threads each trying to do multiple allocations from the same DM device's bioset. Fix __send_duplicate_bios() by using the new alloc_multiple_bios(). The allocation strategy used by alloc_multiple_bios() models that used by dm-crypt.c:crypt_alloc_buffer(). Neil Brown initially proposed this fix but the implementation has been revised enough that it is inappropriate to attribute the entirety of it to him. Suggested-by: NeilBrown <neilb@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 12:15:58 -05:00
NeilBrown	f31c21e436	dm: remove unused 'num_write_bios' target interface No DM target provides num_write_bios and none has since dm-cache's brief use in 2013. Having the possibility of num_write_bios > 1 complicates bio allocation. So remove the interface and assume there is only one bio needed. If a target ever needs more, it must provide a suitable bioset and allocate itself based on its particular needs. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 12:15:58 -05:00
NeilBrown	18a25da843	dm: ensure bio submission follows a depth-first tree walk A dm device can, in general, represent a tree of targets, each of which handles a sub-range of the range of blocks handled by the parent. The bio sequencing managed by generic_make_request() requires that bios are generated and handled in a depth-first manner. Each call to a make_request_fn() may submit bios to a single member device, and may submit bios for a reduced region of the same device as the make_request_fn. In particular, any bios submitted to member devices must be expected to be processed in order, so a later one must never wait for an earlier one. This ordering is usually achieved by using bio_split() to reduce a bio to a size that can be completely handled by one target, and resubmitting the remainder to the originating device. bio_queue_split() shows the canonical approach. dm doesn't follow this approach, largely because it has needed to split bios since long before bio_split() was available. It currently can submit bios to separate targets within the one dm_make_request() call. Dependencies between these targets, as can happen with dm-snap, can cause deadlocks if either bios gets stuck behind the other in the queues managed by generic_make_request(). This requires the 'rescue' functionality provided by dm_offload_{start,end}. Some of this requirement can be removed by changing the order of bio submission to follow the canonical approach. That is, if dm finds that it needs to split a bio, the remainder should be sent to generic_make_request() rather than being handled immediately. This delays the handling until the first part is completely processed, so the deadlock problems do not occur. __split_and_process_bio() can be called both from dm_make_request() and from dm_wq_work(). When called from dm_wq_work() the current approach is perfectly satisfactory as each bio will be processed immediately. 
When called from dm_make_request(), current->bio_list will be non-NULL, and in this case it is best to create a separate "clone" bio for the remainder. When we use bio_clone_bioset() to split off the front part of a bio and chain the two together and submit the remainder to generic_make_request(), it is important that the newly allocated bio is used as the head to be processed immediately, and the original bio gets "bio_advance()"d and sent to generic_make_request() as the remainder. Otherwise, if the newly allocated bio is used as the remainder, and if it then needs to be split again, then the next bio_clone_bioset() call will be made while holding a reference to a bio (result of the first clone) from the same bioset. This can potentially exhaust the bioset mempool and result in a memory allocation deadlock. Note that there is no race caused by reassigning cio.io->bio after already calling __map_bio(). This bio will only be dereferenced again after dec_pending() has found io->io_count to be zero, and this cannot happen before the dec_pending() call at the end of __split_and_process_bio(). To provide the clone bio when splitting, we use q->bio_split. This was previously being freed by bio-based dm to avoid having excess rescuer threads. As bio_split bio sets no longer create rescuer threads, there is little cost and much gain from restoring the q->bio_split bio set. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 12:15:57 -05:00
NeilBrown	c110a4b6e6	dm io: remove BIOSET_NEED_RESCUER flag from bios bioset The BIOSET_NEED_RESCUER flag is only needed when a make_request_fn might do two allocations from the one bioset, and the second one could block until the first bio completes. dm_io() is called from make_request_fn() context. The closest it comes to multiple allocations is in chunk_io() in dm-snap-persistent. But there the code uses a separate thread to avoid problems. So BIOSET_NEED_RESCUER is not needed. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 12:15:56 -05:00
NeilBrown	80cd175783	dm crypt: remove BIOSET_NEED_RESCUER flag The BIOSET_NEED_RESCUER flag is only needed when a make_request_fn might do two allocations from the one bioset, and the second one could block until the first bio completes. dm-crypt does allocate from this bioset inside the dm make_request_fn, but does so using GFP_NOWAIT so that the allocation will not block. So BIOSET_NEED_RESCUER is not needed. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 12:15:55 -05:00
NeilBrown	c06b3e5837	dm: fix comment above dm_accept_partial_bio Clarify that dm_accept_partial_bio isn't allowed for REQ_OP_ZONE_RESET bios. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 12:15:54 -05:00
Heinz Mauelshagen	552aa679f2	dm raid: use rs_is_raid*() Cleanup, no functional change. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 12:15:47 -05:00
Heinz Mauelshagen	7c29744ecc	dm raid: simplify rs_get_progress() No need to calculate the reshaping progress because mddev->curr_resync_completed holds it. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 11:59:21 -05:00
Heinz Mauelshagen	dc15b943d4	dm raid: ensure 'a' chars during reshape During reshape, 'A' chars were reported in status rather than 'a'. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 11:57:36 -05:00
Heinz Mauelshagen	11e4723206	dm raid: stop keeping raid set frozen altogether In order to avoid redoing synchronization/recovery/reshape partially, the raid set got frozen until after all passed in table line flags had been cleared. The related table reload sequence had to be precisely followed, or reshaping may lead to data corruption caused by the active mapping carrying on with a reshape when the inactive mapping already had retrieved a stale reshape position. Harden by retrieving the actual resync/recovery/reshape position during resume whilst the active table is suspended thus avoiding to keep the raid set frozen altogether. This prevents superfluous redoing of an already resynchronized or recovered segment and, most importantly, potential for redoing of an already reshaped segment causing data corruption. Fixes: `d39f0010e` ("dm raid: fix raid_resume() to keep raid set frozen as needed") Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 11:52:02 -05:00
Heinz Mauelshagen	53bf5384f9	dm raid: validate current raid sets redundancy Verifying the current raid sets redundancy based on retrieved superblock content has to use the superblock's raid level (e.g. raid0), not the constructor requested one (e.g. raid10). Using the requested raid level of raid10 led to a "divide error" on raid0 which defines data copies divided by to be zero. Also check for bogus data copies. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-13 11:50:52 -05:00
Mike Snitzer	b84cf26924	dm raid: bump target version to reflect numerous fixes Also update Documentation accordingly. Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-08 10:59:58 -05:00
Heinz Mauelshagen	78a75d10ef	dm raid: small cleanup and remove unused "struct raid_set" member Move raid_resume()'s setting of 'rw' and 'in_sync' to just prior to mddev_resume(). Also, remove unused 'bitmap_loaded' member from "struct raid_set". No functional changes. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-08 10:59:58 -05:00
Heinz Mauelshagen	4102d9de6d	dm raid: fix rs_get_progress() synchronization state/ratio Fix various sync state issues causing racy/bogus sync ratio, sync_action and health chars in dm_status() info output. Sync ratio could be N/N (i.e. 100%) shortly after raid set creation, i.e. creating a new RaidLV or upconverting a linear LV to raid1 thus: "0 2097152 raid raid1 2 Aa 2097162/2097152 recover 0 0 -" instead of: "0 2097152 raid raid1 2 Aa 0/2097152 idle 0 0 -" Sync action could be non-idle, when the MD thread was done with io. Health chars could be 'A' when they should be 'a' for a short time before a resynchronization started. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-08 10:59:58 -05:00
Heinz Mauelshagen	242ea5ad11	dm raid: avoid passing array_in_sync variable to raid_status() callees The raid_status() function passes the bool array_in_sync variable around providing synchronization state of the MD array. Replace it with a runtime flag. This will avoid a pattern of having to pass discrete variables to various functions. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-08 10:59:58 -05:00
Heinz Mauelshagen	67143510a7	dm raid: display a consistent copy of the MD status via raid_status() The MD sync thread updates recovery flags providing state of any running, idle, frozen, recovering, reshaping, ... activity it performs and updates respective flags asynchronously versus dm processing raid_status(). To close that race window, take a single copy of the flags and pass it into its callees. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-08 10:59:58 -05:00
Heinz Mauelshagen	d39f0010e4	dm raid: fix raid_resume() to keep raid set frozen as needed During a reshape request: if userspace reloads a "raid" table multiple times, resulting in multiple superblock reads, the raid set needs to stay frozen until all config changes (chunk size, layout data_offset, delta_disks) have been stored in the superblocks and respective flags cleared. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-08 10:59:57 -05:00
Heinz Mauelshagen	188a212df1	dm raid: add component device size checks to avoid runtime failure Check all component data device sizes versus calculated size. Reject if device(s) are too small. Otherwise, MD will fail the operation by accessing beyond the end of the data device. An example use-case is that growing bitmap won't fit any more and the MD runtime will report an error when DM raid should catch this earlier. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-08 10:59:57 -05:00
Heinz Mauelshagen	61e06e2c3e	dm raid: fix raid set size revalidation The raid set size is being revalidated unconditionally before a reshaping conversion is started. MD requires the size to only be reduced in case of a stripe removing (i.e. shrinking) reshape but not when growing because the raid array has to stay small until after the growing reshape finishes. Fix by avoiding the size revalidation in preresume unless a shrinking reshape is requested. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-08 10:59:57 -05:00
Heinz Mauelshagen	7501537ee3	dm raid: correct resizing state relative to reshape space in ctr Pay attention to existing reshape space to define if a raid set needs resizing. Otherwise we can hit "Can't resize a reshaping raid set" when a reshape is being requested. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-08 10:59:57 -05:00
Heinz Mauelshagen	052b2b1e06	dm raid: consume sizes after md_finish_reshape() completes changing them The md raid personalities call md_finish_reshape() at the end of a reshape conversion which adjusts rdev->sectors. Correct/check rdev->sectors before initiating a reshape and raise the recovery pointer accordingly. Otherwise, the DM raid coordinated reshape will fail. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-08 10:59:57 -05:00
Heinz Mauelshagen	1af2048a3e	dm raid: fix deadlock caused by premature md_stop_writes() md_stop_writes() is called in raid_presuspend() causing deadlocks on bios submitted afterwards -- which happens on loaded raid sets with conversion requests. Fix by moving md_stop_writes() to raid_postsuspend(). NOTE: when the recovery's frozen (MD_RECOVERY_FROZEN), writes haven't been started (or are already stopped) so don't stop them again. Also remove superfluous readonly setting. Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-08 10:59:57 -05:00
Suren Baghdasaryan	fbc7c07ec2	dm bufio: fix shrinker scans when (nr_to_scan < retain_target) When system is under memory pressure it is observed that dm bufio shrinker often reclaims only one buffer per scan. This change fixes the following two issues in dm bufio shrinker that cause this behavior: 1. ((nr_to_scan - freed) <= retain_target) condition is used to terminate slab scan process. This assumes that nr_to_scan is equal to the LRU size, which might not be correct because do_shrink_slab() in vmscan.c calculates nr_to_scan using multiple inputs. As a result when nr_to_scan is less than retain_target (64) the scan will terminate after the first iteration, effectively reclaiming one buffer per scan and making scans very inefficient. This hurts vmscan performance especially because mutex is acquired/released every time dm_bufio_shrink_scan() is called. New implementation uses ((LRU size - freed) <= retain_target) condition for scan termination. LRU size can be safely determined inside __scan() because this function is called after dm_bufio_lock(). 2. do_shrink_slab() uses value returned by dm_bufio_shrink_count() to determine number of freeable objects in the slab. However dm_bufio always retains retain_target buffers in its LRU and will terminate a scan when this mark is reached. Therefore returning the entire LRU size from dm_bufio_shrink_count() is misleading because that does not represent the number of freeable objects that slab will reclaim during a scan. Returning (LRU size - retain_target) better represents the number of freeable objects in the slab. This way do_shrink_slab() returns 0 when (LRU size < retain_target) and vmscan will not try to scan this shrinker avoiding scans that will not reclaim any memory. 
Test: tested using Android device running <AOSP>/system/extras/alloc-stress that generates memory pressure and causes intensive shrinker scans Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-08 10:54:25 -05:00
Mike Snitzer	c1fd0abee0	dm mpath: fix bio-based multipath queue_if_no_path handling Commit `ca5beb76` ("dm mpath: micro-optimize the hot path relative to MPATHF_QUEUE_IF_NO_PATH") caused bio-based DM-multipath to fail mptest's "test_02_sdev_delete". Restoring the logic that existed prior to commit `ca5beb76` fixes this bio-based DM-multipath regression. Also verified all mptest tests pass with request-based DM-multipath. This commit effectively reverts commit `ca5beb76` -- but it does so without reintroducing the need to take the m->lock spinlock in must_push_back_{rq,bio}. Fixes: `ca5beb76` ("dm mpath: micro-optimize the hot path relative to MPATHF_QUEUE_IF_NO_PATH") Cc: stable@vger.kernel.org # 4.12+ Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-08 10:49:40 -05:00
monty_pavel@sina.com	7e6358d244	dm: fix various targets to dm_register_target after module __init resources created A NULL pointer is seen if two concurrent "vgchange -ay -K <vg name>" processes race to load the dm-thin-pool module: PID: 25992 TASK: ffff883cd7d23500 CPU: 4 COMMAND: "vgchange" #0 [ffff883cd743d600] machine_kexec at ffffffff81038fa9 #1 [ffff883cd743d660] crash_kexec at ffffffff810c5992 #2 [ffff883cd743d730] oops_end at ffffffff81515c90 #3 [ffff883cd743d760] no_context at ffffffff81049f1b #4 [ffff883cd743d7b0] __bad_area_nosemaphore at ffffffff8104a1a5 #5 [ffff883cd743d800] bad_area at ffffffff8104a2ce #6 [ffff883cd743d830] __do_page_fault at ffffffff8104aa6f #7 [ffff883cd743d950] do_page_fault at ffffffff81517bae #8 [ffff883cd743d980] page_fault at ffffffff81514f95 [exception RIP: kmem_cache_alloc+108] RIP: ffffffff8116ef3c RSP: ffff883cd743da38 RFLAGS: 00010046 RAX: 0000000000000004 RBX: ffffffff81121b90 RCX: ffff881bf1e78cc0 RDX: 0000000000000000 RSI: 00000000000000d0 RDI: 0000000000000000 RBP: ffff883cd743da68 R8: ffff881bf1a4eb00 R9: 0000000080042000 R10: 0000000000002000 R11: 0000000000000000 R12: 00000000000000d0 R13: 0000000000000000 R14: 00000000000000d0 R15: 0000000000000246 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #9 [ffff883cd743da70] mempool_alloc_slab at ffffffff81121ba5 #10 [ffff883cd743da80] mempool_create_node at ffffffff81122083 #11 [ffff883cd743dad0] mempool_create at ffffffff811220f4 #12 [ffff883cd743dae0] pool_ctr at ffffffffa08de049 [dm_thin_pool] #13 [ffff883cd743dbd0] dm_table_add_target at ffffffffa0005f2f [dm_mod] #14 [ffff883cd743dc30] table_load at ffffffffa0008ba9 [dm_mod] #15 [ffff883cd743dc90] ctl_ioctl at ffffffffa0009dc4 [dm_mod] The race results in a NULL pointer because: Process A (vgchange -ay -K): a. send DM_LIST_VERSIONS_CMD ioctl; b. pool_target not registered; c. modprobe dm_thin_pool and wait until end. Process B (vgchange -ay -K): a. 
send DM_LIST_VERSIONS_CMD ioctl; b. pool_target registered; c. table_load->dm_table_add_target->pool_ctr; d. _new_mapping_cache is NULL and panic. Note: 1. process A and process B are two concurrent processes. 2. pool_target can be detected by process B but _new_mapping_cache initialization has not ended. To fix dm-thin-pool, and other targets (cache, multipath, and snapshot) with the same problem, simply dm_register_target() after all resources created during module init (as labelled with __init) are finished. Cc: stable@vger.kernel.org Signed-off-by: monty <monty_pavel@sina.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-04 10:23:10 -05:00
Mike Snitzer	afc567a497	dm table: fix regression from improper dm_dev_internal.count refcount_t conversion Multiple refcounts are needed if the device was already added. The micro-optimization of setting the refcount to 1 on first added (rather than fall thru to a common refcount_inc) lost sight of the fact that the refcount_inc is also needed for the case when the device already exists and the mode need not be upgraded. Fixes: `2a0b4682e0` ("dm: convert dm_dev_internal.count from atomic_t to refcount_t") Reported-by: Zdenek Kabelac <zkabelac@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2017-12-04 10:23:10 -05:00
Linus Torvalds	ae64f9bd1d	Linux 4.15-rc2	2017-12-03 11:01:47 -05:00
Linus Torvalds	87fc5c686e	Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm Pull ARM fix from Russell King: "Just one fix this time around, for the late commit in the merge window that triggered a problem with qemu. Qemu is apparently also going to receive a fix for the discovered issue" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: avoid faulting on qemu	2017-12-03 10:51:08 -05:00
Linus Torvalds	ae4806a38b	Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Here are two bugfixes for I2C, fixing a memleak in the core and irq allocation for i801. Also three bugfixes for the at24 eeprom driver which Bartosz collected while taking over maintainership for this driver" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: eeprom: at24: check at24_read/write arguments eeprom: at24: fix reading from 24MAC402/24MAC602 eeprom: at24: correctly set the size for at24mac402 i2c: i2c-boardinfo: fix memory leaks on devinfo i2c: i801: Fix Failed to allocate irq -2147483648 error	2017-12-03 10:48:24 -05:00
Linus Torvalds	49a418d783	hwmon fixes for v4.15-rc2 Drop reference to obsolete maintainer tree Fix overflow bug in pmbus driver Fix SMBUS timeout problem in jc42 driver For the SMBUS timeout handling, we had a brief discussion if this should be considered a bug fix or a feature. Peter says "it fixes real problems where the application misbehave due to faulty content when reading from an eeprom", and he needs the patch in his company's v4.14 images. This is good enough for me and warrants backport to stable kernels. Merge tag 'hwmon-for-linus-v4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: "Fixes: - Drop reference to obsolete maintainer tree - Fix overflow bug in pmbus driver - Fix SMBUS timeout problem in jc42 driver For the SMBUS timeout handling, we had a brief discussion if this should be considered a bug fix or a feature. Peter says "it fixes real problems where the application misbehave due to faulty content when reading from an eeprom", and he needs the patch in his company's v4.14 images. 
This is good enough for me and warrants backport to stable kernels" * tag 'hwmon-for-linus-v4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (jc42) optionally try to disable the SMBUS timeout hwmon: (pmbus) Use 64bit math for DIRECT format values hwmon: Drop reference to Jean's tree	2017-12-03 10:46:16 -05:00
Wolfram Sang	edef30980d	AT24 fixes for v4.15 Merge tag 'at24-4.15-fixes-for-wolfram' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into i2c/for-current Please consider pulling the following fixes for v4.15. While it doesn't fix any regression introduced in the v4.15 merge window, we have a feature in at24 since linux v4.8 - reading the mac address block from at24mac series - which turned out to be not working. This pull request contains changes that fix it together with a patch that hardens the read and write argument sanitization with out-of-bounds checks that were missing.	2017-12-03 15:55:20 +01:00
Linus Torvalds	2db767d988	NFS client fixes for Linux 4.15-rc2 Bugfixes: - NFSv4: Ensure gcc 4.4.4 can compile initialiser for "invalid_stateid" - SUNRPC: Allow connect to return EHOSTUNREACH - SUNRPC: Handle ENETDOWN errors Merge tag 'nfs-for-4.15-2' of git://git.linux-nfs.org/projects/anna/linux-nfs Pull NFS client fixes from Anna Schumaker: "These patches fix a problem with compiling using an old version of gcc, and also fix up error handling in the SUNRPC layer. - NFSv4: Ensure gcc 4.4.4 can compile initialiser for "invalid_stateid" - SUNRPC: Allow connect to return EHOSTUNREACH - SUNRPC: Handle ENETDOWN errors" * tag 'nfs-for-4.15-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: SUNRPC: Handle ENETDOWN errors SUNRPC: Allow connect to return EHOSTUNREACH NFSv4: Ensure gcc 4.4.4 can compile initialiser for "invalid_stateid"	2017-12-01 20:04:20 -05:00
Linus Torvalds	788c1da05b	Changes since last update: - Fix memory leaks that appeared after removing ifork inline data buffer - Recover deferred rmap update log items in correct order - Fix memory leaks when buffer construction fails - Fix memory leaks when bmbt is corrupt - Fix some uninitialized variables and math problems in the quota scrubber - Add some omitted attribution tags on the log replay commit - Fix some UBSAN complaints about integer overflows with large sparse files - Implement an effective inode mode check in online fsck - Fix log's inability to retry quota item writeout due to transient errors Merge tag 'xfs-4.15-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Darrick Wong: "Here are some bug fixes for 4.15-rc2. 
- fix memory leaks that appeared after removing ifork inline data buffer - recover deferred rmap update log items in correct order - fix memory leaks when buffer construction fails - fix memory leaks when bmbt is corrupt - fix some uninitialized variables and math problems in the quota scrubber - add some omitted attribution tags on the log replay commit - fix some UBSAN complaints about integer overflows with large sparse files - implement an effective inode mode check in online fsck - fix log's inability to retry quota item writeout due to transient errors" * tag 'xfs-4.15-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: Properly retry failed dquot items in case of error during buffer writeback xfs: scrub inode mode properly xfs: remove unused parameter from xfs_writepage_map xfs: ubsan fixes xfs: calculate correct offset in xfs_scrub_quota_item xfs: fix uninitialized variable in xfs_scrub_quota xfs: fix leaks on corruption errors in xfs_bmap.c xfs: fortify xfs_alloc_buftarg error handling xfs: log recovery should replay deferred ops in order xfs: always free inline data before resetting inode fork during ifree	2017-12-01 20:00:19 -05:00
Linus Torvalds	e1ba1c99da	RISC-V Cleanups and ABI Fixes for 4.15-rc2 This tag contains a handful of small cleanups that are a result of feedback that didn't make it into our original patch set, either because the feedback hadn't been given yet, I missed the original emails, or we weren't ready to submit the changes yet. I've been maintaining the various cleanup patch sets I have as their own branches, which I then merged together and signed. Each merge commit has a short summary of the changes, and each branch is based on your latest tag (4.15-rc1, in this case). If this isn't the right way to do this then feel free to suggest something else, but it seems sane to me. Here's a short summary of the changes, roughly in order of how interesting they are. * libgcc.h has been moved from include/lib, where it's the only member, to include/linux. This is meant to avoid tab completion conflicts. * VDSO entries for clock_get/gettimeofday/getcpu have been added. These are simple syscalls now, but we want to let glibc use them from the start so we can make them faster later. * A VDSO entry for instruction cache flushing has been added so userspace can flush the instruction cache. * The VDSO symbol versions for __vdso_cmpxchg{32,64} have been removed, as those VDSO entries don't actually exist. * __io_writes has been corrected to respect the given type. * A new READ_ONCE in arch_spin_is_locked(). * __test_and_op_bit_ord() is now actually ordered. * Various small fixes throughout the tree to enable allmodconfig to build cleanly. * Removal of some dead code in our atomic support headers. * Improvements to various comments in our atomic support headers. 
-----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAlohyvMTHHBhbG1lckBk YWJiZWx0LmNvbQAKCRDvTKFQLMurQWkhD/wO/F8vrwsNMOWR8zxvHdB30KD+FHmr X1+X9OqnH8AMd4Woj6pS0ap7g0GCKuLiI/bOTrQVVdTJpmKFaJ9rrwRCJzHq43yt feRjKyPAFYlvf6YaIEJ3YHU0t3LO1eK27YyFMg6F8y+bZim6oK2GdyfYF0Xiik3B L3NkDPSH4oplTJjUI+tzDZdMsuZKhxpXPnbNQA7YZLepz04jOPGWqFrA1C3gAaVQ dj1OkOGTSyQFwia7LrIm2g0J5/mqpjAF0KjdiTsvH6G9x3V0HZYU5Br3kHgauWKc YrNEbbDl8EakT5QocPf5F4Z8qpO9Hvxjwe2/z27usPtV9FQOPuDDPOygSPykwNNJ bDfv9nIE3W7lN26BaRcV2ivY3r9ZpCEcq+qXIiTm3P/uTVqjMq54NkvHnj4ON1ih DJZEgkM9L+rm7c9XDn627FBkmkeEndPJcQ3P/nopb5zGTYTb2HGrUt2nM+KR2vuE FdYtA9+ll3OzyFO3OVVjiAlxr8Qnwf2wIWXJXxWpcmmchGJ5NeTSZtiD14pAP5eC EDpoWwefvhqRMGdOlgq/fkx4Mrhz27euWXine3ZccprABAf7Hxkb/N5ojIJKT7qW mN3HL3PC9P0t/HxQEu0q0NLLsP+X/1yZ5HmDl44Y7N8aeCrIUXaB61gsTt6Oi6Ha PMJi5PI6VDDQbA== =CCe+ -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-4.15-rc2_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux Pull RISC-V cleanups and ABI fixes from Palmer Dabbelt: "This contains a handful of small cleanups that are a result of feedback that didn't make it into our original patch set, either because the feedback hadn't been given yet, I missed the original emails, or we weren't ready to submit the changes yet. I've been maintaining the various cleanup patch sets I have as their own branches, which I then merged together and signed. Each merge commit has a short summary of the changes, and each branch is based on your latest tag (4.15-rc1, in this case). If this isn't the right way to do this then feel free to suggest something else, but it seems sane to me. Here's a short summary of the changes, roughly in order of how interesting they are. - libgcc.h has been moved from include/lib, where it's the only member, to include/linux. This is meant to avoid tab completion conflicts. - VDSO entries for clock_get/gettimeofday/getcpu have been added. 
These are simple syscalls now, but we want to let glibc use them from the start so we can make them faster later. - A VDSO entry for instruction cache flushing has been added so userspace can flush the instruction cache. - The VDSO symbol versions for __vdso_cmpxchg{32,64} have been removed, as those VDSO entries don't actually exist. - __io_writes has been corrected to respect the given type. - A new READ_ONCE in arch_spin_is_locked(). - __test_and_op_bit_ord() is now actually ordered. - Various small fixes throughout the tree to enable allmodconfig to build cleanly. - Removal of some dead code in our atomic support headers. - Improvements to various comments in our atomic support headers" * tag 'riscv-for-linus-4.15-rc2_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux: (23 commits) RISC-V: __io_writes should respect the length argument move libgcc.h to include/linux RISC-V: Clean up an unused include RISC-V: Allow userspace to flush the instruction cache RISC-V: Flush I$ when making a dirty page executable RISC-V: Add missing include RISC-V: Use define for get_cycles like other architectures RISC-V: Provide stub of setup_profiling_timer() RISC-V: Export some expected symbols for modules RISC-V: move empty_zero_page definition to C and export it RISC-V: io.h: type fixes for warnings RISC-V: use RISCV_{INT,SHORT} instead of {INT,SHORT} for asm macros RISC-V: use generic serial.h RISC-V: remove spin_unlock_wait() RISC-V: `sfence.vma` orderes the instruction cache RISC-V: Add READ_ONCE in arch_spin_is_locked() RISC-V: __test_and_op_bit_ord should be strongly ordered RISC-V: Remove smb_mb__{before,after}_spinlock() RISC-V: Remove __smp_bp__{before,after}_atomic RISC-V: Comment on why {,cmp}xchg is ordered how it is ...	2017-12-01 19:39:12 -05:00
Linus Torvalds	4b1967c90a	arm64 fixes: - Fix FP register corruption when SVE is not available or in use - Fix out-of-tree module build failure when CONFIG_ARM64_MODULE_PLTS=y - Missing 'const' generating errors with LTO builds - Remove unsupported events from Cortex-A73 PMU description - Removal of stale and incorrect comments -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCgAGBQJaIXOkAAoJELescNyEwWM0swYH/3iSLxKnGDht1M9xqa5V288z eNC/Vw/Y/Sqi305reRK6gWbJ0hwtJLYSEK3tDbeL6C9v9mg8CIZNzbPI3vrEjAq+ n8yKmJVYaXlu9jmmo7vqF7LZ7LRgKZPO0cEKWZBR8LAYjD0zJPikwDR/JvTkGH75 1VnFfwuMykB989NMcVGQ1eD2G5RH13e2j9D2ErT0fbdcZ/MWpcviVVqMr4ggsQoR imVozMPXXLQ/0LeUfr8IRIst3x0CgFwmMX7CDWoVJJJXB7Zq0nvNptEtlS5tUZ/x 1vbXJstFasG3EL6QKiKxfUvtbaa4Vm7xEBBIVABQij+iUw8Og1OBojVi0wBCE3s= =9hCV -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "The critical one here is a fix for fpsimd register corruption across signals which was introduced by the SVE support code (the register files overlap), but the others are worth having as well. Summary: - Fix FP register corruption when SVE is not available or in use - Fix out-of-tree module build failure when CONFIG_ARM64_MODULE_PLTS=y - Missing 'const' generating errors with LTO builds - Remove unsupported events from Cortex-A73 PMU description - Removal of stale and incorrect comments" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: context: Fix comments and remove pointless smp_wmb() arm64: cpu_ops: Add missing 'const' qualifiers arm64: perf: remove unsupported events for Cortex-A73 arm64: fpsimd: Fix failure to restore FPSIMD state after signals arm64: pgd: Mark pgd_cache as __ro_after_init arm64: ftrace: emit ftrace-mod.o contents through code arm64: module-plts: factor out PLT generation code for ftrace arm64: mm: cleanup stale AIVIVT references	2017-12-01 19:37:03 -05:00
Palmer Dabbelt	3b62de26cf	RISC-V: Fixes for clean allmodconfig build Olaf said: Here's a short series of patches that produces a working allmodconfig. Would be nice to see them go in so we can add build coverage. I've dropped patches 8 and 10 from the original set: * [PATCH 08/10] (RISC-V: Set __ARCH_WANT_RENAMEAT to pick up generic version) has a better fix that I've sent out for review, we don't want renameat. * [PATCH 10/10] (input: joystick: riscv has get_cycles) has already been taken into Dmitry Torokhov's tree.	2017-12-01 13:31:31 -08:00
Palmer Dabbelt	185e788c84	move libgcc.h to include/linux	2017-12-01 13:16:15 -08:00

722219 Commits