linux

Author	SHA1	Message	Date
Vasily Averin	3a2b19d1ee	lockd: lost rollback of set_grace_period() in lockd_down_net() Commit `efda760fe9` ("lockd: fix lockd shutdown race") is incorrect: it removes lockd_manager and disarms grace_period_end for init_net only. If nfsd was started from another net namespace, lockd_up_net() calls set_grace_period(), which adds lockd_manager to the per-netns list and queues the grace_period_end delayed work. These actions should be reverted in lockd_down_net(). Otherwise it can lead to a double list_add after restarting nfsd in the netns, and to a use-after-free if the non-disarmed delayed work is executed after the netns is destroyed. Fixes: `efda760fe9` ("lockd: fix lockd shutdown race") Cc: stable@vger.kernel.org Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:11 -05:00
Vasily Averin	a3152f1440	lockd: added cleanup checks in exit_net hook Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Vasily Averin	b872285751	grace: replace BUG_ON by WARN_ONCE in exit_net hook Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Andrew Elble	4f34bd0540	nfsd: fix locking validator warning on nfs4_ol_stateid->st_mutex class The use of the st_mutex has been confusing the validator. Use the proper nested notation so as to not produce warnings. Signed-off-by: Andrew Elble <aweits@rit.edu> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Vasily Averin	e919b07652	lockd: remove net pointer from messages Publishing of net pointer is not safe, use net->ns.inum as net ID in debug messages [ 171.757678] lockd_up_net: per-net data created; net=f00001e7 [ 171.767188] NFSD: starting 90-second grace period (net f00001e7) [ 300.653313] lockd: nuking all hosts in net f00001e7... [ 300.653641] lockd: host garbage collection for net f00001e7 [ 300.653968] lockd: nlmsvc_mark_resources for net f00001e7 [ 300.711483] lockd_down_net: per-net data destroyed; net=f00001e7 [ 300.711847] lockd: nuking all hosts in net 0... [ 300.711847] lockd: host garbage collection for net 0 [ 300.711848] lockd: nlmsvc_mark_resources for net 0 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Vasily Averin	ba589528d6	nfsd: remove net pointer from debug messages Publishing of net pointer is not safe, replace it in debug messages by net->ns.inum [ 119.989161] nfsd: initializing export module (net: f00001e7). [ 171.767188] NFSD: starting 90-second grace period (net f00001e7) [ 322.185240] nfsd: shutting down export module (net: f00001e7). [ 322.186062] nfsd: export shutdown complete (net: f00001e7). Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Trond Myklebust	03da3169c6	nfsd: Fix races with check_stateid_generation() The various functions that call check_stateid_generation() in order to compare a client-supplied stateid with the nfs4_stid state, usually need to atomically check for closed state. Those that perform the check after locking the st_mutex using nfsd4_lock_ol_stateid() should now be OK, but we do want to fix up the others. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Trond Myklebust	9271d7e509	nfsd: Ensure we check stateid validity in the seqid operation checks After taking the stateid st_mutex, we want to know that the stateid still represents valid state before performing any non-idempotent actions. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Trond Myklebust	beeca19cf1	nfsd: Fix race in lock stateid creation If we're looking up a new lock state, and the creation fails, then we want to unhash it, just like we do for OPEN. However in order to do so, we need to ensure that no other LOCK requests can grab the mutex until we have unhashed it (and marked it as closed). Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Trond Myklebust	fd1fd685b3	nfsd4: move find_lock_stateid Trivial cleanup to simplify following patch. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Trond Myklebust	659aefb68e	nfsd: Ensure we don't recognise lock stateids after freeing them In order to deal with lookup races, nfsd4_free_lock_stateid() needs to be able to signal to other stateful functions that the lock stateid is no longer valid. Right now, nfsd_lock() will check whether or not an existing stateid is still hashed, but only in the "new lock" path. To ensure the stateid invalidation is also recognised by the "existing lock" path, and also by a second call to nfsd4_free_lock_stateid() itself, we can change the type to NFS4_CLOSED_STID under the stp->st_mutex. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Trond Myklebust	fb500a7cfe	nfsd: CLOSE SHOULD return the invalid special stateid for NFSv4.x (x>0) Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Trond Myklebust	d8a1a00055	nfsd: Fix another OPEN stateid race If nfsd4_process_open2() is initialising a new stateid, and yet the call to nfs4_get_vfs_file() fails for some reason, then we must declare the stateid closed, and unhash it before dropping the mutex. Right now, we unhash the stateid after dropping the mutex, and without changing the stateid type, meaning that another OPEN could theoretically look it up and attempt to use it. Reported-by: Andrew W Elble <aweits@rit.edu> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Trond Myklebust	15ca08d329	nfsd: Fix stateid races between OPEN and CLOSE Open file stateids can linger on the nfs4_file list of stateids even after they have been closed. In order to avoid reusing such a stateid, and confusing the client, we need to recheck the nfs4_stid's type after taking the mutex. Otherwise, we risk reusing an old stateid that was already closed, which will confuse clients that expect new stateids to conform to RFC7530 Sections 9.1.4.2 and 16.2.5 or RFC5661 Sections 8.2.2 and 18.2.4. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-11-27 16:45:10 -05:00
Jakub Kicinski	a39e17b2d8	bpf: offload: add a license header I forgot to add a license on kernel/bpf/offload.c. Luckily I'm still the only author so make it explicitly GPLv2. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-11-27 22:24:51 +01:00
Linus Torvalds	1751e8a6cb	Rename superblock flags (MS_xyz -> SB_xyz) This is a pure automated search-and-replace of the internal kernel superblock flags. The s_flags are now called SB_, with the names and the values for the moment mirroring the MS_ flags that they're equivalent to. Note how the MS_xyz flags are the ones passed to the mount system call, while the SB_xyz flags are what we then use in sb->s_flags. The script to do this was: # places to look in; re security/*: it generally should *not* be # touched (that stuff parses mount(2) arguments directly), but # there are two places where we really deal with superblock flags. FILES="drivers/mtd drivers/staging/lustre fs ipc mm \ include/linux/fs.h include/uapi/linux/bfs_fs.h \ security/apparmor/apparmorfs.c security/apparmor/include/lib.h" # the list of MS_... constants SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \ DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \ POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \ I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \ ACTIVE NOUSER" SED_PROG= for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done # we want files that contain at least one of MS_..., # with fs/namespace.c and fs/pnode.c excluded. L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c') for f in $L; do sed -i $f $SED_PROG; done Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-27 13:05:09 -08:00
Nicolas Pitre	abee210500	percpu: hack to let the CRIS architecture to boot until they clean up Commit `438a506180` ("percpu: don't forget to free the temporary struct pcpu_alloc_info") uncovered a problem on the CRIS architecture where the bootmem allocator is initialized with virtual addresses. Given it has: #define __va(x) ((void *)((unsigned long)(x) | 0x80000000)) then things just work out because the end result is the same whether you give this a physical or a virtual address. Until you call memblock_free_early(__pa(address)) that is, because values from __pa() don't match with the virtual addresses stuffed in the bootmem allocator anymore. Avoid freeing the temporary pcpu_alloc_info memory on that architecture until they fix things up to let the kernel boot like it did before. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: `438a506180` ("percpu: don't forget to free the temporary struct pcpu_alloc_info")	2017-11-27 12:53:12 -08:00
Thomas Meyer	141cbfba1d	auxdisplay: img-ascii-lcd: Only build on archs that have IOMEM This avoids the MODPOST error: ERROR: "devm_ioremap_resource" [drivers/auxdisplay/img-ascii-lcd.ko] undefined! Signed-off-by: Thomas Meyer <thomas@m3y3r.de> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-27 12:36:45 -08:00
Kirill A. Shutemov	152e93af3c	mm, thp: Do not make pmd/pud dirty without a reason Currently we make page table entries dirty all the time regardless of access type and don't even consider if the mapping is write-protected. The reasoning is that we don't really need dirty tracking on THP and making the entry dirty upfront may save some time on first write to the page. Unfortunately, such an approach may result in false-positive can_follow_write_pmd() for huge zero page or read-only shmem file. Let's make the page dirty only if we are about to write to the page anyway (as we do for small pages). I've restructured the code to make entry dirty inside maybe_p[mu]d_mkwrite(). It also takes into account if the vma is write-protected. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-27 12:26:29 -08:00
Kirill A. Shutemov	a8f9736645	mm, thp: Do not make page table dirty unconditionally in touch_p[mu]d() Currently, we unconditionally make page table dirty in touch_pmd(). It may result in false-positive can_follow_write_pmd(). We may avoid the situation, if we would only make the page table entry dirty if caller asks for write access -- FOLL_WRITE. The patch also changes touch_pud() in the same way. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-27 12:26:29 -08:00
Icenowy Zheng	04226916d2	media: usbtv: add a new usbid A new usbid of UTV007 is found in a newly bought device. The usbid is 1f71:3301. The ID on the chip is: UTV007 A89029.1 1520L18K1 Both video and audio is tested with the modified usbtv driver. Signed-off-by: Icenowy Zheng <icenowy@aosc.io> Acked-by: Lubomir Rintel <lkundrak@v3.sk> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>	2017-11-27 14:49:18 -05:00
Arvind Yadav	20f9ceed72	pata_pdc2027x : make pdc2027x__timing structures const Make these pdc2027x__timing structures const as it is never modified. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-11-27 11:46:26 -08:00
Arvind Yadav	c1da86c19a	pata_pdc2027x: Remove unnecessary error check The function pdc_hardware_init always returns zero, so it is not necessary to check its return value. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-11-27 11:46:26 -08:00
Jon Maloy	2e724dca77	tipc: eliminate access after delete in group_filter_msg() KASAN revealed another access after delete in group.c. This time it found that we read the header of a received message after the buffer has been released. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-27 14:44:45 -05:00
Wang Long	ddf7005f32	debug cgroup: use task_css_set instead of rcu_dereference This macro `task_css_set` verifies that the caller is inside proper critical section if the kernel set CONFIG_PROVE_RCU=y. Signed-off-by: Wang Long <wanglong19@meituan.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-11-27 11:37:33 -08:00
Florian Fainelli	babd8a3e31	Merge tag 'bcm2835-dt-next-fixes-2017-11-15' into devicetree/fixes This pull request brings in a fix for a warning that started occurring when dtc from -next got merged. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>	2017-11-27 11:28:23 -08:00
Albert Pool	16a27dfd21	ata: mediatek: Fix typo in module description Signed-off-by: Albert Pool <albertpool@solcon.nl> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-11-27 11:26:59 -08:00
Florian Fainelli	5f1aa51c7a	ARM: dts: NSP: Fix PPI interrupt types Booting a kernel results in the kernel warning us about the following PPI interrupts configuration: [ 0.105127] smp: Bringing up secondary CPUs ... [ 0.110545] GIC: PPI11 is secure or misconfigured [ 0.110551] GIC: PPI13 is secure or misconfigured Fix this by using the appropriate edge configuration for PPI11 and PPI13, this is similar to what was fixed for Northstar (BCM5301X) in commit `0e34079cd1` ("ARM: dts: BCM5301X: Correct GIC_PPI interrupt flags"). Fixes: `7b2e987de2` ("ARM: NSP: add minimal Northstar Plus device tree") Fixes: `1a9d53caba` ("ARM: dts: NSP: Add TWD Support to DT") Acked-by: Jon Mason <jon.mason@broadcom.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>	2017-11-27 11:22:29 -08:00
Florian Fainelli	77416ab35f	ARM: dts: NSP: Disable AHCI controller for HR NSP boards The AHCI controller is currently enabled for all of these boards: bcm958623hr and bcm958625hr would result in a hard hang on boot that we cannot get rid of. Since this does not appear to have an easy and simple fix, just disable the AHCI controller for now until this gets resolved. Fixes: `70725d6e97` ("ARM: dts: NSP: Enable SATA on bcm958625hr") Fixes: `d454c37624` ("ARM: dts: NSP: Add new DT file for bcm958623hr") Acked-by: Jon Mason <jon.mason@broadcom.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>	2017-11-27 11:22:29 -08:00
Eduardo Otubo	5b5971df3b	xen-netfront: remove warning when unloading module v2: * Replace busy wait with wait_event()/wake_up_all() * Cannot guarantee that at the time xennet_remove is called, the xen_netback state will not be XenbusStateClosed, so added a condition for that * There's a small chance that the xen_netback state is XenbusStateUnknown by the time the xen_netfront switches to Closed, so added a condition for that. When unloading module xen_netfront from guest, dmesg would output warning messages like below: [ 105.236836] xen:grant_table: WARNING: g.e. 0x903 still in use! [ 105.236839] deferring g.e. 0x903 (pfn 0x35805) This problem relies on netfront and netback being out of sync. By the time netfront revokes the g.e.'s, netback didn't have enough time to free all of them, hence displaying the warnings on dmesg. The trick here is to make netfront wait until netback frees all the g.e.'s and only then continue to cleanup for the module removal, and this is done by manipulating both device states. Signed-off-by: Eduardo Otubo <otubo@redhat.com> Acked-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-27 14:21:58 -05:00
Jens Axboe	2967acbb25	blktrace: fix trace mutex deadlock A previous commit changed the locking around registration/cleanup, but direct callers of blk_trace_remove() were missed. This means that if we hit the error path in setup, we will deadlock on attempting to re-acquire the queue trace mutex. Fixes: `1f2cac107c` ("blktrace: fix unlocked access to init/start-stop/teardown") Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-11-27 12:03:58 -07:00
Colin Ian King	66a7c84d67	i2c: i2c-boardinfo: fix memory leaks on devinfo Currently when an error occurs devinfo is still allocated but is unused when the error exit paths break out of the for-loop. Fix this by kfree'ing devinfo to avoid the leak. Detected by CoverityScan, CID#1416590 ("Resource Leak") Fixes: `4124c4eba4` ("i2c: allow attaching IRQ resources to i2c_board_info") Fixes: `0daaf99d84` ("i2c: copy device properties when using i2c_register_board_info()") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2017-11-27 19:14:29 +01:00
Hans de Goede	6e0c9507bf	i2c: i801: Fix Failed to allocate irq -2147483648 error On Apollo Lake devices the BIOS does not set up IRQ routing for the i801 SMBUS controller IRQ, so we end up with dev->irq set to IRQ_NOTCONNECTED. Detect this and do not try to use the irq in this case silencing: i801_smbus 0000:00:1f.1: Failed to allocate irq -2147483648: -107 Cc: stable@vger.kernel.org BugLink: https://communities.intel.com/thread/114759 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2017-11-27 19:11:27 +01:00
Darrick J. Wong	509955823c	xfs: log recovery should replay deferred ops in order As part of testing log recovery with dm_log_writes, Amir Goldstein discovered an error in the deferred ops recovery that led to corruption of the filesystem metadata if a reflink+rmap filesystem happened to shut down midway through a CoW remap: "This is what happens [after failed log recovery]: "Phase 1 - find and verify superblock... "Phase 2 - using internal log " - zero log... " - scan filesystem freespace and inode maps... " - found root inode chunk "Phase 3 - for each AG... " - scan (but don't clear) agi unlinked lists... " - process known inodes and perform inode discovery... " - agno = 0 "data fork in regular inode 134 claims CoW block 376 "correcting nextents for inode 134 "bad data fork in inode 134 "would have cleared inode 134" Hou Tao dissected the log contents of exactly such a crash: "According to the implementation of xfs_defer_finish(), these ops should be completed in the following sequence: "Have been done: "(1) CUI: Oper (160) "(2) BUI: Oper (161) "(3) CUD: Oper (194), for CUI Oper (160) "(4) RUI A: Oper (197), free rmap [0x155, 2, -9] "Should be done: "(5) BUD: for BUI Oper (161) "(6) RUI B: add rmap [0x155, 2, 137] "(7) RUD: for RUI A "(8) RUD: for RUI B "Actually be done by xlog_recover_process_intents() "(5) BUD: for BUI Oper (161) "(6) RUI B: add rmap [0x155, 2, 137] "(7) RUD: for RUI B "(8) RUD: for RUI A "So the rmap entry [0x155, 2, -9] for COW should be freed firstly, then a new rmap entry [0x155, 2, 137] will be added. However, as we can see from the log record in post_mount.log (generated after umount) and the trace print, the new rmap entry [0x155, 2, 137] are added firstly, then the rmap entry [0x155, 2, -9] are freed." When reconstructing the internal log state from the log items found on disk, it's required that deferred ops replay in exactly the same order that they would have had the filesystem not gone down. However, replaying unfinished deferred ops can create /more/ deferred ops. These new deferred ops are finished in the wrong order. This causes fs corruption and replay crashes, so let's create a single defer_ops to handle the subsequent ops created during replay, then use one single transaction at the end of log recovery to ensure that everything is replayed in the same order as they're supposed to be. Reported-by: Amir Goldstein <amir73il@gmail.com> Analyzed-by: Hou Tao <houtao1@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2017-11-27 09:34:08 -08:00
Darrick J. Wong	98c4f78dcd	xfs: always free inline data before resetting inode fork during ifree In xfs_ifree, we reset the data/attr forks to extents format without bothering to free any inline data buffer that might still be around after all the blocks have been truncated off the file. Prior to commit `43518812d2` ("xfs: remove support for inlining data/extents into the inode fork") nobody noticed because the leftover inline data after truncation was small enough to fit inside the inline buffer inside the fork itself. However, now that we've removed the inline buffer, we /always/ have to free the inline data buffer or else we leak them like crazy. This test was found by turning on kmemleak for generic/001 or generic/388. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2017-11-27 09:33:25 -08:00
Adam Thomson	b7926c464d	ASoC: da7218: Correct IRQ level in DT binding example Current DT binding documentation shows an example where the IRQ for the device is chosen to be ACTIVE_HIGH. This is incorrect as the device only supports ACTIVE_LOW, so this commit fixes that discrepancy. Signed-off-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org>	2017-11-27 17:11:10 +00:00
Adam Thomson	d3b0535216	ASoC: da7219: Correct IRQ level in DT binding example Current DT binding documentation shows an example where the IRQ for the device is chosen to be ACTIVE_HIGH. This is incorrect as the device only supports ACTIVE_LOW, so this commit fixes that discrepancy. Signed-off-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org>	2017-11-27 17:10:48 +00:00
Tal Shorer	c98a980509	workqueue: respect isolated cpus when queueing an unbound work Initialize wq_unbound_cpumask to exclude cpus that were isolated by the cmdline's isolcpus parameter. Signed-off-by: Tal Shorer <tal.shorer@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-11-27 08:57:00 -08:00
Tal Shorer	7d229c668a	main: kernel_start: move housekeeping_init() before workqueue_init_early() This is needed in order to allow the unbound workqueue to take housekeeping cpus into account. Signed-off-by: Tal Shorer <tal.shorer@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-11-27 08:56:57 -08:00
Paolo Bonzini	a63dd7480d	Merge tag 'kvm-ppc-fixes-4.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master PPC KVM fixes for 4.15 One commit here, that fixes a couple of bugs relating to the patch series that enables HPT guests to run on a radix host on POWER9 systems. This patch series went upstream in the 4.15 merge window, so no stable backport is required.	2017-11-27 17:54:13 +01:00
Jan H. Schönherr	20b7035c66	KVM: Let KVM_SET_SIGNAL_MASK work as advertised KVM API says for the signal mask you set via KVM_SET_SIGNAL_MASK, that "any unblocked signal received [...] will cause KVM_RUN to return with -EINTR" and that "the signal will only be delivered if not blocked by the original signal mask". This, however, is only true, when the calling task has a signal handler registered for a signal. If not, signal evaluation is short-circuited for SIG_IGN and SIG_DFL, and the signal is either ignored without KVM_RUN returning or the whole process is terminated. Make KVM_SET_SIGNAL_MASK behave as advertised by utilizing logic similar to that in do_sigtimedwait() to avoid short-circuiting of signals. Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-11-27 17:53:47 +01:00
Prateek Sood	1599a185f0	cpuset: Make cpuset hotplug synchronous Convert cpuset_hotplug_workfn() into synchronous call for cpu hotplug path. For memory hotplug path it still gets queued as a work item. Since cpuset_hotplug_workfn() can be made synchronous for cpu hotplug path, it is not required to wait for cpuset hotplug while thawing processes. Signed-off-by: Prateek Sood <prsood@codeaurora.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-11-27 08:48:10 -08:00
Prateek Sood	aa24163b2e	cgroup/cpuset: remove circular dependency deadlock Remove circular dependency deadlock in a scenario where hotplug of CPU is being done while there is updation in cgroup and cpuset triggered from userspace. Process A => kthreadd => Process B => Process C => Process A Process A cpu_subsys_offline(); cpu_down(); _cpu_down(); percpu_down_write(&cpu_hotplug_lock); //held cpuhp_invoke_callback(); workqueue_offline_cpu(); queue_work_on(); // unbind_work on system_highpri_wq __queue_work(); insert_work(); wake_up_worker(); flush_work(); wait_for_completion(); worker_thread(); manage_workers(); create_worker(); kthread_create_on_node(); wake_up_process(kthreadd_task); kthreadd kthreadd(); kernel_thread(); do_fork(); copy_process(); percpu_down_read(&cgroup_threadgroup_rwsem); __rwsem_down_read_failed_common(); //waiting Process B kernfs_fop_write(); cgroup_file_write(); cgroup_procs_write(); percpu_down_write(&cgroup_threadgroup_rwsem); //held cgroup_attach_task(); cgroup_migrate(); cgroup_migrate_execute(); cpuset_can_attach(); mutex_lock(&cpuset_mutex); //waiting Process C kernfs_fop_write(); cgroup_file_write(); cpuset_write_resmask(); mutex_lock(&cpuset_mutex); //held update_cpumask(); update_cpumasks_hier(); rebuild_sched_domains_locked(); get_online_cpus(); percpu_down_read(&cpu_hotplug_lock); //waiting Eliminating deadlock by reversing the locking order for cpuset_mutex and cpu_hotplug_lock. Signed-off-by: Prateek Sood <prsood@codeaurora.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-11-27 08:48:10 -08:00
oder_chiou@realtek.com	346cccf883	ASoC: rt5514: Add the sanity check for the driver_data in the resume function If the rt5514 spi driver is loaded, but the snd_soc_platform_driver is not loaded by the correct DAI settings, the NULL pointer will be gotten by snd_soc_platform_get_drvdata in the resume function. Signed-off-by: Oder Chiou <oder_chiou@realtek.com> Signed-off-by: Mark Brown <broonie@kernel.org>	2017-11-27 16:44:57 +00:00
Maciej S. Szmigiero	b880b8056b	ASoC: fsl_ssi: serialize AC'97 register access operations AC'97 register access operations (both read and write) on SSI use a one, shared set of SSI registers for AC'97 register address and data. This means that only one such access is possible at a time and so all these operations need to be serialized. Since an AC'97 register access operation in this driver takes 100us+ let's use a mutex for this. Use this opportunity to also change a default value returned from AC'97 register read function from -1 to 0, since that's what AC'97 specs require to be returned when unknown / undefined registers are read. Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Signed-off-by: Mark Brown <broonie@kernel.org>	2017-11-27 16:43:43 +00:00
Maciej S. Szmigiero	695b78b548	ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure AC'97 ops (register read / write) need SSI regmap and clock, so they have to be set after them. We also need to set these ops back to NULL if we fail the probe. Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Acked-by: Nicolin Chen <nicoleotsuka@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org	2017-11-27 16:41:55 +00:00
Liu Bo	ebb70442cd	Btrfs: fix list_add corruption and soft lockups in fsync Xfstests btrfs/146 revealed this corruption, [ 58.138831] Buffer I/O error on dev dm-0, logical block 2621424, async page read [ 58.151233] BTRFS error (device sdf): bdev /dev/mapper/error-test errs: wr 1, rd 0, flush 0, corrupt 0, gen 0 [ 58.152403] list_add corruption. prev->next should be next (ffff88005e6775d8), but was ffffc9000189be88. (prev=ffffc9000189be88). [ 58.153518] ------------[ cut here ]------------ [ 58.153892] WARNING: CPU: 1 PID: 1287 at lib/list_debug.c:31 __list_add_valid+0x169/0x1f0 ... [ 58.157379] RIP: 0010:__list_add_valid+0x169/0x1f0 ... [ 58.161956] Call Trace: [ 58.162264] btrfs_log_inode_parent+0x5bd/0xfb0 [btrfs] [ 58.163583] btrfs_log_dentry_safe+0x60/0x80 [btrfs] [ 58.164003] btrfs_sync_file+0x4c2/0x6f0 [btrfs] [ 58.164393] vfs_fsync_range+0x5f/0xd0 [ 58.164898] do_fsync+0x5a/0x90 [ 58.165170] SyS_fsync+0x10/0x20 [ 58.165395] entry_SYSCALL_64_fastpath+0x1f/0xbe ... It turns out that we could record btrfs_log_ctx:io_err in log_one_extents when IO fails, but make log_one_extents() return '0' instead of -EIO, so the IO error is not acknowledged by the callers, i.e. btrfs_log_inode_parent(), which would remove btrfs_log_ctx:list from list head 'root->log_ctxs'. Since btrfs_log_ctx is allocated from stack memory, it'd get freed with an object still alive on the list. A future list_add will then throw the above warning. This returns the correct error in the above case. Jeff also reported this while testing against his fsync error patch set[1]. [1]: https://www.spinics.net/lists/linux-btrfs/msg65308.html "btrfs list corruption and soft lockups while testing writeback error handling" Fixes: `8407f55326` ("Btrfs: fix data corruption after fast fsync and writeback error") Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>	2017-11-27 17:41:19 +01:00
Wanpeng Li	b74558259c	KVM: VMX: Fix vmx->nested freeing when no SMI handler Reported by syzkaller: ------------[ cut here ]------------ WARNING: CPU: 5 PID: 2939 at arch/x86/kvm/vmx.c:3844 free_loaded_vmcs+0x77/0x80 [kvm_intel] CPU: 5 PID: 2939 Comm: repro Not tainted 4.14.0+ #26 RIP: 0010:free_loaded_vmcs+0x77/0x80 [kvm_intel] Call Trace: vmx_free_vcpu+0xda/0x130 [kvm_intel] kvm_arch_destroy_vm+0x192/0x290 [kvm] kvm_put_kvm+0x262/0x560 [kvm] kvm_vm_release+0x2c/0x30 [kvm] __fput+0x190/0x370 task_work_run+0xa1/0xd0 do_exit+0x4d2/0x13e0 do_group_exit+0x89/0x140 get_signal+0x318/0xb80 do_signal+0x8c/0xb40 exit_to_usermode_loop+0xe4/0x140 syscall_return_slowpath+0x206/0x230 entry_SYSCALL_64_fastpath+0x98/0x9a The syzkaller testcase executes VMXON/VMLAUNCH instructions, so the vmx->nested stuff is populated; it also issues the KVM_SMI ioctl. However, the testcase is just a simple C program and is not launched by something like seabios, which implements smi_handler. Commit `05cade71cf` (KVM: nSVM: fix SMI injection in guest mode) gets out of guest mode and sets nested.vmxon to false for the duration of SMM according to SDM 34.14.1 "leave VMX operation" upon entering SMM. We can't alloc/free the vmx->nested stuff each time when entering/exiting SMM since it will induce more overhead. So the function vmx_pre_enter_smm() marks nested.vmxon false even if the vmx->nested stuff is still populated. The expectation is that em_rsm() marks nested.vmxon true again. However, the smi_handler/rsm will not execute since there is nothing like seabios in this scenario. The function free_nested() fails to free the vmx->nested stuff since vmx->nested.vmxon is false, which results in the above warning. This patch fixes it by also considering the no SMI handler case. Luckily vmx->nested.smm.vmxon is marked according to the value of vmx->nested.vmxon in vmx_pre_enter_smm(), so we can take advantage of it and free the vmx->nested stuff when L1 goes down.
Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Liran Alon <liran.alon@oracle.com> Fixes: `05cade71cf` (KVM: nSVM: fix SMI injection in guest mode) Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-11-27 17:37:55 +01:00
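The freeing condition described in the commit above can be illustrated with a minimal userspace sketch. Everything here is a hypothetical stand-in (`struct nested_state`, `enter_smm()`, `release_nested()`), not the kernel's `vmx.c`; it only models the idea that the release path should honour the flag saved on SMM entry as well as the live one.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the vmx->nested bookkeeping; not kernel code. */
struct nested_state {
	bool vmxon;      /* live flag, cleared for the duration of SMM */
	bool smm_vmxon;  /* copy of vmxon saved when entering SMM */
	void *resources; /* whatever was allocated at VMXON time */
};

/* Models SMM entry: remember the old flag, then clear the live one. */
static void enter_smm(struct nested_state *n)
{
	n->smm_vmxon = n->vmxon;
	n->vmxon = false;
}

/*
 * Release nested resources. Checking only n->vmxon would skip the free
 * when the vCPU is torn down while still in SMM (no RSM ever ran), so the
 * saved flag is consulted too. Returns true if something was released.
 */
static bool release_nested(struct nested_state *n)
{
	if (!n->vmxon && !n->smm_vmxon)
		return false;

	free(n->resources);
	n->resources = NULL;
	n->vmxon = false;
	n->smm_vmxon = false;
	return true;
}
```

In this sketch a state that entered SMM and never returned is still released, because `smm_vmxon` records that resources were populated.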
Wanpeng Li	c37c28730b	KVM: VMX: Fix rflags cache during vCPU reset Reported by syzkaller: * Guest State * CR0: actual=0x0000000080010031, shadow=0x0000000060000010, gh_mask=fffffffffffffff7 CR4: actual=0x0000000000002061, shadow=0x0000000000000000, gh_mask=ffffffffffffe8f1 CR3 = 0x000000002081e000 RSP = 0x000000000000fffa RIP = 0x0000000000000000 RFLAGS=0x00023000 DR7 = 0x00000000000000 ^^^^^^^^^^ ------------[ cut here ]------------ WARNING: CPU: 6 PID: 24431 at /home/kernel/linux/arch/x86/kvm//x86.c:7302 kvm_arch_vcpu_ioctl_run+0x651/0x2ea0 [kvm] CPU: 6 PID: 24431 Comm: reprotest Tainted: G W OE 4.14.0+ #26 RIP: 0010:kvm_arch_vcpu_ioctl_run+0x651/0x2ea0 [kvm] RSP: 0018:ffff880291d179e0 EFLAGS: 00010202 Call Trace: kvm_vcpu_ioctl+0x479/0x880 [kvm] do_vfs_ioctl+0x142/0x9a0 SyS_ioctl+0x74/0x80 entry_SYSCALL_64_fastpath+0x23/0x9a The failed vmentry is triggered by the following beautified testcase: #include <unistd.h> #include <sys/syscall.h> #include <string.h> #include <stdint.h> #include <linux/kvm.h> #include <fcntl.h> #include <sys/ioctl.h> long r[5]; int main() { struct kvm_debugregs dr = { 0 }; r[2] = open("/dev/kvm", O_RDONLY); r[3] = ioctl(r[2], KVM_CREATE_VM, 0); r[4] = ioctl(r[3], KVM_CREATE_VCPU, 7); struct kvm_guest_debug debug = { .control = 0xf0403, .arch = { .debugreg[6] = 0x2, .debugreg[7] = 0x2 } }; ioctl(r[4], KVM_SET_GUEST_DEBUG, &debug); ioctl(r[4], KVM_RUN, 0); } which testcase tries to setup the processor specific debug registers and configure vCPU for handling guest debug events through KVM_SET_GUEST_DEBUG. The KVM_SET_GUEST_DEBUG ioctl will get and set rflags in order to set TF bit if single step is needed. All regs' caches are reset to avail and GUEST_RFLAGS vmcs field is reset to 0x2 during vCPU reset. However, the cache of rflags is not reset during vCPU reset. The function vmx_get_rflags() returns an unreset rflags cache value since the cache is marked avail, it is 0 after boot. Vmentry fails if the rflags reserved bit 1 is 0. 
This patch fixes it by resetting both the GUEST_RFLAGS vmcs field and its cache to 0x2 during vCPU reset. Reported-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-11-27 17:37:46 +01:00
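The stale-cache problem described in the commit above can be modelled in a few lines of userspace C. This is a sketch under stated assumptions: `struct vcpu_sketch`, `get_rflags()`, `set_rflags()`, `reset_buggy()` and `reset_fixed()` are hypothetical names, and the single `vmcs_rflags` field merely stands in for the GUEST_RFLAGS vmcs field.

```c
#include <stdbool.h>
#include <stdint.h>

#define RFLAGS_RESERVED_BIT1 0x2ULL /* reserved bit 1 must be 1 for vmentry */

/* Hypothetical model of a register backed by a cache plus an "avail" bit. */
struct vcpu_sketch {
	uint64_t vmcs_rflags;  /* stands in for the GUEST_RFLAGS vmcs field */
	uint64_t rflags_cache; /* cached copy */
	bool rflags_avail;     /* true: the cache is trusted, field not re-read */
};

static uint64_t get_rflags(struct vcpu_sketch *v)
{
	if (!v->rflags_avail) {
		v->rflags_cache = v->vmcs_rflags;
		v->rflags_avail = true;
	}
	return v->rflags_cache;
}

static void set_rflags(struct vcpu_sketch *v, uint64_t val)
{
	v->rflags_cache = val;
	v->rflags_avail = true;
	v->vmcs_rflags = val;
}

/* Buggy reset: the field is reset and the cache marked avail, but the
 * cached value itself is left untouched (0 after boot). */
static void reset_buggy(struct vcpu_sketch *v)
{
	v->rflags_avail = true;
	v->vmcs_rflags = RFLAGS_RESERVED_BIT1;
}

/* Fixed reset: field and cache are written together. */
static void reset_fixed(struct vcpu_sketch *v)
{
	set_rflags(v, RFLAGS_RESERVED_BIT1);
}
```

With the buggy reset, a get-modify-set sequence such as `set_rflags(v, get_rflags(v) | 0x100)` (setting TF for single-step) writes back a value with the reserved bit clear; with the fixed reset the bit survives.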
Wanpeng Li	e70b57a6ce	KVM: X86: Fix softlockup when get the current kvmclock watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [qemu-system-x86:10185] CPU: 6 PID: 10185 Comm: qemu-system-x86 Tainted: G OE 4.14.0-rc4+ #4 RIP: 0010:kvm_get_time_scale+0x4e/0xa0 [kvm] Call Trace: get_time_ref_counter+0x5a/0x80 [kvm] kvm_hv_process_stimers+0x120/0x5f0 [kvm] kvm_arch_vcpu_ioctl_run+0x4b4/0x1690 [kvm] kvm_vcpu_ioctl+0x33a/0x620 [kvm] do_vfs_ioctl+0xa1/0x5d0 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x1e/0xa9 This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and cpu-hotplug stress simultaneously. __this_cpu_read(cpu_tsc_khz) returns 0 (set in kvmclock_cpu_down_prep()) when the pCPU is unhotplugged, which causes kvm_get_time_scale() to enter an infinite loop. This patch fixes it by treating the unhotplugged pCPU as not using master clock. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-11-27 17:32:53 +01:00
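Why a zero frequency can hang a scale computation is easy to see in a simplified model. The function below is a hypothetical sketch, not the kernel's `kvm_get_time_scale()`: it normalises a base frequency against a target by shifting, and a base of 0 would never leave the doubling loop, so the sketch rejects it up front. It assumes `scaled_hz` is below 2^62 so the shifts cannot overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical shift normalisation: find the shift that brings base_hz
 * into the range (scaled_hz, 2 * scaled_hz]. Returns false instead of
 * looping forever when either frequency is 0.
 */
static bool scale_shift(uint64_t scaled_hz, uint64_t base_hz, int *shift_out)
{
	uint64_t tps = base_hz;
	int shift = 0;

	assert(shift_out != NULL);

	/* Doubling 0 makes no progress, so the second loop would never end. */
	if (base_hz == 0 || scaled_hz == 0)
		return false;

	while (tps > scaled_hz * 2) {
		tps >>= 1;
		shift--;
	}
	while (tps <= scaled_hz) {
		tps <<= 1;
		shift++;
	}

	*shift_out = shift;
	return true;
}
```

A caller that may observe a zero frequency (for example, a CPU going offline in this model) can treat a `false` return as "no usable clock" and fall back to another time source, which mirrors the spirit of the fix described above.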


724454 Commits