Commit Graph

4982 Commits

Author SHA1 Message Date
Saravana Kannan
d46f3e3ed5 driver core: Improve fw_devlink & deferred_probe_timeout interaction
deferred_probe_timeout kernel commandline parameter allows probing of
consumer devices if the supplier devices don't have any drivers.

fw_devlink=on will indefintely block probe() calls on a device if all
its suppliers haven't probed successfully. This completely skips calls
to driver_deferred_probe_check_state() since that's only called when a
.probe() function calls framework APIs. So fw_devlink=on breaks
deferred_probe_timeout.

deferred_probe_timeout in its current state also ignores a lot of
information that's now available to the kernel. It assumes all suppliers
that haven't probed when the timer expires (or when initcalls are done
on a static kernel) will never probe and fails any calls to acquire
resources from these unprobed suppliers.

However, this assumption by deferred_probe_timeout isn't true under many
conditions. For example:
- If the consumer happens to be before the supplier in the deferred
  probe list.
- If the supplier itself is waiting on its supplier to probe.

This patch fixes both these issues by relaxing device links between
devices only if the supplier doesn't have any driver that could match
with (NOT bound to) the supplier device. This way, we only fail attempts
to acquire resources from suppliers that truly don't have any driver vs
suppliers that just happen to not have probed yet.

Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210402040342.2944858-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-05 09:17:56 +02:00
Greg Kroah-Hartman
b20e829390 Merge 5.12-rc6 into driver-core-next
We need the driver core fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-05 08:51:37 +02:00
Linus Torvalds
f5664825fc Driver core fix for 5.12-rc6
Here is a single driver core fix for a reported problem with differed
 probing.  It has been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYGhGpg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymgSACeJYD1EfjFSCB0gEmdjU261rWOcT0AoLTRkJW1
 vywQNi5XNrbbmsWpxLsF
 =wG2r
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core fix from Greg KH:
 "Here is a single driver core fix for a reported problem with differed
  probing. It has been in linux-next for a while with no reported
  problems"

* tag 'driver-core-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  driver core: clear deferred probe reason on probe retry
2021-04-03 10:14:47 -07:00
Andy Shevchenko
ed7027fdf4 driver core: platform: Make platform_get_irq_optional() optional
Currently the platform_get_irq_optional() returns an error code even
if IRQ resource sumply has not been found. It prevents caller to be
error code agnostic in their error handling.

Now:
	ret = platform_get_irq_optional(...);
	if (ret != -ENXIO)
		return ret; // respect deferred probe
	if (ret > 0)
		...we get an IRQ...

After proposed change:
	ret = platform_get_irq_optional(...);
	if (ret < 0)
		return ret;
	if (ret > 0)
		...we get an IRQ...

Reported-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210331144526.19439-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-02 17:06:34 +02:00
Andy Shevchenko
318c3e00f1 driver core: Replace printf() specifier and drop unneeded casting
The size_t type has very well established specifier, i.e. "%zu",
use it directly instead of casting to unsigned long with "%lu".

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210401171042.60612-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-02 17:02:45 +02:00
Andy Shevchenko
d7aa44f5a1 driver core: Cast to (void *) with __force for __percpu pointer
Sparse is not happy:

  drivers/base/devres.c:1230:9: warning: cast removes address space '__percpu' of expression

Use __force attribute to make it happy.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210401171030.60527-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-02 17:02:39 +02:00
Andy Shevchenko
c99f4ebc68 driver core: platform: Make clear error code used for missed IRQ
We have few code paths where same error code is assigned and
returned for missed IRQ. Unify that under single error path.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210331145937.35980-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-02 17:02:22 +02:00
Pierre-Louis Bossart
cc71079023 devcoredump: fix kernel-doc warning
remove make W=1 warnings

drivers/base/devcoredump.c:208: warning:
Function parameter or member 'data' not described in
'devcd_free_sgtable'

drivers/base/devcoredump.c:208: warning:
Excess function parameter 'table' description in 'devcd_free_sgtable'

drivers/base/devcoredump.c:225: warning:
expecting prototype for devcd_read_from_table(). Prototype was for
devcd_read_from_sgtable() instead

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20210331232614.304591-8-pierre-louis.bossart@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-02 16:40:08 +02:00
Pierre-Louis Bossart
3c652132ce platform-msi: fix kernel-doc warnings
remove make W=1 warnings

drivers/base/platform-msi.c:336: warning:
Function parameter or member 'is_tree' not described in
'__platform_msi_create_device_domain'

drivers/base/platform-msi.c:336: warning:
expecting prototype for platform_msi_create_device_domain(). Prototype
was for __platform_msi_create_device_domain() instead

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20210331232614.304591-7-pierre-louis.bossart@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-02 16:40:08 +02:00
Pierre-Louis Bossart
f4651a7dd6 driver core: attribute_container: remove kernel-doc warnings
Remove make W=1 warnings

drivers/base/attribute_container.c:471: warning: Function parameter or
member 'cont' not described in
'attribute_container_add_class_device_adapter'

drivers/base/attribute_container.c:471: warning: Function parameter or
member 'dev' not described in
'attribute_container_add_class_device_adapter'

drivers/base/attribute_container.c:471: warning: Function parameter or
member 'classdev' not described in
'attribute_container_add_class_device_adapter'

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20210331232614.304591-3-pierre-louis.bossart@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-02 16:40:07 +02:00
Pierre-Louis Bossart
37c52f7403 driver core: remove kernel-doc warnings
remove make W=1 warning:

drivers/base/core.c:1670: warning: Function parameter or member
'flags' not described in 'fw_devlink_create_devlink'

drivers/base/core.c:1670: warning: Function parameter or member 'con'
not described in 'fw_devlink_create_devlink'

drivers/base/core.c:1670: warning: Function parameter or member
'sup_handle' not described in 'fw_devlink_create_devlink'

drivers/base/core.c:1670: warning: Function parameter or member
'flags' not described in 'fw_devlink_create_devlink'

drivers/base/core.c:1763: warning: Function parameter or member 'dev'
not described in '__fw_devlink_link_to_consumers'

drivers/base/core.c:1844: warning: Function parameter or member 'dev'
not described in '__fw_devlink_link_to_suppliers'

drivers/base/core.c:1844: warning: Function parameter or member
'fwnode' not described in '__fw_devlink_link_to_suppliers'

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20210331232614.304591-2-pierre-louis.bossart@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-02 16:40:07 +02:00
Adrian Hunter
9dfacc54a8 PM: runtime: Fix race getting/putting suppliers at probe
pm_runtime_put_suppliers() must not decrement rpm_active unless the
consumer is suspended. That is because, otherwise, it could suspend
suppliers for an active consumer.

That can happen as follows:

 static int driver_probe_device(struct device_driver *drv, struct device *dev)
 {
	int ret = 0;

	if (!device_is_registered(dev))
		return -ENODEV;

	dev->can_match = true;
	pr_debug("bus: '%s': %s: matched device %s with driver %s\n",
		 drv->bus->name, __func__, dev_name(dev), drv->name);

	pm_runtime_get_suppliers(dev);
	if (dev->parent)
		pm_runtime_get_sync(dev->parent);

 At this point, dev can runtime suspend so rpm_put_suppliers() can run,
 rpm_active becomes 1 (the lowest value).

	pm_runtime_barrier(dev);
	if (initcall_debug)
		ret = really_probe_debug(dev, drv);
	else
		ret = really_probe(dev, drv);

 Probe callback can have runtime resumed dev, and then runtime put
 so dev is awaiting autosuspend, but rpm_active is 2.

	pm_request_idle(dev);

	if (dev->parent)
		pm_runtime_put(dev->parent);

	pm_runtime_put_suppliers(dev);

 Now pm_runtime_put_suppliers() will put the supplier
 i.e. rpm_active 2 -> 1, but consumer can still be active.

	return ret;
 }

Fix by checking the runtime status. For any status other than
RPM_SUSPENDED, rpm_active can be considered to be "owned" by
rpm_[get/put]_suppliers() and pm_runtime_put_suppliers() need do nothing.

Reported-by: Asutosh Das <asutoshd@codeaurora.org>
Fixes: 4c06c4e6cf ("driver core: Fix possible supplier PM-usage counter imbalance")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-29 19:27:09 +02:00
Adrian Hunter
c0c33442f7 PM: runtime: Fix ordering in pm_runtime_get_suppliers()
rpm_active indicates how many times the supplier usage_count has been
incremented. Consequently it must be updated after pm_runtime_get_sync() of
the supplier, not before.

Fixes: 4c06c4e6cf ("driver core: Fix possible supplier PM-usage counter imbalance")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-29 19:27:09 +02:00
Jia-Ju Bai
d225ef6fda base: dd: fix error return code of driver_sysfs_add()
When device_create_file() fails and returns a non-zero value,
no error return code of driver_sysfs_add() is assigned.
To fix this bug, ret is assigned with the return value of
device_create_file(), and then ret is checked.

Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Link: https://lore.kernel.org/r/20210324023405.12465-1-baijiaju1990@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-28 14:55:39 +02:00
Yogesh Lal
e611f8cd87 driver core: Use unbound workqueue for deferred probes
Deferred probe usually runs only on pinned kworkers, which might take
longer time if a device contains multiple sub-devices. One such case
is of sound card on mobile devices, where we have good number of
mixers and controls per mixer.

We observed boot up improvement - deferred probes take ~600ms when bound
to little core kworker and ~200ms when deferred probe is queued on
unbound wq. This is due to scheduler moving the worker running deferred
probe work to big CPUs. Without this change, we see the worker is running
on LITTLE CPU due to affinity.

Since kworker runs deferred probe of several devices, the locality may
not be important. Also, init thread executing driver initcalls, can
potentially migrate as it has cpu affinity set to all cpus.In addition
to this, async probes use unbounded workqueue. So, using unbounded wq for
deferred probes looks to be similar to these w.r.t. scheduling behavior.

Signed-off-by: Yogesh Lal <ylal@codeaurora.org>
Link: https://lore.kernel.org/r/1616583698-6398-1-git-send-email-ylal@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-28 14:55:39 +02:00
Ahmad Fatoum
f0acf637d6 driver core: clear deferred probe reason on probe retry
When retrying a deferred probe, any old defer reason string should be
discarded. Otherwise, if the probe is deferred again at a different spot,
but without setting a message, the now incorrect probe reason will remain.

This was observed with the i.MX I2C driver, which ultimately failed
to probe due to lack of the GPIO driver. The probe defer for GPIO
doesn't record a message, but a previous probe defer to clock_get did.
This had the effect that /sys/kernel/debug/devices_deferred listed
a misleading probe deferral reason.

Cc: stable <stable@vger.kernel.org>
Fixes: d090b70ede ("driver core: add deferring probe reason to devices_deferred property")
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Andrzej Hajda <a.hajda@samsung.com>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Link: https://lore.kernel.org/r/20210319110459.19966-1-a.fatoum@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 15:13:43 +01:00
Arnd Bergmann
53f95c5534 devcoredump: avoid -Wempty-body warnings
Cleaning out the last -Wempty-body warnings found some interesting
cases with empty macros, along with harmless warnings like this one:

drivers/base/devcoredump.c: In function 'dev_coredumpm':
drivers/base/devcoredump.c:297:56: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
  297 |                 /* nothing - symlink will be missing */;
      |                                                        ^
drivers/base/devcoredump.c:301:56: error: suggest braces around empty body in an 'if' statement [-Werror=empty-body]
  301 |                 /* nothing - symlink will be missing */;
      |                                                        ^

Randy tried addressing this one before, and there were multiple
other ideas in that thread.

Add a runtime warning and code comment here.

Link: https://lore.kernel.org/lkml/20200418184111.13401-8-rdunlap@infradead.org/
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20210322114258.3420937-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 15:04:59 +01:00
Andy Shevchenko
7f2fac70b7 device property: Add test cases for fwnode_property_count_*() APIs
Add test cases for fwnode_property_count_*() APIs.

While at it, modify the arrays of integers to be size of non-power-of-2
for better test coverage and decreasing stack usage.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210212162539.86850-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 15:04:06 +01:00
Andy Shevchenko
0b8bf06f67 device property: Sync descriptions of swnode array and group APIs
After a few updates against swnode APIs the kernel documentation, i.e.
for swnode group registration and unregistration deviates from the one
for swnode array. In general, the same rules are applied to both.
Hence, synchronize descriptions of swnode array and group APIs

Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20210308103644.81960-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 15:04:01 +01:00
Saravana Kannan
ea718c6990 Revert "Revert "driver core: Set fw_devlink=on by default""
This reverts commit 3e4c982f1c.

Since all reported issues due to fw_devlink=on should be addressed by
this series, revert the revert. fw_devlink=on Take II.

Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210302211133.2244281-4-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 14:58:11 +01:00
Saravana Kannan
b6f617df4f driver core: Update device link status properly for device_bind_driver()
Device link status was not getting updated correctly when
device_bind_driver() is called on a device. This causes a warning[1].
Fix this by updating device links that can be updated and dropping
device links that can't be updated to a sensible state.

[1] - https://lore.kernel.org/lkml/56f7d032-ba5a-a8c7-23de-2969d98c527e@nvidia.com/

Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210302211133.2244281-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 14:58:11 +01:00
Saravana Kannan
f2db85b64f driver core: Avoid pointless deferred probe attempts
There's no point in adding a device to the deferred probe list if we
know for sure that it doesn't have a matching driver. So, check if a
device can match with a driver before adding it to the deferred probe
list.

Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210302211133.2244281-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 14:58:10 +01:00
Rasmus Villemoes
01085e24ff devtmpfs: actually reclaim some init memory
Currently gcc seems to inline devtmpfs_setup() into devtmpfsd(), so
its memory footprint isn't reclaimed as intended. Mark it noinline to
make sure it gets put in .init.text.

While here, setup_done can also be put in .init.data: After complete()
releases the internal spinlock, the completion object is never touched
again by that thread, and the waiting thread doesn't proceed until it
observes ->done while holding that spinlock.

This is now the same pattern as for kthreadd_done in init/main.c:
complete() is done in a __ref function, while the corresponding
wait_for_completion() is in an __init function.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: https://lore.kernel.org/r/20210312103027.2701413-2-linux@rasmusvillemoes.dk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 14:57:35 +01:00
Rasmus Villemoes
38f087de89 devtmpfs: fix placement of complete() call
Calling complete() from within the __init function is wrong -
theoretically, the init process could proceed all the way to freeing
the init mem before the devtmpfsd thread gets to execute the return
instruction in devtmpfs_setup().

In practice, it seems to be harmless as gcc inlines devtmpfs_setup()
into devtmpfsd(). So the calls of the __init functions init_chdir()
etc. actually happen from devtmpfs_setup(), but the __ref on that one
silences modpost (it's all right, because those calls happen before
the complete()). But it does make the __init annotation of the setup
function moot, which we'll fix in a subsequent patch.

Fixes: bcbacc4909 ("devtmpfs: refactor devtmpfsd()")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: https://lore.kernel.org/r/20210312103027.2701413-1-linux@rasmusvillemoes.dk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 14:57:35 +01:00
Colin Ian King
6b72cf1282 drivers/base/cpu: remove redundant assignment of variable retval
The variable retval is being initialized with a value that is never read
and it is being updated later with a new value.  Clean this up by
initializing retval to -ENOMEM and remove the assignment to retval
on the !dev failure path.

Kudos to Rafael for the improved fix suggestion.

Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210218202837.516231-1-colin.king@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 14:56:50 +01:00
Greg Kroah-Hartman
2942df6751 driver core: dd: remove deferred_devices variable
No need to save the debugfs dentry for the "devices_deferred" debugfs
file (gotta love the juxtaposition), if we need to remove it we can look
it up from debugfs itself.

Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20210216142400.3759099-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 10:49:04 +01:00
Greg Kroah-Hartman
c654cea59d driver core: component: remove dentry pointer in "struct master"
There is no need to keep around a pointer to a dentry when all it is
used for is to remove the debugfs file when tearing things down.  As the
name is simple, have debugfs look up the dentry when removing things,
keeping the logic much simpler.

Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20210216142400.3759099-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 10:49:02 +01:00
Dave Jiang
bbf44abeea driver core: auxiliary bus: Remove unneeded module bits
Remove module bits in the auxiliary bus code since the auxiliary bus
cannot be built as a module and the relevant code is not needed.

Cc: Dave Ertman <david.m.ertman@intel.com>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161307488980.1896017.15627190714413338196.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-23 10:47:55 +01:00
Rafael J. Wysocki
5244f5e2d8 PM: runtime: Defer suspending suppliers
Because the PM-runtime status of the device is not updated in
__rpm_callback(), attempts to suspend the suppliers of the given
device triggered by the rpm_put_suppliers() call in there may
cause a supplier to be suspended completely before the status of
the consumer is updated to RPM_SUSPENDED, which is confusing.

To avoid that (1) modify __rpm_callback() to only decrease the
PM-runtime usage counter of each supplier and (2) make rpm_suspend()
try to suspend the suppliers after changing the consumer's status to
RPM_SUSPENDED, in analogy with the device's parent.

Link: https://lore.kernel.org/linux-pm/CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com/
Fixes: 21d5c57b37 ("PM / runtime: Use device links")
Reported-by: elaine.zhang <zhangqing@rock-chips.com>
Diagnosed-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-22 15:21:38 +01:00
Rafael J. Wysocki
0cab893f40 Revert "PM: runtime: Update device status before letting suppliers suspend"
Revert commit 44cc89f764 ("PM: runtime: Update device status
before letting suppliers suspend") that introduced a race condition
into __rpm_callback() which allowed a concurrent rpm_resume() to
run and resume the device prematurely after its status had been
changed to RPM_SUSPENDED by __rpm_callback().

Fixes: 44cc89f764 ("PM: runtime: Update device status before letting suppliers suspend")
Link: https://lore.kernel.org/linux-pm/24dfb6fc-5d54-6ee2-9195-26428b7ecf8a@intel.com/
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-03-19 16:35:47 +01:00
Heikki Krogerus
2a92c90f2e software node: Fix device_add_software_node()
The function device_add_software_node() was meant to
register the node supplied to it, but only if that node
wasn't already registered. Right now the function attempts
to always register the node. That will cause a failure with
nodes that are already registered.

Fixing that by incrementing the reference count of the nodes
that have already been registered, and only registering the
new nodes. Also, clarifying the behaviour in the function
documentation.

Fixes: e68d0119e3 ("software node: Introduce device_add_software_node()")
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-10 15:25:02 +01:00
Heikki Krogerus
8891123f9c software node: Fix node registration
Software node can not be registered before its parent.

Fixes: 80488a6b1d ("software node: Add support for static node descriptors")
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-10 15:19:58 +01:00
Rafael J. Wysocki
44cc89f764 PM: runtime: Update device status before letting suppliers suspend
Because the PM-runtime status of the device is not updated in
__rpm_callback(), attempts to suspend the suppliers of the given
device triggered by rpm_put_suppliers() called by it may fail.

Fix this by making __rpm_callback() update the device's status to
RPM_SUSPENDED before calling rpm_put_suppliers() if the current
status of the device is RPM_SUSPENDING and the callback just invoked
by it has returned 0 (success).

While at it, modify the code in __rpm_callback() to always check
the device's PM-runtime status under its PM lock.

Link: https://lore.kernel.org/linux-pm/CAPDyKFqm06KDw_p8WXsM4dijDbho4bb6T4k50UqqvR1_COsp8g@mail.gmail.com/
Fixes: 21d5c57b37 ("PM / runtime: Use device links")
Reported-by: Elaine Zhang <zhangqing@rock-chips.com>
Diagnosed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Elaine Zhang <zhangiqng@rock-chips.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-03-01 17:40:38 +01:00
Linus Torvalds
8b83369ddc RISC-V Patches for the 5.12 Merge Window
I have a handful of new RISC-V related patches for this merge window:
 
 * A check to ensure drivers are properly using uaccess.  This isn't
   manifesting with any of the drivers I'm currently using, but may catch
   errors in new drivers.
 * Some preliminary support for the FU740, along with the HiFive
   Unleashed it will appear on.
 * NUMA support for RISC-V, which involves making the arm64 code generic.
 * Support for kasan on the vmalloc region.
 * A handful of new drivers for the Kendryte K210, along with the DT
   plumbing required to boot on a handful of K210-based boards.
 * Support for allocating ASIDs.
 * Preliminary support for kernels larger than 128MiB.
 * Various other improvements to our KASAN support, including the
   utilization of huge pages when allocating the KASAN regions.
 
 We may have already found a bug with the KASAN_VMALLOC code, but it's
 passing my tests.  There's a fix in the works, but that will probably
 miss the merge window.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmA4hXATHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYifryD/0SfXGOfj93Cxq7I7AYhhzCN7lJ5jvv
 iEQScTlPqU9nfvYodo4EDq0fp+5LIPpTL/XBHtqVjzv0FqRNa28Ea0K7kO8HuXc4
 BaUd0m/DqyB4Gfgm4qjc5bDneQ1ZYxVXprYERWNQ5Fj+tdWhaQGOW64N/TVodjjj
 NgJtTqbIAcjJqjUtttM8TZN5U1TgwLo+KCqw3iYW12lV1YKBBuvrwvSdD6jnFdIQ
 AzG/wRGZhxLoFxgBB/NEsZxDoSd6ztiwxLhS9lX4okZVsryyIdOE70Q/MflfiTlU
 xE+AdxQXTMUiiqYSmHeDD6PDb57GT/K3hnjI1yP+lIZpbInsi29JKow1qjyYjfHl
 9cSSKYCIXHL7jKU6pgt34G1O5N5+fgqHQhNbfKvlrQ2UPlfs/tWdKHpFIP/z9Jlr
 0vCAou7NSEB9zZGqzO63uBLXoN8yfL8FT3uRnnRvoRpfpex5dQX2QqPLQ7327D7N
 GUG31nd1PHTJPdxJ1cI4SO24PqPpWDWY9uaea+0jv7ivGClVadZPco/S3ZKloguT
 lazYUvyA4oRrSAyln785Rd8vg4CinqTxMtIyZbRMbNkgzVQARi9a8rjvu4n9qms2
 2wlXDFi8nR8B4ih5n79dSiiLM9ay9GJDxMcf9VxIxSAYZV2fJALnpK6gV2fzRBUe
 +k/uv8BIsFmlwQ==
 =CutX
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-5.12-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:
 "A handful of new RISC-V related patches for this merge window:

   - A check to ensure drivers are properly using uaccess. This isn't
     manifesting with any of the drivers I'm currently using, but may
     catch errors in new drivers.

   - Some preliminary support for the FU740, along with the HiFive
     Unleashed it will appear on.

   - NUMA support for RISC-V, which involves making the arm64 code
     generic.

   - Support for kasan on the vmalloc region.

   - A handful of new drivers for the Kendryte K210, along with the DT
     plumbing required to boot on a handful of K210-based boards.

   - Support for allocating ASIDs.

   - Preliminary support for kernels larger than 128MiB.

   - Various other improvements to our KASAN support, including the
     utilization of huge pages when allocating the KASAN regions.

  We may have already found a bug with the KASAN_VMALLOC code, but it's
  passing my tests. There's a fix in the works, but that will probably
  miss the merge window.

* tag 'riscv-for-linus-5.12-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (75 commits)
  riscv: Improve kasan population by using hugepages when possible
  riscv: Improve kasan population function
  riscv: Use KASAN_SHADOW_INIT define for kasan memory initialization
  riscv: Improve kasan definitions
  riscv: Get rid of MAX_EARLY_MAPPING_SIZE
  soc: canaan: Sort the Makefile alphabetically
  riscv: Disable KSAN_SANITIZE for vDSO
  riscv: Remove unnecessary declaration
  riscv: Add Canaan Kendryte K210 SD card defconfig
  riscv: Update Canaan Kendryte K210 defconfig
  riscv: Add Kendryte KD233 board device tree
  riscv: Add SiPeed MAIXDUINO board device tree
  riscv: Add SiPeed MAIX GO board device tree
  riscv: Add SiPeed MAIX DOCK board device tree
  riscv: Add SiPeed MAIX BiT board device tree
  riscv: Update Canaan Kendryte K210 device tree
  dt-bindings: add resets property to dw-apb-timer
  dt-bindings: fix sifive gpio properties
  dt-bindings: update sifive uart compatible string
  dt-bindings: update sifive clint compatible string
  ...
2021-02-26 10:28:35 -08:00
David Hildenbrand
e9a2e48e87 drivers/base/memory: don't store phys_device in memory blocks
No need to store the value for each and every memory block, as we can
easily query the value at runtime.  Reshuffle the members to optimize the
memory layout.  Also, let's clarify what the interface once was used for
and why it's legacy nowadays.

"phys_device" was used on s390x in older versions of lsmem[2]/chmem[3],
back when they were still part of s390x-tools.  They were later replaced
by the variants in linux-utils.  For example, RHEL6 and RHEL7 contain
lsmem/chmem from s390-utils.  RHEL8 switched to versions from util-linux
on s390x [4].

"phys_device" was added with sysfs support for memory hotplug in commit
3947be1969 ("[PATCH] memory hotplug: sysfs and add/remove functions") in
2005.  It always returned 0.

s390x started returning something != 0 on some setups (if sclp.rzm is set
by HW) in 2010 via commit 57b552ba0b ("memory hotplug/s390: set
phys_device").

For s390x, it allowed for identifying which memory block devices belong to
the same storage increment (RZM).  Only if all memory block devices
comprising a single storage increment were offline, the memory could
actually be removed in the hypervisor.

Since commit e5d709bb5f ("s390/memory hotplug: provide
memory_block_size_bytes() function") in 2013 a memory block device spans
at least one storage increment - which is why the interface isn't really
helpful/used anymore (except by old lsmem/chmem tools).

There were once RFC patches to make use of "phys_device" in ACPI context;
however, the underlying problem could be solved using different interfaces
[1].

[1] https://patchwork.kernel.org/patch/2163871/
[2] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/lsmem
[3] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/chmem
[4] https://bugzilla.redhat.com/show_bug.cgi?id=1504134

Link: https://lkml.kernel.org/r/20210201181347.13262-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Vaibhav Jain <vaibhav@linux.ibm.com>
Cc: Tom Rix <trix@redhat.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 09:41:00 -08:00
Anshuman Khandual
1adf8b468f mm/memory_hotplug: rename all existing 'memhp' into 'mhp'
This renames all 'memhp' instances to 'mhp' except for memhp_default_state
for being a kernel command line option.  This is just a clean up and
should not cause a functional change.  Let's make it consistent rater than
mixing the two prefixes.  In preparation for more users of the 'mhp'
terminology.

Link: https://lkml.kernel.org/r/1611554093-27316-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 09:41:00 -08:00
Linus Torvalds
4c48faba5b Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "A few small subsystems and some of MM.

  172 patches.

  Subsystems affected by this patch series: hexagon, scripts, ntfs,
  ocfs2, vfs, and mm (slab-generic, slab, slub, debug, pagecache, swap,
  memcg, pagemap, mprotect, mremap, page-reporting, vmalloc, kasan,
  pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction,
  mempolicy, oom-kill, hugetlbfs, and migration)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (172 commits)
  mm/migrate: remove unneeded semicolons
  hugetlbfs: remove unneeded return value of hugetlb_vmtruncate()
  hugetlbfs: fix some comment typos
  hugetlbfs: correct some obsolete comments about inode i_mutex
  hugetlbfs: make hugepage size conversion more readable
  hugetlbfs: remove meaningless variable avoid_reserve
  hugetlbfs: correct obsolete function name in hugetlbfs_read_iter()
  hugetlbfs: use helper macro default_hstate in init_hugetlbfs_fs
  hugetlbfs: remove useless BUG_ON(!inode) in hugetlbfs_setattr()
  hugetlbfs: remove special hugetlbfs_set_page_dirty()
  mm/hugetlb: change hugetlb_reserve_pages() to type bool
  mm, oom: fix a comment in dump_task()
  mm/mempolicy: use helper range_in_vma() in queue_pages_test_walk()
  numa balancing: migrate on fault among multiple bound nodes
  mm, compaction: make fast_isolate_freepages() stay within zone
  mm/compaction: fix misbehaviors of fast_find_migrateblock()
  mm/compaction: correct deferral logic for proactive compaction
  mm/compaction: remove duplicated VM_BUG_ON_PAGE !PageLocked
  mm/compaction: remove rcu_read_lock during page compaction
  z3fold: simplify the zhdr initialization code in init_z3fold_page()
  ...
2021-02-24 16:20:38 -08:00
Shakeel Butt
b603894248 mm: memcg: add swapcache stat for memcg v2
This patch adds swapcache stat for the cgroup v2.  The swapcache
represents the memory that is accounted against both the memory and the
swap limit of the cgroup.  The main motivation behind exposing the
swapcache stat is for enabling users to gracefully migrate from cgroup
v1's memsw counter to cgroup v2's memory and swap counters.

Cgroup v1's memsw limit allows users to limit the memory+swap usage of a
workload but without control on the exact proportion of memory and swap.
Cgroup v2 provides separate limits for memory and swap which enables more
control on the exact usage of memory and swap individually for the
workload.

With some little subtleties, the v1's memsw limit can be switched with the
sum of the v2's memory and swap limits.  However the alternative for memsw
usage is not yet available in cgroup v2.  Exposing per-cgroup swapcache
stat enables that alternative.  Adding the memory usage and swap usage and
subtracting the swapcache will approximate the memsw usage.  This will
help in the transparent migration of the workloads depending on memsw
usage and limit to v2' memory and swap counters.

The reasons these applications are still interested in this approximate
memsw usage are: (1) these applications are not really interested in two
separate memory and swap usage metrics.  A single usage metric is more
simple to use and reason about for them.

(2) The memsw usage metric hides the underlying system's swap setup from
the applications.  Applications with multiple instances running in a
datacenter with heterogeneous systems (some have swap and some don't) will
keep seeing a consistent view of their usage.

[akpm@linux-foundation.org: fix CONFIG_SWAP=n build]

Link: https://lkml.kernel.org/r/20210108155813.2914586-3-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24 13:38:29 -08:00
Muchun Song
380780e718 mm: memcontrol: convert NR_FILE_PMDMAPPED account to pages
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_FILE_PMDMAPPED account to pages.  This patch is
consistent with 8f182270df ("mm/swap.c: flush lru pvecs on compound page
arrival").  Doing this also can make the unit of vmstat counters more
unified.  Finally, the unit of the vmstat counters are pages, kB and
bytes.  The B/KB suffix can tell us that the unit is bytes or kB.  The
rest which is without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-7-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Cc: Rafael. J. Wysocki <rafael@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24 13:38:29 -08:00
Muchun Song
a1528e21f8 mm: memcontrol: convert NR_SHMEM_PMDMAPPED account to pages
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_SHMEM_PMDMAPPED account to pages.  This patch is
consistent with 8f182270df ("mm/swap.c: flush lru pvecs on compound page
arrival").  Doing this also can make the unit of vmstat counters more
unified.  Finally, the unit of the vmstat counters are pages, kB and
bytes.  The B/KB suffix can tell us that the unit is bytes or kB.  The
rest which is without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-6-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Cc: Rafael. J. Wysocki <rafael@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24 13:38:29 -08:00
Muchun Song
57b2847d3c mm: memcontrol: convert NR_SHMEM_THPS account to pages
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_SHMEM_THPS account to pages.  This patch is
consistent with 8f182270df ("mm/swap.c: flush lru pvecs on compound page
arrival").  Doing this also can make the unit of vmstat counters more
unified.  Finally, the unit of the vmstat counters are pages, kB and
bytes.  The B/KB suffix can tell us that the unit is bytes or kB.  The
rest which is without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-5-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Cc: Rafael. J. Wysocki <rafael@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24 13:38:29 -08:00
Muchun Song
bf9ecead53 mm: memcontrol: convert NR_FILE_THPS account to pages
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with if hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_FILE_THPS account to pages.  This patch is consistent
with 8f182270df ("mm/swap.c: flush lru pvecs on compound page arrival").
Doing this also can make the unit of vmstat counters more unified.
Finally, the unit of the vmstat counters are pages, kB and bytes.  The
B/KB suffix can tell us that the unit is bytes or kB.  The rest which is
without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-4-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Cc: Rafael. J. Wysocki <rafael@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24 13:38:29 -08:00
Muchun Song
69473e5de8 mm: memcontrol: convert NR_ANON_THPS account to pages
Currently we use struct per_cpu_nodestat to cache the vmstat counters,
which leads to inaccurate statistics especially THP vmstat counters.  In
the systems with hundreds of processors it can be GBs of memory.  For
example, for a 96 CPUs system, the threshold is the maximum number of 125.
And the per cpu counters can cache 23.4375 GB in total.

The THP page is already a form of batched addition (it will add 512 worth
of memory in one go) so skipping the batching seems like sensible.
Although every THP stats update overflows the per-cpu counter, resorting
to atomic global updates.  But it can make the statistics more accuracy
for the THP vmstat counters.

So we convert the NR_ANON_THPS account to pages.  This patch is consistent
with 8f182270df ("mm/swap.c: flush lru pvecs on compound page arrival").
Doing this also can make the unit of vmstat counters more unified.
Finally, the unit of the vmstat counters are pages, kB and bytes.  The
B/KB suffix can tell us that the unit is bytes or kB.  The rest which is
without suffix are pages.

Link: https://lkml.kernel.org/r/20201228164110.2838-3-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rafael. J. Wysocki <rafael@kernel.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pankaj Gupta <pankaj.gupta@cloud.ionos.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-24 13:38:29 -08:00
Linus Torvalds
e229b429bb Char/Misc driver patches for 5.12-rc1
Here is the large set of char/misc/whatever driver subsystem updates for
 5.12-rc1.  Over time it seems like this tree is collecting more and more
 tiny driver subsystems in one place, making it easier for those
 maintainers, which is why this is getting larger.
 
 Included in here are:
 	- coresight driver updates
 	- habannalabs driver updates
 	- virtual acrn driver addition (proper acks from the x86
 	  maintainers)
 	- broadcom misc driver addition
 	- speakup driver updates
 	- soundwire driver updates
 	- fpga driver updates
 	- amba driver updates
 	- mei driver updates
 	- vfio driver updates
 	- greybus driver updates
 	- nvmeem driver updates
 	- phy driver updates
 	- mhi driver updates
 	- interconnect driver udpates
 	- fsl-mc bus driver updates
 	- random driver fix
 	- some small misc driver updates (rtsx, pvpanic, etc.)
 
 All of these have been in linux-next for a while, with the only reported
 issue being a merge conflict in include/linux/mod_devicetable.h that you
 will hit in your tree due to the dfl_device_id addition from the fpga
 subsystem in here.  The resolution should be simple.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYDZf9w8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yk3xgCcCEN+pCJTum+uAzSNH3YKs/onaDgAnRSVwOUw
 tNW6n1JhXLYl9f5JdhvS
 =MOHs
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the large set of char/misc/whatever driver subsystem updates
  for 5.12-rc1. Over time it seems like this tree is collecting more and
  more tiny driver subsystems in one place, making it easier for those
  maintainers, which is why this is getting larger.

  Included in here are:

   - coresight driver updates

   - habannalabs driver updates

   - virtual acrn driver addition (proper acks from the x86 maintainers)

   - broadcom misc driver addition

   - speakup driver updates

   - soundwire driver updates

   - fpga driver updates

   - amba driver updates

   - mei driver updates

   - vfio driver updates

   - greybus driver updates

   - nvmeem driver updates

   - phy driver updates

   - mhi driver updates

   - interconnect driver udpates

   - fsl-mc bus driver updates

   - random driver fix

   - some small misc driver updates (rtsx, pvpanic, etc.)

  All of these have been in linux-next for a while, with the only
  reported issue being a merge conflict due to the dfl_device_id
  addition from the fpga subsystem in here"

* tag 'char-misc-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (311 commits)
  spmi: spmi-pmic-arb: Fix hw_irq overflow
  Documentation: coresight: Add PID tracing description
  coresight: etm-perf: Support PID tracing for kernel at EL2
  coresight: etm-perf: Clarify comment on perf options
  ACRN: update MAINTAINERS: mailing list is subscribers-only
  regmap: sdw-mbq: use MODULE_LICENSE("GPL")
  regmap: sdw: use no_pm routines for SoundWire 1.2 MBQ
  regmap: sdw: use _no_pm functions in regmap_read/write
  soundwire: intel: fix possible crash when no device is detected
  MAINTAINERS: replace my with email with replacements
  mhi: Fix double dma free
  uapi: map_to_7segment: Update example in documentation
  uio: uio_pci_generic: don't fail probe if pdev->irq equals to IRQ_NOTCONNECTED
  drivers/misc/vmw_vmci: restrict too big queue size in qp_host_alloc_queue
  firewire: replace tricky statement by two simple ones
  vme: make remove callback return void
  firmware: google: make coreboot driver's remove callback return void
  firmware: xilinx: Use explicit values for all enum values
  sample/acrn: Introduce a sample of HSM ioctl interface usage
  virt: acrn: Introduce an interface for Service VM to control vCPU
  ...
2021-02-24 10:25:37 -08:00
Linus Torvalds
7ac1161c27 Driver core / debugfs update for 5.12-rc1
Here is the "big" driver core and debugfs update for 5.12-rc1
 
 This set of driver core patches caused a bunch of problems in linux-next
 for the past few weeks, when Saravana tried to set fw_devlink=on as the
 default functionality.  This caused a number of systems to stop booting,
 and lots of bugs were fixed in this area for almost all of the reported
 systems, but this option is not ready to be turned on just yet for the
 default operation based on this testing, so I've reverted that change at
 the very end so we don't have to worry about regressions in 5.12.  We
 will try to turn this on for 5.13 if testing goes better over the next
 few months.
 
 Other than the fixes caused by the fw_devlink testing in here, there's
 not much more:
 	- debugfs fixes for invalid input into debugfs_lookup()
 	- kerneldoc cleanups
 	- warn message if platform drivers return an error on their
 	  remove callback (a futile effort, but good to catch).
 
 All of these have been in linux-next for a while now, and the
 regressions have gone away with the revert of the fw_devlink change.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYDZhzA8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylS2wCfU28FxDWNwcWhPFVfRT8Mb3OxZ50An1sR4lNR
 t5Ie4aztMUjVJhI9bq6g
 =3NSB
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core / debugfs update from Greg KH:
 "Here is the "big" driver core and debugfs update for 5.12-rc1

  This set of driver core patches caused a bunch of problems in
  linux-next for the past few weeks, when Saravana tried to set
  fw_devlink=on as the default functionality. This caused a number of
  systems to stop booting, and lots of bugs were fixed in this area for
  almost all of the reported systems, but this option is not ready to be
  turned on just yet for the default operation based on this testing, so
  I've reverted that change at the very end so we don't have to worry
  about regressions in 5.12

  We will try to turn this on for 5.13 if testing goes better over the
  next few months.

  Other than the fixes caused by the fw_devlink testing in here, there's
  not much more:

   - debugfs fixes for invalid input into debugfs_lookup()

   - kerneldoc cleanups

   - warn message if platform drivers return an error on their remove
     callback (a futile effort, but good to catch).

  All of these have been in linux-next for a while now, and the
  regressions have gone away with the revert of the fw_devlink change"

* tag 'driver-core-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (35 commits)
  Revert "driver core: Set fw_devlink=on by default"
  of: property: fw_devlink: Ignore interrupts property for some configs
  debugfs: do not attempt to create a new file before the filesystem is initalized
  debugfs: be more robust at handling improper input in debugfs_lookup()
  driver core: auxiliary bus: Fix calling stage for auxiliary bus init
  of: irq: Fix the return value for of_irq_parse_one() stub
  of: irq: make a stub for of_irq_parse_one()
  clk: Mark fwnodes when their clock provider is added/removed
  PM: domains: Mark fwnodes when their powerdomain is added/removed
  irqdomain: Mark fwnodes when their irqdomain is added/removed
  driver core: fw_devlink: Handle suppliers that don't use driver core
  of: property: Add fw_devlink support for optional properties
  driver core: Add fw_devlink.strict kernel param
  of: property: Don't add links to absent suppliers
  driver core: fw_devlink: Detect supplier devices that will never be added
  driver core: platform: Emit a warning if a remove callback returned non-zero
  of: property: Fix fw_devlink handling of interrupts/interrupts-extended
  gpiolib: Don't probe gpio_device if it's not the primary device
  device.h: Remove bogus "the" in kerneldoc
  gpiolib: Bind gpio_device to a driver to enable fw_devlink=on by default
  ...
2021-02-24 10:13:55 -08:00
Linus Torvalds
7d6beb71da idmapped-mounts-v5.12
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYCegywAKCRCRxhvAZXjc
 ouJ6AQDlf+7jCQlQdeKKoN9QDFfMzG1ooemat36EpRRTONaGuAD8D9A4sUsG4+5f
 4IU5Lj9oY4DEmF8HenbWK2ZHsesL2Qg=
 =yPaw
 -----END PGP SIGNATURE-----

Merge tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull idmapped mounts from Christian Brauner:
 "This introduces idmapped mounts which has been in the making for some
  time. Simply put, different mounts can expose the same file or
  directory with different ownership. This initial implementation comes
  with ports for fat, ext4 and with Christoph's port for xfs with more
  filesystems being actively worked on by independent people and
  maintainers.

  Idmapping mounts handle a wide range of long standing use-cases. Here
  are just a few:

   - Idmapped mounts make it possible to easily share files between
     multiple users or multiple machines especially in complex
     scenarios. For example, idmapped mounts will be used in the
     implementation of portable home directories in
     systemd-homed.service(8) where they allow users to move their home
     directory to an external storage device and use it on multiple
     computers where they are assigned different uids and gids. This
     effectively makes it possible to assign random uids and gids at
     login time.

   - It is possible to share files from the host with unprivileged
     containers without having to change ownership permanently through
     chown(2).

   - It is possible to idmap a container's rootfs and without having to
     mangle every file. For example, Chromebooks use it to share the
     user's Download folder with their unprivileged containers in their
     Linux subsystem.

   - It is possible to share files between containers with
     non-overlapping idmappings.

   - Filesystem that lack a proper concept of ownership such as fat can
     use idmapped mounts to implement discretionary access (DAC)
     permission checking.

   - They allow users to efficiently changing ownership on a per-mount
     basis without having to (recursively) chown(2) all files. In
     contrast to chown (2) changing ownership of large sets of files is
     instantenous with idmapped mounts. This is especially useful when
     ownership of a whole root filesystem of a virtual machine or
     container is changed. With idmapped mounts a single syscall
     mount_setattr syscall will be sufficient to change the ownership of
     all files.

   - Idmapped mounts always take the current ownership into account as
     idmappings specify what a given uid or gid is supposed to be mapped
     to. This contrasts with the chown(2) syscall which cannot by itself
     take the current ownership of the files it changes into account. It
     simply changes the ownership to the specified uid and gid. This is
     especially problematic when recursively chown(2)ing a large set of
     files which is commong with the aforementioned portable home
     directory and container and vm scenario.

   - Idmapped mounts allow to change ownership locally, restricting it
     to specific mounts, and temporarily as the ownership changes only
     apply as long as the mount exists.

  Several userspace projects have either already put up patches and
  pull-requests for this feature or will do so should you decide to pull
  this:

   - systemd: In a wide variety of scenarios but especially right away
     in their implementation of portable home directories.

         https://systemd.io/HOME_DIRECTORY/

   - container runtimes: containerd, runC, LXD:To share data between
     host and unprivileged containers, unprivileged and privileged
     containers, etc. The pull request for idmapped mounts support in
     containerd, the default Kubernetes runtime is already up for quite
     a while now: https://github.com/containerd/containerd/pull/4734

   - The virtio-fs developers and several users have expressed interest
     in using this feature with virtual machines once virtio-fs is
     ported.

   - ChromeOS: Sharing host-directories with unprivileged containers.

  I've tightly synced with all those projects and all of those listed
  here have also expressed their need/desire for this feature on the
  mailing list. For more info on how people use this there's a bunch of
  talks about this too. Here's just two recent ones:

      https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf
      https://fosdem.org/2021/schedule/event/containers_idmap/

  This comes with an extensive xfstests suite covering both ext4 and
  xfs:

      https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts

  It covers truncation, creation, opening, xattrs, vfscaps, setid
  execution, setgid inheritance and more both with idmapped and
  non-idmapped mounts. It already helped to discover an unrelated xfs
  setgid inheritance bug which has since been fixed in mainline. It will
  be sent for inclusion with the xfstests project should you decide to
  merge this.

  In order to support per-mount idmappings vfsmounts are marked with
  user namespaces. The idmapping of the user namespace will be used to
  map the ids of vfs objects when they are accessed through that mount.
  By default all vfsmounts are marked with the initial user namespace.
  The initial user namespace is used to indicate that a mount is not
  idmapped. All operations behave as before and this is verified in the
  testsuite.

  Based on prior discussions we want to attach the whole user namespace
  and not just a dedicated idmapping struct. This allows us to reuse all
  the helpers that already exist for dealing with idmappings instead of
  introducing a whole new range of helpers. In addition, if we decide in
  the future that we are confident enough to enable unprivileged users
  to setup idmapped mounts the permission checking can take into account
  whether the caller is privileged in the user namespace the mount is
  currently marked with.

  The user namespace the mount will be marked with can be specified by
  passing a file descriptor refering to the user namespace as an
  argument to the new mount_setattr() syscall together with the new
  MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern
  of extensibility.

  The following conditions must be met in order to create an idmapped
  mount:

   - The caller must currently have the CAP_SYS_ADMIN capability in the
     user namespace the underlying filesystem has been mounted in.

   - The underlying filesystem must support idmapped mounts.

   - The mount must not already be idmapped. This also implies that the
     idmapping of a mount cannot be altered once it has been idmapped.

   - The mount must be a detached/anonymous mount, i.e. it must have
     been created by calling open_tree() with the OPEN_TREE_CLONE flag
     and it must not already have been visible in the filesystem.

  The last two points guarantee easier semantics for userspace and the
  kernel and make the implementation significantly simpler.

  By default vfsmounts are marked with the initial user namespace and no
  behavioral or performance changes are observed.

  The manpage with a detailed description can be found here:

      1d7b902e28

  In order to support idmapped mounts, filesystems need to be changed
  and mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The
  patches to convert individual filesystem are not very large or
  complicated overall as can be seen from the included fat, ext4, and
  xfs ports. Patches for other filesystems are actively worked on and
  will be sent out separately. The xfstestsuite can be used to verify
  that port has been done correctly.

  The mount_setattr() syscall is motivated independent of the idmapped
  mounts patches and it's been around since July 2019. One of the most
  valuable features of the new mount api is the ability to perform
  mounts based on file descriptors only.

  Together with the lookup restrictions available in the openat2()
  RESOLVE_* flag namespace which we added in v5.6 this is the first time
  we are close to hardened and race-free (e.g. symlinks) mounting and
  path resolution.

  While userspace has started porting to the new mount api to mount
  proper filesystems and create new bind-mounts it is currently not
  possible to change mount options of an already existing bind mount in
  the new mount api since the mount_setattr() syscall is missing.

  With the addition of the mount_setattr() syscall we remove this last
  restriction and userspace can now fully port to the new mount api,
  covering every use-case the old mount api could. We also add the
  crucial ability to recursively change mount options for a whole mount
  tree, both removing and adding mount options at the same time. This
  syscall has been requested multiple times by various people and
  projects.

  There is a simple tool available at

      https://github.com/brauner/mount-idmapped

  that allows to create idmapped mounts so people can play with this
  patch series. I'll add support for the regular mount binary should you
  decide to pull this in the following weeks:

  Here's an example to a simple idmapped mount of another user's home
  directory:

	u1001@f2-vm:/$ sudo ./mount --idmap both:1000:1001:1 /home/ubuntu/ /mnt

	u1001@f2-vm:/$ ls -al /home/ubuntu/
	total 28
	drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 .
	drwxr-xr-x 4 root   root   4096 Oct 28 04:00 ..
	-rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history
	-rw-r--r-- 1 ubuntu ubuntu  220 Feb 25  2020 .bash_logout
	-rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25  2020 .bashrc
	-rw-r--r-- 1 ubuntu ubuntu  807 Feb 25  2020 .profile
	-rw-r--r-- 1 ubuntu ubuntu    0 Oct 16 16:11 .sudo_as_admin_successful
	-rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo

	u1001@f2-vm:/$ ls -al /mnt/
	total 28
	drwxr-xr-x  2 u1001 u1001 4096 Oct 28 22:07 .
	drwxr-xr-x 29 root  root  4096 Oct 28 22:01 ..
	-rw-------  1 u1001 u1001 3154 Oct 28 22:12 .bash_history
	-rw-r--r--  1 u1001 u1001  220 Feb 25  2020 .bash_logout
	-rw-r--r--  1 u1001 u1001 3771 Feb 25  2020 .bashrc
	-rw-r--r--  1 u1001 u1001  807 Feb 25  2020 .profile
	-rw-r--r--  1 u1001 u1001    0 Oct 16 16:11 .sudo_as_admin_successful
	-rw-------  1 u1001 u1001 1144 Oct 28 00:43 .viminfo

	u1001@f2-vm:/$ touch /mnt/my-file

	u1001@f2-vm:/$ setfacl -m u:1001:rwx /mnt/my-file

	u1001@f2-vm:/$ sudo setcap -n 1001 cap_net_raw+ep /mnt/my-file

	u1001@f2-vm:/$ ls -al /mnt/my-file
	-rw-rwxr--+ 1 u1001 u1001 0 Oct 28 22:14 /mnt/my-file

	u1001@f2-vm:/$ ls -al /home/ubuntu/my-file
	-rw-rwxr--+ 1 ubuntu ubuntu 0 Oct 28 22:14 /home/ubuntu/my-file

	u1001@f2-vm:/$ getfacl /mnt/my-file
	getfacl: Removing leading '/' from absolute path names
	# file: mnt/my-file
	# owner: u1001
	# group: u1001
	user::rw-
	user:u1001:rwx
	group::rw-
	mask::rwx
	other::r--

	u1001@f2-vm:/$ getfacl /home/ubuntu/my-file
	getfacl: Removing leading '/' from absolute path names
	# file: home/ubuntu/my-file
	# owner: ubuntu
	# group: ubuntu
	user::rw-
	user:ubuntu:rwx
	group::rw-
	mask::rwx
	other::r--"

* tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: (41 commits)
  xfs: remove the possibly unused mp variable in xfs_file_compat_ioctl
  xfs: support idmapped mounts
  ext4: support idmapped mounts
  fat: handle idmapped mounts
  tests: add mount_setattr() selftests
  fs: introduce MOUNT_ATTR_IDMAP
  fs: add mount_setattr()
  fs: add attr_flags_to_mnt_flags helper
  fs: split out functions to hold writers
  namespace: only take read lock in do_reconfigure_mnt()
  mount: make {lock,unlock}_mount_hash() static
  namespace: take lock_mount_hash() directly when changing flags
  nfs: do not export idmapped mounts
  overlayfs: do not mount on top of idmapped mounts
  ecryptfs: do not mount on top of idmapped mounts
  ima: handle idmapped mounts
  apparmor: handle idmapped mounts
  fs: make helpers idmap mount aware
  exec: handle idmapped mounts
  would_dump: handle idmapped mounts
  ...
2021-02-23 13:39:45 -08:00
Linus Torvalds
a99163e9e7 Devicetree updates for v5.12:
- Sync dtc to upstream version v1.6.0-51-g183df9e9c2b9 and build
   host fdtoverlay
 
 - Add kbuild support to build DT overlays (%.dtbo)
 
 - Drop NULLifying match table in of_match_device(). In preparation for
   this, there are several driver cleanups to use
   (of_)?device_get_match_data().
 
 - Drop pointless wrappers from DT struct device API
 
 - Convert USB binding schemas to use graph schema and remove old plain
   text graph binding doc
 
 - Convert spi-nor and v3d GPU bindings to DT schema
 
 - Tree wide schema fixes for if/then schemas, array size constraints,
   and undocumented compatible strings in examples
 
 - Handle 'no-map' correctly for already reserved memblock regions
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmAz1GEQHHJvYmhAa2Vy
 bmVsLm9yZwAKCRD6+121jbxhw55/D/955O2f5Gjp7bXvdoSucZtks/lqlC/eIAAw
 An5pjBL+o1urXsvafEMYemwmnwq/U4vy0aJRoAK1+MiI4masb56ET0KN5LsOudki
 b3M/O16RqGF31+blWyoxseZnZh6KsKzIRoE5XAtbr/QAnpdI0/5BgGoWSWYtDk2v
 LddL650/BieyvzdnFTLWCMAxd6DW0P9SI+8N3E+XlxAWCYQrLCqVELHbkrxAGCuN
 CggHIIiNf2K7z4UopVsGjnUwML9YRHXc9wOpF1c4CBrLu9TfDvdQ4OnNcnxcl/Sp
 E2FTHG0jSVm3VJRBbk4e68uvt3HrJJWsYnMtu2QTweGC/GbeUr9LJ0MIbSwp+rsL
 FEqCMFWOniq27eJBk6jHckaoBl93AHQlIGdJR/pFAi9Ijt32tUdVG5kqD/Tl+xKm
 njPcjVjWilr2ssfZ4tUggzPp2fjrau88ZS8qLja31vElzvULeA67KjEtG0RZAtwg
 ywfATiCaT096pR9v2VYuL/5NNnZFxHx3hWsOH1rcsyPk0BLguU3dkrAn28XBVQFd
 cOPfR3T/wsT0wHDht2aXPSM0hBiejFmvLhebGuJN9lqG+Pc1f87xiCT3pM7wymtJ
 iqTMrQ7dUgjQgU91PjatdB17tlnGHe0hh8AiuhQoPgOprpRKszG+rBFJLG3yRnl7
 QmLZnQTIhw==
 =9V4A
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:

 - Sync dtc to upstream version v1.6.0-51-g183df9e9c2b9 and build host
   fdtoverlay

 - Add kbuild support to build DT overlays (%.dtbo)

 - Drop NULLifying match table in of_match_device().

   In preparation for this, there are several driver cleanups to use
   (of_)?device_get_match_data().

 - Drop pointless wrappers from DT struct device API

 - Convert USB binding schemas to use graph schema and remove old plain
   text graph binding doc

 - Convert spi-nor and v3d GPU bindings to DT schema

 - Tree wide schema fixes for if/then schemas, array size constraints,
   and undocumented compatible strings in examples

 - Handle 'no-map' correctly for already reserved memblock regions

* tag 'devicetree-for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (35 commits)
  driver core: platform: Drop of_device_node_put() wrapper
  of: Remove of_dev_{get,put}()
  dt-bindings: usb: Change descibe to describe in usbmisc-imx.txt
  dt-bindings: can: rcar_canfd: Group tuples in pin control properties
  dt-bindings: power: renesas,apmu: Group tuples in cpus properties
  dt-bindings: mtd: spi-nor: Convert to DT schema format
  dt-bindings: Use portable sort for version cmp
  dt-bindings: ethernet-controller: fix fixed-link specification
  dt-bindings: irqchip: Add node name to PRUSS INTC
  dt-bindings: interconnect: Fix the expected number of cells
  dt-bindings: Fix errors in 'if' schemas
  dt-bindings: iommu: renesas,ipmmu-vmsa: Make 'power-domains' conditionally required
  dt-bindings: Fix undocumented compatible strings in examples
  kbuild: Add support to build overlays (%.dtbo)
  scripts: dtc: Remove the unused fdtdump.c file
  scripts: dtc: Build fdtoverlay tool
  scripts/dtc: Update to upstream version v1.6.0-51-g183df9e9c2b9
  scripts: dtc: Fetch fdtoverlay.c from external DTC project
  dt-bindings: thermal: sun8i: Fix misplaced schema keyword in compatible strings
  dt-bindings: iio: dac: Fix AD5686 references
  ...
2021-02-22 10:05:12 -08:00
Linus Torvalds
05a6fb94a6 regmap: Update for v5.12
Just one simple code style improvement this time, no features.  There is
 an addition to add a new SoundWire regmap type but that should be coming
 via the SoundWire tree as the support for the underlying bus features
 was only added this cycle.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmAq2RQACgkQJNaLcl1U
 h9Dgmgf+KXNfzSmDSP3ysyfA3CS07NKAOh2WpmKLuKNq0/b4b0Z28YNy1Fm47C+a
 MwgcyTitzu4cSsRa2jxdqdZ4q+W8rKlQHpgw0tLcu9sq34Ky856rOCHWOXqYb0M2
 yFiJvS+8koTxsMe4bXr2SGO1J2MomXei7Ki6ewHEHMpZvMBwN6Ew3520QEqCbLy4
 slp21fPnM+uQP9b5o/cNbvPvJQsLf4aU9EOFTxJJfplLXxNege4P4fa+wX2GQeGm
 Ent7njQHT1l27PY9KHpWSCg8INit/iEEzYJ9XzuXv6682WxvYtDMC/CYmmH7rnKu
 ccklGJdhgd1y71DNC953noemybis0g==
 =INU0
 -----END PGP SIGNATURE-----

Merge tag 'regmap-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap update from Mark Brown:
 "Just one simple code style improvement this time, no features.

  There is an addition to add a new SoundWire regmap type but that
  should be coming via the SoundWire tree as the support for the
  underlying bus features was only added this cycle"

* tag 'regmap-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: Assign boolean values to a bool variable
2021-02-22 09:12:22 -08:00
Linus Torvalds
10e2ec8ede sound updates for 5.12
A relatively calm release at this time, and no massive code changes
 are found in the stats, while a wide range of code refactoring and
 cleanup have been done.
 
 Note that this update includes the tree-wide trivial changes for
 dropping the return value from ISA remove callbacks, too.
 
 Below lists up some highlight:
 
 * ALSA Core:
 - Support for the software jack injection via debugfs
 - Fixes for sync_stop PCM operations
 
 * HD-audio and USB-audio:
 - A few usual HD-audio device quirks
 - Updates for Tegra HD-audio
 - More quirks for Pioneer and other USB-audio devices
 - Stricter state checks at USB-audio disconnection
 
 * ASoC:
 - Continued code refactoring, cleanup and fixes in ASoC core API
 - A KUnit testsuite for the topology code
 - Lots of ASoC Intel driver Realtek codec updates, quirk additions and
   fixes
 - Support for Ingenic JZ4760(B), Intel AlderLake-P, DT configured
   nVidia cards, Qualcomm lpass-rx-macro and lpass-tx-macro
 - Removal of obsolete SIRF prima/atlas, Txx9 and ZTE zx drivers
 
 * Others:
 - Drop return value from ISA driver remove callback
 - Cleanup with DIV_ROUND_UP() macro
 - FireWire updates, HDSP output loopback support
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmAvhXoOHHRpd2FpQHN1
 c2UuZGUACgkQLtJE4w1nLE9XGA/+MmRBSMipHpZBj6AB2mxbsam2dbPHuycKz1Dd
 7W4Rx9QdQcCF2BQ909HKSaE76mTrxkaYc3Ubn8uyfeKz7tB9YpqY5HfIiWRz8iyU
 FJK/INbkeunLhS61wjKbb8x+pP5M2ZXBTGSRkgVROCgMq4osM+J17c/5wSPE5BoG
 BGTXUk8LcDE+Iq/6bt2OrXgEBhHCXw7eB/wRWw5v0sIc2cnrexXYUZmHaRj1L3Dd
 ukpteFEmemOdbowitV+GPSlsnrCD6zselYWms/MLvwLMvTqT4W2SRfsGF5VvGKJC
 AJsHTWQ5JRKfLt2LJkDs3ymHrKdhnDCWjCUAFNEXd7IRG0Qsk/S+wXsyl3oEhgeQ
 ND9RoE5pSGG/2Y3Zvt3OevAuVenzQW04/2hFIoAyQg5s/DSom8lNtAsmXkc5dWNI
 GZJHnvPrdKgzZ0lI9TAbG0v48lnyiQB2sD0FAatWpv3NHcRt0u3fowZgc6Bb3JHK
 7cv3upNa1CY7mDSYiT0k3sIHJMrCdoWTPSiewEOxrmLFM1r5O5gHX3dpXhSfh5WJ
 MI1a93N7sK6WHm6KpeNcHnjrIbP14vOjatOHN+0stuFhLcOGygDX/L0Lu07+15aJ
 Fxicp23RRwNsb57JcTZTw/+nZhrndSeG3eHYZG2QvQJCv6Ph1tEJ+WAM+tlj85GT
 feGP0jg=
 =QgvS
 -----END PGP SIGNATURE-----

Merge tag 'sound-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "A relatively calm release at this time, and no massive code changes
  are found in the stats, while a wide range of code refactoring and
  cleanup have been done.

  Note that this update includes the tree-wide trivial changes for
  dropping the return value from ISA remove callbacks, too.

  Below lists up some highlight:

  ALSA Core:
   - Support for the software jack injection via debugfs
   - Fixes for sync_stop PCM operations

  HD-audio and USB-audio:
   - A few usual HD-audio device quirks
   - Updates for Tegra HD-audio
   - More quirks for Pioneer and other USB-audio devices
   - Stricter state checks at USB-audio disconnection

  ASoC:
   - Continued code refactoring, cleanup and fixes in ASoC core API
   - A KUnit testsuite for the topology code
   - Lots of ASoC Intel driver Realtek codec updates, quirk additions
     and fixes
   - Support for Ingenic JZ4760(B), Intel AlderLake-P, DT configured
     nVidia cards, Qualcomm lpass-rx-macro and lpass-tx-macro
   - Removal of obsolete SIRF prima/atlas, Txx9 and ZTE zx drivers

  Others:
   - Drop return value from ISA driver remove callback
   - Cleanup with DIV_ROUND_UP() macro
   - FireWire updates, HDSP output loopback support"

* tag 'sound-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (322 commits)
  ALSA: hda: intel-dsp-config: add Alder Lake support
  ASoC: soc-pcm: fix hw param limits calculation for multi-DAI
  ASoC: Intel: bytcr_rt5640: Add quirk for the Acer One S1002 tablet
  ASoC: Intel: bytcr_rt5651: Add quirk for the Jumper EZpad 7 tablet
  ASoC: Intel: bytcr_rt5640: Add quirk for the Voyo Winpad A15 tablet
  ASoC: Intel: bytcr_rt5640: Add quirk for the Estar Beauty HD MID 7316R tablet
  ASoC: soc-pcm: fix hwparams min/max init for dpcm
  ALSA: hda/realtek: Quirk for HP Spectre x360 14 amp setup
  ALSA: usb-audio: Add implicit fb quirk for BOSS GP-10
  ALSA: hda: Add another CometLake-H PCI ID
  ASoC: soc-pcm: add soc_pcm_hw_update_format()
  ASoC: soc-pcm: add soc_pcm_hw_update_chan()
  ASoC: soc-pcm: add soc_pcm_hw_update_rate()
  ASoC: wm_adsp: Remove unused control callback structure
  ASoC: SOF: relax ABI checks and avoid unnecessary warnings
  ASoC: codecs: lpass-tx-macro: add dapm widgets and route
  ASoC: codecs: lpass-tx-macro: add support for lpass tx macro
  ASoC: qcom: dt-bindings: add bindings for lpass tx macro codec
  ASoC: codecs: lpass-rx-macro: add iir widgets
  ASoC: codecs: lpass-rx-macro: add dapm widgets and route
  ...
2021-02-21 14:21:35 -08:00
Linus Torvalds
de16175788 media updates for v5.12-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmAtJ+IACgkQCF8+vY7k
 4RX8eBAAhTzVFopTBMAW2+FjBTtUGwn+LnrIG9O1HrFp4yjTfe/MnZWXkVRZXjqo
 cYZehdab0j8636aLTs86Y6mEMHGPdm6V0hQhgvXoy7FqQqLq52K1bpXL+4a0lNYx
 HE8OLbOvSM49RlP9ZU978NuUzfWLCW+dGlXuGxdJDU/fmbKdaSjvelRjjfNFhBo3
 ENK2LXVnebvtttjq4uSQ5LjeJEBBsIldK947/lvu7zJnnfDlXXdtrsuonkWvRp+s
 8M1+AQ0F/edKX1atXSCZZqLNhUNaswHWc6lMmIL8qGvMZjZffWi4KwfcB0XXvrAW
 IJYfaLQ9kvEaFaSLZ3E5dCPO5CQLUkR4YOmSSUdK16fpyb1WzVjWsKPUjxsk5IeB
 IitjX5KkP5T+uA8pmzQE9dX2Do7no9A/765f2uqpaQxYbze1IT+6qWMisLrlguZe
 NV10Fah2dSehmqqfnnIjDE40rP3iff6xKheTeLzF1e4j8JiNDPCRI8z1i8M2OJ1e
 jIEC4Pq4/mGmn+InJOzxPloel1CnCL+d0bU/wrAhEyg0Ss+M95/+KgK6LCEzgyei
 /+2II2tABxtanO8mxp4jts3jPduqVuV9EEpWquzf9bPqFy5mBFD3NbOJNn/5ZVlx
 /DhvjRxiEedQihQ9Pt0OQxiJm6InopaeTihAvMQrMH3nLBsF2/Y=
 =/wKL
 -----END PGP SIGNATURE-----

Merge tag 'media/v5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media updates from Mauro Carvalho Chehab:

 - some core fixes in VB2 mem2mem support

 - some improvements and cleanups in V4L2 async kAPI

 - newer controls in V4L2 API for H-264 and HEVC codecs

 - allegro-dvt driver was promoted from staging

 - new i2c sendor drivers: imx334, ov5648, ov8865

 - new automobile camera module: rdacm21

 - ipu3 cio2 driver started gained support for some ACPI BIOSes

 - new ATSC frontend: MaxLinear mxl692 VSB tuner/demod

 - the SMIA/CCS driver gained more support for CSS standard

 - several driver fixes, updates and improvements

* tag 'media/v5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (362 commits)
  media: v4l: async: Fix kerneldoc documentation for async functions
  media: i2c: max9271: Add MODULE_* macros
  media: i2c: Kconfig: Make MAX9271 a module
  media: imx334: 'ret' is uninitialized, should have been PTR_ERR()
  media: i2c: Add imx334 camera sensor driver
  media: dt-bindings: media: Add bindings for imx334
  media: ov8856: Configure sensor for GRBG Bayer for all modes
  media: i2c: imx219: Implement V4L2_CID_LINK_FREQ control
  media: ov5675: fix vflip/hflip control
  media: ipu3-cio2: Build bridge only if ACPI is enabled
  media: Remove the legacy v4l2-clk API
  media: ov6650: Use the generic clock framework
  media: mt9m111: Use the generic clock framework
  media: ov9640: Use the generic clock framework
  media: pxa_camera: Drop the v4l2-clk clock register
  media: mach-pxa: Register the camera sensor fixed-rate clock
  media: i2c: imx258: get clock from device properties and enable it via runtime PM
  media: i2c: imx258: simplify getting state container
  media: i2c: imx258: add support for binding via device tree
  media: dt-bindings: media: imx258: add bindings for IMX258 sensor
  ...
2021-02-21 14:10:36 -08:00