linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-28 07:01:32 +00:00

Author	SHA1	Message	Date
Peng Fan	85714108e6	drivers: base: dma-mapping: page align the size when unmap_kernel_range When dma_common_free_remap, the input parameter 'size' may not be page aligned. And, met kernel warning when doing iommu dma for usb on i.MX8 platform: " WARNING: CPU: 0 PID: 869 at mm/vmalloc.c:70 vunmap_page_range+0x1cc/0x1d0() Modules linked in: CPU: 0 PID: 869 Comm: kworker/u8:2 Not tainted 4.1.12-00444-gc5f9d1d-dirty #147 Hardware name: Freescale i.MX8DV Sabreauto (DT) Workqueue: ci_otg ci_otg_work Call trace: [<ffffffc000089920>] dump_backtrace+0x0/0x124 [<ffffffc000089a54>] show_stack+0x10/0x1c [<ffffffc0006d1e6c>] dump_stack+0x84/0xc8 [<ffffffc0000b4568>] warn_slowpath_common+0x98/0xd0 [<ffffffc0000b4664>] warn_slowpath_null+0x14/0x20 [<ffffffc000170348>] vunmap_page_range+0x1c8/0x1d0 [<ffffffc000170388>] unmap_kernel_range+0x20/0x88 [<ffffffc000460ad0>] dma_common_free_remap+0x74/0x84 [<ffffffc0000940d8>] __iommu_free_attrs+0x9c/0x178 [<ffffffc0005032bc>] ehci_mem_cleanup+0x140/0x194 [<ffffffc000503548>] ehci_stop+0x8c/0xdc [<ffffffc0004e8258>] usb_remove_hcd+0xf0/0x1cc [<ffffffc000516bc0>] host_stop+0x1c/0x58 [<ffffffc000514240>] ci_otg_work+0xdc/0x120 [<ffffffc0000c9c34>] process_one_work+0x134/0x33c [<ffffffc0000c9f78>] worker_thread+0x13c/0x47c [<ffffffc0000cf43c>] kthread+0xd8/0xf0 " For dma_common_pages_remap: dma_common_pages_remap \|->get_vm_area_caller \|->__get_vm_area_node \|->size = PAGE_ALIGN(size); Round up to page aligned So, in dma_common_free_remap, we also need a page aligned size, pass 'PAGE_ALIGN(size)' to unmap_kernel_range. Signed-off-by: Peng Fan <van.freenix@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-08-31 15:13:56 +02:00
Jerome Marchand	c90aab9c96	platform driver: fix use-after-free in platform_device_del() In platform_device_del(), the device is still used after a call to device_del(). At this point there is no guarantee that the device is still there and there could be a use-after-free access. Move the call to device_remove_properties() before device_del() to fix that. Signed-off-by: Jerome Marchand <jmarchan@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-08-31 15:13:56 +02:00
Rob Herring	bea5b158ff	driver core: add test of driver remove calls during probe In recent discussions on ksummit-discuss[1], it was suggested to do a sequence of probe, remove, probe for testing driver remove paths. This adds a kconfig option for said test. [1] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003459.html Suggested-by: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-08-31 15:13:55 +02:00
Ming Lei	cebf8fd169	driver core: fix race between creating/querying glue dir and its cleanup The global mutex of 'gdp_mutex' is used to serialize creating/querying glue dir and its cleanup. Turns out it isn't a perfect way because part(kobj_kset_leave()) of the actual cleanup action() is done inside the release handler of the glue dir kobject. That means gdp_mutex has to be held before releasing the last reference count of the glue dir kobject. This patch moves glue dir's cleanup after kobject_del() in device_del() for avoiding the race. Cc: Yijing Wang <wangyijing@huawei.com> Reported-by: Chandra Sekhar Lingutla <clingutla@codeaurora.org> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-08-31 15:13:55 +02:00
Paul E. McKenney	d7737ce964	PM / runtime: Add _rcuidle suffix to allow rpm_idle() use from idle This commit appends a few _rcuidle suffixes to fix the following RCU-used-from-idle bug: > =============================== > [ INFO: suspicious RCU usage. ] > 4.6.0-rc5-next-20160426+ #1116 Not tainted > ------------------------------- > include/trace/events/rpm.h:95 suspicious rcu_dereference_check() usage! > > other info that might help us debug this: > > > RCU used illegally from idle CPU! > rcu_scheduler_active = 1, debug_locks = 0 > RCU used illegally from extended quiescent state! > 1 lock held by swapper/0/0: > #0: (&(&dev->power.lock)->rlock){-.-...}, at: [<c052cc2c>] __rpm_callback+0x58/0x60 > > stack backtrace: > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1116 > Hardware name: Generic OMAP36xx (Flattened Device Tree) > [<c0110290>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14) > [<c010c3a8>] (show_stack) from [<c047fd68>] (dump_stack+0xb0/0xe4) > [<c047fd68>] (dump_stack) from [<c052d5d0>] (rpm_suspend+0x580/0x768) > [<c052d5d0>] (rpm_suspend) from [<c052ec58>] (__pm_runtime_suspend+0x64/0x84) > [<c052ec58>] (__pm_runtime_suspend) from [<c04bf25c>] (omap2_gpio_prepare_for_idle+0x5c/0x70) > [<c04bf25c>] (omap2_gpio_prepare_for_idle) from [<c0125568>] (omap_sram_idle+0x140/0x244) > [<c0125568>] (omap_sram_idle) from [<c01269dc>] (omap3_enter_idle_bm+0xfc/0x1ec) > [<c01269dc>] (omap3_enter_idle_bm) from [<c0601bdc>] (cpuidle_enter_state+0x80/0x3d4) > [<c0601bdc>] (cpuidle_enter_state) from [<c0183b08>] (cpu_startup_entry+0x198/0x3a0) > [<c0183b08>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8) > [<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c) In the immortal words of Steven Rostedt, "Whack Whack Whack!!!" Reported-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Guenter Roeck <linux@roeck-us.net> WhACKED-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-08-31 03:00:59 +02:00
Paul E. McKenney	d44c950e93	PM / runtime: Add _rcuidle suffix to allow rpm_resume() to be called from idle This commit applies another _rcuidle suffix to fix an RCU use from idle. > =============================== > [ INFO: suspicious RCU usage. ] > 4.6.0-rc5-next-20160426+ #1122 Not tainted > ------------------------------- > include/trace/events/rpm.h:69 suspicious rcu_dereference_check() usage! > > other info that might help us debug this: > > > RCU used illegally from idle CPU! > rcu_scheduler_active = 1, debug_locks = 0 > RCU used illegally from extended quiescent state! > 1 lock held by swapper/0/0: > #0: (&(&dev->power.lock)->rlock){-.-...}, at: [<c052e3dc>] __pm_runtime_resume+0x3c/0x64 > > stack backtrace: > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1122 > Hardware name: Generic OMAP36xx (Flattened Device Tree) > [<c0110290>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14) > [<c010c3a8>] (show_stack) from [<c047fd68>] (dump_stack+0xb0/0xe4) > [<c047fd68>] (dump_stack) from [<c052e178>] (rpm_resume+0x5cc/0x7f4) > [<c052e178>] (rpm_resume) from [<c052e3ec>] (__pm_runtime_resume+0x4c/0x64) > [<c052e3ec>] (__pm_runtime_resume) from [<c04bf2c4>] (omap2_gpio_resume_after_idle+0x54/0x68) > [<c04bf2c4>] (omap2_gpio_resume_after_idle) from [<c01269dc>] (omap3_enter_idle_bm+0xfc/0x1ec) > [<c01269dc>] (omap3_enter_idle_bm) from [<c060198c>] (cpuidle_enter_state+0x80/0x3d4) > [<c060198c>] (cpuidle_enter_state) from [<c0183b08>] (cpu_startup_entry+0x198/0x3a0) > [<c0183b08>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8) > [<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c) Reported-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-08-31 02:59:20 +02:00
Lukas Wunner	478573c93a	driver core: Don't leak secondary fwnode on device removal If device_add_property_set() is called for a device, a secondary fwnode is allocated and assigned to the device but currently not freed once the device is removed. This can be triggered on Apple Macs if a Thunderbolt device is plugged in on boot since Apple's NHI EFI driver sets a number of properties for that device which are leaked on unplug. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-08-31 00:28:29 +02:00
Elaine Zhang	815806e39b	regmap: drop cache if the bus transfer error regmap_write ->_regmap_raw_write -->regcache_write first and than use map->bus->write to wirte i2c or spi But if the i2c or spi transfer failed, But the cache is updated, So if I use regmap_read will get the cache data which is not the real register value. Signed-off-by: Elaine Zhang <zhangqing@rock-chips.com> Signed-off-by: Mark Brown <broonie@kernel.org>	2016-08-18 11:09:12 +01:00
Jon Hunter	8b0510b524	PM / Domains: Always enable debugfs support if available Debugfs support for PM domains is only enabled if both CONFIG_PM_DEBUG and CONFIG_PM_ADVANCED_DEBUG are enabled. CONFIG_PM_ADVANCED_DEBUG is described as "extra PM attributes in sysfs for low-level debugging/testing" which does not seem related. Given that the debugfs for PM domains only allows users to view the state of the PM domains, always enable debugfs support for PM domains if PM domains and debugfs support is enabled. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-08-12 02:27:53 +02:00
Cristian Birsan	359a2f1760	regmap: debugfs: Add support for dumping write only device registers Add support for dumping write only device registers in debugfs. This is useful for audio codecs that have write only registers (like WM8731). The logic that decides if a value can be printed is moved to regmap_printable() function to allow for easier future updates. Signed-off-by: Cristian Birsan <cristian.birsan@microchip.com> Signed-off-by: Mark Brown <broonie@kernel.org>	2016-08-09 13:43:33 +01:00
Cristian Birsan	1ea975cf1e	regmap: Add a function to check if a regmap register is cached Add a function to check if a regmap register is cached. This will be used in debugfs to dump the cached values of write only registers. Signed-off-by: Cristian Birsan <cristian.birsan@microchip.com> Signed-off-by: Mark Brown <broonie@kernel.org>	2016-08-09 13:43:33 +01:00
Linus Torvalds	11d8ec408d	More power management updates for v4.8-rc1 - Prevent the low-level assembly hibernate code on x86-64 from referring to __PAGE_OFFSET directly as a symbol which doesn't work when the kernel identity mapping base is randomized, in which case __PAGE_OFFSET is a variable (Rafael Wysocki). - Avoid selecting CPU_FREQ_STAT by default as the statistics are not required for proper cpufreq operation (Borislav Petkov). - Add Skylake-X and Broadwell-X IDs to the intel_pstate's list of processors where out-of-band (OBB) control of P-states is possible and if that is in use, intel_pstate should not attempt to manage P-states (Srinivas Pandruvada). - Drop some unnecessary checks from the wakeup IRQ handling code in the PM core (Markus Elfring). - Reduce the number operating performance point (OPP) lookups in one of the OPP framework's helper functions (Jisheng Zhang). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJXpKLNAAoJEILEb/54YlRx6L4P/39GR0kVB7vIyCajdWc4f3gh 0zh5RbKC0YT4F6upebzgQ9uYS9cv4y+df7ShVwKQAea3wDReEmZhM/egOGw0Ls+8 SS1MiJq1LSekyMIWH6cXwZsH69/V0LuWTBbzWYBgHUDbfEMlgwV5ZZMEH14/2bWw d4SLUUiW5P42im+IDAxpdYneKOrbJo3txj6WbOutgtIrHdPko6lF1dDouKvI1QTk zCBOkEB9nELq3rWN/sbPmHzbmbj/yFiiHk5+iqwuKKJZI8PQB9/C6Qmc3wvjtPpc GXLPI+OHqLgBMofGsiKOvm3hPQAIjf/ERsUilHLE3qOi/Mi0qj7U4dFivZrPWaCG j2bD+b36TffmG1r8L7NYbEKU60syeIFSRqAngbyswu6XF+NdVboaifENGcgM3tBC pUC1mFh/4PMKP5zW9mOwE6WSntZkw14CVR+A3fRFOuTBavNvjGdckwLi/aBsZdU3 K4DJUFzdELF7+JdqnQV35yV2tgMbJhxQa/QBykiFBh3AyqliOZ8uBoIxxLinrGmH XWR3kB4ZPRBIStGI9IpG3lNhLLU7mLIdaFhGayicicAwFcLsXN7oKWVzYfUqWA9T 1ptXApcRf6S9J1JnKHkznoQ1/D1pNYLbH+t7BOtWdK8SWBhMS2Lzo2kyKWsCFkJS 16J8j1tcLDhN1dvcBT55 =WvPB -----END PGP SIGNATURE----- Merge tag 'pm-extra-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "A few more fixes and cleanups in the x86-64 low-level hibernation code, PM core, cpufreq (Kconfig and intel_pstate), and the operating points framework. Specifics: - Prevent the low-level assembly hibernate code on x86-64 from referring to __PAGE_OFFSET directly as a symbol which doesn't work when the kernel identity mapping base is randomized, in which case __PAGE_OFFSET is a variable (Rafael Wysocki). - Avoid selecting CPU_FREQ_STAT by default as the statistics are not required for proper cpufreq operation (Borislav Petkov). - Add Skylake-X and Broadwell-X IDs to the intel_pstate's list of processors where out-of-band (OBB) control of P-states is possible and if that is in use, intel_pstate should not attempt to manage P-states (Srinivas Pandruvada). - Drop some unnecessary checks from the wakeup IRQ handling code in the PM core (Markus Elfring). - Reduce the number operating performance point (OPP) lookups in one of the OPP framework's helper functions (Jisheng Zhang)" * tag 'pm-extra-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: x86/power/64: Do not refer to __PAGE_OFFSET from assembly code cpufreq: Do not default-yes CPU_FREQ_STAT cpufreq: intel_pstate: Add more out-of-band IDs PM / OPP: optimize dev_pm_opp_set_rate() performance a bit PM-wakeup: Delete unnecessary checks before three function calls	2016-08-05 23:26:16 -04:00
Linus Torvalds	6c84239d59	RTC for 4.8 Cleanups: - huge cleanup of rtc-generic and char/genrtc this allowed to cleanup rtc-cmos, rtc-sh, rtc-m68k, rtc-powerpc and rtc-parisc - move mn10300 to rtc-cmos Subsystem: - fix wakealarms after hibernate - multiples fixes for rctest - simplify implementations of .read_alarm New drivers: - Maxim MAX6916 Drivers: - ds1307: fix weekday - m41t80: add wakeup support - pcf85063: add support for PCF85063A variant - rv8803: extend i2c fix and other fixes - s35390a: fix alarm reading, this fixes instant reboot after shutdown for QNAP TS-41x - s3c: clock fixes -----BEGIN PGP SIGNATURE----- iQIcBAABCgAGBQJXokhIAAoJENiigzvaE+LCZqQP+wWzintN/N1u3dKiVB7iSdwq +S/jAXD9wW8OK9PI60/YUGRYeUXmZW9t4XYg1VKCxU9KpVC17LgOtDyXD8BufP1V uREJEzZw9O7zCCjeHp/ICFjBkc62Net6ZDOO+ZyXPNfddpS1Xq1uUgXLZc/202UR ID/kewu0pJRDnoxyqznWn9+8D33w/ygXs2slY2Ive0ONtjdgxGcsj2rNbb2RYn2z OP7br3lLg7qkFh4TtXb61eh/9GYIk6wzP/CrX5l/jH4SjQnrIk5g/X/Cd1qQ/qso JZzFoonOKvIp5Gw/+fZ9NP3YFcnkoRMv4NjZV8PAmsYLds+ibRiBcoB8u6FmiJV7 WW5uopgPkfCGN5BV3+QHwJDVe+WlgnlzaT5zPUCcP5KWusDts4fWIgzP7vrtAzf4 3OJLrgSGdBeOqWnJD21nxKUD27JOseX7D+BFtwxR4lMsXHqlHJfETpZ8gts1ZGH3 2U353j/jkZvGWmc6dMcuxOXT2K4VqpYeIIqs0IcLu6hM9crtR89zPR2Iu1AilfDW h2NroF+Q//SgMMzWoTEG6Tn7RAc7MthgA/tRCFZF9CBMzNs988w0CTHnKsIHmjpU UKkMeJGAC9YrPYIcqrg0oYsmLUWXc8JuZbGJBnei3BzbaMTlcwIN9qj36zfq6xWc TMLpbWEoIsgFIZMP/hAP =rpGB -----END PGP SIGNATURE----- Merge tag 'rtc-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "RTC for 4.8 Cleanups: - huge cleanup of rtc-generic and char/genrtc this allowed to cleanup rtc-cmos, rtc-sh, rtc-m68k, rtc-powerpc and rtc-parisc - move mn10300 to rtc-cmos Subsystem: - fix wakealarms after hibernate - multiples fixes for rctest - simplify implementations of .read_alarm New drivers: - Maxim MAX6916 Drivers: - ds1307: fix weekday - m41t80: add wakeup support - pcf85063: add support for PCF85063A variant - rv8803: extend i2c fix and other fixes - s35390a: fix alarm reading, this fixes instant reboot after shutdown for QNAP TS-41x - s3c: clock fixes" * tag 'rtc-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (65 commits) rtc: rv8803: Clear V1F when setting the time rtc: rv8803: Stop the clock while setting the time rtc: rv8803: Always apply the I²C workaround rtc: rv8803: Fix read day of week rtc: rv8803: Remove the check for valid time rtc: rv8803: Kconfig: Indicate rx8900 support rtc: asm9260: remove .owner field for driver rtc: at91sam9: Fix missing spin_lock_init() rtc: m41t80: add suspend handlers for alarm IRQ rtc: m41t80: make it a real error message rtc: pcf85063: Add support for the PCF85063A device rtc: pcf85063: fix year range rtc: hym8563: in .read_alarm set .tm_sec to 0 to signal minute accuracy rtc: explicitly set tm_sec = 0 for drivers with minute accurancy rtc: s3c: Add s3c_rtc_{enable/disable}_clk in s3c_rtc_setfreq() rtc: s3c: Remove unnecessary call to disable already disabled clock rtc: abx80x: use devm_add_action_or_reset() rtc: m41t80: use devm_add_action_or_reset() rtc: fix a typo and reduce three empty lines to one rtc: s35390a: improve two comments in .set_alarm ...	2016-08-05 09:48:22 -04:00
Rafael J. Wysocki	e2b3b80de5	Merge branches 'pm-sleep', 'pm-cpufreq', 'pm-core' and 'pm-opp' * pm-sleep: x86/power/64: Do not refer to __PAGE_OFFSET from assembly code * pm-cpufreq: cpufreq: Do not default-yes CPU_FREQ_STAT cpufreq: intel_pstate: Add more out-of-band IDs * pm-core: PM-wakeup: Delete unnecessary checks before three function calls * pm-opp: PM / OPP: optimize dev_pm_opp_set_rate() performance a bit	2016-08-05 15:46:55 +02:00
Lars-Peter Clausen	1bc8da4e14	regmap: rbtree: Avoid overlapping nodes When searching for a suitable node that should be used for inserting a new register, which does not fall within the range of any existing node, we not only looks for nodes which are directly adjacent to the new register, but for nodes within a certain proximity. This is done to avoid creating lots of small nodes with just a few registers spacing in between, which would increase memory usage as well as tree traversal time. This means there might be multiple node candidates which fall within the proximity range of the new register. If we choose the first node we encounter, under certain register insertion patterns it is possible to end up with overlapping ranges. This will break order in the rbtree and can cause the cached register value to become corrupted. E.g. take the simplified example where the proximity range is 2 and the register insertion sequence is 1, 4, 2, 3, 5. * Insert of register 1 creates a new node, this is the root of the rbtree * Insert of register 4 creates a new node, which is inserted to the right of the root. * Insert of register 2 gets inserted to the first node * Insert of register 3 gets inserted to the first node * Insert of register 5 also gets inserted into the first node since this is the first node encountered and it is within the proximity range. Now there are two overlapping nodes. To avoid this always choose the node that is closest to the new register. This will ensure that nodes will not overlap. The tree traversal is still done as a binary search, we just don't stop at the first node found. So the complexity of the algorithm stays within the same order. Ideally if a new register is in the range of two adjacent blocks those blocks should be merged, but that is a much more invasive change and left for later. The issue was initially introduced in commit `472fdec738` ("regmap: rbtree: Reduce number of nodes, take 2"), but became much more exposed by commit `6399aea629` ("regmap: rbtree: When adding a reg do a bsearch for target node") which changed the order in which nodes are looked-up. Fixes: `6399aea629` ("regmap: rbtree: When adding a reg do a bsearch for target node") Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Mark Brown <broonie@kernel.org>	2016-08-04 16:55:26 +01:00
Stephen Boyd	a098ecd2fa	firmware: support loading into a pre-allocated buffer Some systems are memory constrained but they need to load very large firmwares. The firmware subsystem allows drivers to request this firmware be loaded from the filesystem, but this requires that the entire firmware be loaded into kernel memory first before it's provided to the driver. This can lead to a situation where we map the firmware twice, once to load the firmware into kernel memory and once to copy the firmware into the final resting place. This creates needless memory pressure and delays loading because we have to copy from kernel memory to somewhere else. Let's add a request_firmware_into_buf() API that allows drivers to request firmware be loaded directly into a pre-allocated buffer. This skips the intermediate step of allocating a buffer in kernel memory to hold the firmware image while it's read from the filesystem. It also requires that drivers know how much memory they'll require before requesting the firmware and negates any benefits of firmware caching because the firmware layer doesn't manage the buffer lifetime. For a 16MB buffer, about half the time is spent performing a memcpy from the buffer to the final resting place. I see loading times go from 0.081171 seconds to 0.047696 seconds after applying this patch. Plus the vmalloc pressure is reduced. This is based on a patch from Vikram Mulukutla on codeaurora.org: https://www.codeaurora.org/cgit/quic/la/kernel/msm-3.18/commit/drivers/base/firmware_class.c?h=rel/msm-3.18&id=0a328c5f6cd999f5c591f172216835636f39bcb5 Link: http://lkml.kernel.org/r/20160607164741.31849-4-stephen.boyd@linaro.org Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Vikram Mulukutla <markivx@codeaurora.org> Cc: Mark Brown <broonie@kernel.org> Cc: Ming Lei <ming.lei@canonical.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-08-02 19:35:10 -04:00
Vikram Mulukutla	0e742e9275	firmware: provide infrastructure to make fw caching optional Some low memory systems with complex peripherals cannot afford to have the relatively large firmware images taking up valuable memory during suspend and resume. Change the internal implementation of firmware_class to disallow caching based on a configurable option. In the near future, variants of request_firmware will take advantage of this feature. Link: http://lkml.kernel.org/r/20160607164741.31849-3-stephen.boyd@linaro.org [stephen.boyd@linaro.org: Drop firmware_desc design and use flags] Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org> Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Ming Lei <ming.lei@canonical.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-08-02 19:35:09 -04:00
Stephen Boyd	9ccf981198	firmware: consolidate kmap/read/write logic Some systems are memory constrained but they need to load very large firmwares. The firmware subsystem allows drivers to request this firmware be loaded from the filesystem, but this requires that the entire firmware be loaded into kernel memory first before it's provided to the driver. This can lead to a situation where we map the firmware twice, once to load the firmware into kernel memory and once to copy the firmware into the final resting place. This design creates needless memory pressure and delays loading because we have to copy from kernel memory to somewhere else. This patch sets adds support to the request firmware API to load the firmware directly into a pre-allocated buffer, skipping the intermediate copying step and alleviating memory pressure during firmware loading. The drawback is that we can't use the firmware caching feature because the memory for the firmware cache is not managed by the firmware layer. This patch (of 3): We use similar structured code to read and write the kmapped firmware pages. The only difference is read copies from the kmap region and write copies to it. Consolidate this into one function to reduce duplication. Link: http://lkml.kernel.org/r/20160607164741.31849-2-stephen.boyd@linaro.org Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org> Cc: Vikram Mulukutla <markivx@codeaurora.org> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Ming Lei <ming.lei@canonical.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-08-02 19:35:09 -04:00
Fabian Frederick	bd721ea73e	treewide: replace obsolete _refok by __ref There was only one use of __initdata_refok and __exit_refok __init_refok was used 46 times against 82 for __ref. Those definitions are obsolete since commit `312b1485fb` ("Introduce new section reference annotations tags: __ref, __refdata, __refconst") This patch removes the following compatibility definitions and replaces them treewide. /* compatibility defines */ #define __init_refok __ref #define __initdata_refok __refdata #define __exit_refok __ref I can also provide separate patches if necessary. (One patch per tree and check in 1 month or 2 to remove old definitions) [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/1466796271-3043-1-git-send-email-fabf@skynet.be Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Ingo Molnar <mingo@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-08-02 17:31:41 -04:00
Linus Torvalds	c9b95e5961	sound updates for 4.8 Majority of this update is about ASoC, including a few new drivers, and the rest are mostly minor changes. The only substantial change in ALSA core is about the additional error handling in the compress-offload API. Below are highlights: - Add the error propagating support in compress-offload API - HD-audio: a usual Dell headset fixup, an Intel HDMI/DP fix, and the default mixer setup change ot turn off the loopback - Lots of updates for ASoC Intel drivers, mostly board support and bug fixing, and to the NAU8825 driver - Work on generalizing bits of simple-card to allow more code sharing with the Renesas rsrc-card (which can't use simple-card due to DPCM) - Removal of the Odroid X2 driver due to replacement with simple-card - Support for several new Mediatek platforms and associated boards - New ASoC drivers for Allwinner A10, Analog Devices ADAU7002, Broadcom Cygnus, Cirrus Logic CS35L33 and CS53L30, Maxim MAX8960 and MAX98504, Realtek RT5514 and Wolfson WM8758 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIrBAABCAAVBQJXnaohDhx0aXdhaUBzdXNlLmRlAAoJEGwxgFQ9KSmkdh0P+wde 9YXRP7lugnai2keRR5ibev118qHvwcoLycdNSSutjydK1gbZtRNcVOiAO9KRKqxs nv4lM7TduFHDjsBrD26iMxFHqdnyuI2Xd5nddpPfTteCLtLLu72RYrRrubZB/SC9 Pi2PwMg+FkwaAG+/HXcWFIDNIxqhUXg2jHIhWlvWyvEvIZgJntvZaD1vw3NQqmE/ EAbafpdNXewY2IPxsQ+4zVD6JUTh4RTY790R4vZptO7E0oRksmffzAbX+XiYpiam pVwZAhyj05jy4DGsio89ufA51EC58oV8xfo6jxQh6wiXjr7ArhBh/60EoV7VMYh7 aouFQ+EOox3p/rcoyaaHbGROnCk0WIc2milQHM41F3Q6iC0fxOep14kSjxOXXqYg tgIjduptExh9XsEdpTCHfnmOyXEq+7PFbqmfluoKvG1q//k4u3d8SEicBfJMuWO+ Fb/v2ngii0r3D7rwsl4ONEShJdVd+mbpg2d6DoOu1/UYMfTGaxR/vF3DD7gCmn0A qikZkOnmmVttDVc8YlsmMZyo2ATU3AWsnNhdZGDGGT5IaAppZ+h3H5JZKNhxo5my cK6JbSMCljkmcc8GM990P6BNngya3doiu5aox8hM2uWugHcWYvjyoigGUhGVJdlT DgGtxlZ+5dQptdP0gAe5qO/Fzn9gfwvZQZxgSdaS =zx3K -----END PGP SIGNATURE----- Merge tag 'sound-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "The majority of this update is about ASoC, including a few new drivers, and the rest are mostly minor changes. The only substantial change in ALSA core is about the additional error handling in the compress-offload API. Below are highlights: - Add the error propagating support in compress-offload API - HD-audio: a usual Dell headset fixup, an Intel HDMI/DP fix, and the default mixer setup change ot turn off the loopback - Lots of updates for ASoC Intel drivers, mostly board support and bug fixing, and to the NAU8825 driver - Work on generalizing bits of simple-card to allow more code sharing with the Renesas rsrc-card (which can't use simple-card due to DPCM) - Removal of the Odroid X2 driver due to replacement with simple-card - Support for several new Mediatek platforms and associated boards - New ASoC drivers for Allwinner A10, Analog Devices ADAU7002, Broadcom Cygnus, Cirrus Logic CS35L33 and CS53L30, Maxim MAX8960 and MAX98504, Realtek RT5514 and Wolfson WM8758" * tag 'sound-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (278 commits) sound: oss: Use kernel_read_file_from_path() for mod_firmware_load() ASoC: Intel: Skylake: Delete an unnecessary check before the function call "release_firmware" ASoC: Intel: Skylake: Fix NULL Pointer exception in dynamic_debug. ASoC: samsung: Specify DMA channels through struct snd_dmaengine_pcm_config ASoC: samsung: Fix error paths in the I2S driver's probe() ASoC: cs53l30: Fix bit shift issue of TDM mode ASoC: cs53l30: Fix a bug for TDM slot location validation ASoC: rockchip: correct the spdif clk ALSA: echoaudio: purge contradictions between dimension matrix members and total number of members ASoC: rsrc-card: use asoc_simple_card_parse_card_name() ASoC: rsrc-card: use asoc_simple_dai instead of rsrc_card_dai ASoC: rsrc-card: use asoc_simple_card_parse_dailink_name() ASoC: simple-card: use asoc_simple_card_parse_card_name() ASoC: simple-card-utils: add asoc_simple_card_parse_card_name() ASoC: simple-card: use asoc_simple_card_parse_dailink_name() ASoC: simple-card-utils: add asoc_simple_card_set_dailink_name() ASoC: nau8825: drop redundant idiom when converting integer to boolean ASoC: nau8825: jack connection decision with different insertion logic ASoC: mediatek: Add HDMI dai-links to the mt8173-rt5650 machine driver ASoC: mediatek: mt2701: fix non static symbol warning ...	2016-07-31 02:25:02 -07:00
Maarten ter Huurne	b2c7f5d9c9	regmap: cache: Fix num_reg_defaults computation from reg_defaults_raw In `3245d460` (regmap: cache: Fall back to register by register read for cache defaults) non-readable registers are skipped when initializing reg_defaults, but are still included in num_reg_defaults. So there can be uninitialized entries at the end of reg_defaults, which can cause problems when the register cache initializes from the full array. Fixed it by excluding non-readable registers from the count as well. Signed-off-by: Maarten ter Huurne <maarten@treewalker.org> Signed-off-by: Mark Brown <broonie@kernel.org>	2016-07-29 23:02:19 +01:00
Andy Lutomirski	d30dd8be06	mm: track NR_KERNEL_STACK in KiB instead of number of stacks Currently, NR_KERNEL_STACK tracks the number of kernel stacks in a zone. This only makes sense if each kernel stack exists entirely in one zone, and allowing vmapped stacks could break this assumption. Since frv has THREAD_SIZE < PAGE_SIZE, we need to track kernel stack allocations in a unit that divides both THREAD_SIZE and PAGE_SIZE on all architectures. Keep it simple and use KiB. Link: http://lkml.kernel.org/r/083c71e642c5fa5f1b6898902e1b2db7b48940d4.1468523549.git.luto@kernel.org Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Reviewed-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-07-28 16:07:41 -07:00
Mel Gorman	11fb998986	mm: move most file-based accounting to the node There are now a number of accounting oddities such as mapped file pages being accounted for on the node while the total number of file pages are accounted on the zone. This can be coped with to some extent but it's confusing so this patch moves the relevant file-based accounted. Due to throttling logic in the page allocator for reliable OOM detection, it is still necessary to track dirty and writeback pages on a per-zone basis. [mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting] Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-07-28 16:07:41 -07:00
Mel Gorman	4b9d0fab71	mm: rename NR_ANON_PAGES to NR_ANON_MAPPED NR_FILE_PAGES is the number of file pages. NR_FILE_MAPPED is the number of mapped file pages. NR_ANON_PAGES is the number of mapped anon pages. This is unhelpful naming as it's easy to confuse NR_FILE_MAPPED and NR_ANON_PAGES for mapped pages. This patch renames NR_ANON_PAGES so we have NR_FILE_PAGES is the number of file pages. NR_FILE_MAPPED is the number of mapped file pages. NR_ANON_MAPPED is the number of mapped anon pages. Link: http://lkml.kernel.org/r/1467970510-21195-19-git-send-email-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-07-28 16:07:41 -07:00
Mel Gorman	50658e2e04	mm: move page mapped accounting to the node Reclaim makes decisions based on the number of pages that are mapped but it's mixing node and zone information. Account NR_FILE_MAPPED and NR_ANON_PAGES pages on the node. Link: http://lkml.kernel.org/r/1467970510-21195-18-git-send-email-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-07-28 16:07:41 -07:00
Mel Gorman	599d0c954f	mm, vmscan: move LRU lists to node This moves the LRU lists from the zone to the node and related data such as counters, tracing, congestion tracking and writeback tracking. Unfortunately, due to reclaim and compaction retry logic, it is necessary to account for the number of LRU pages on both zone and node logic. Most reclaim logic is based on the node counters but the retry logic uses the zone counters which do not distinguish inactive and active sizes. It would be possible to leave the LRU counters on a per-zone basis but it's a heavier calculation across multiple cache lines that is much more frequent than the retry checks. Other than the LRU counters, this is mostly a mechanical patch but note that it introduces a number of anomalies. For example, the scans are per-zone but using per-node counters. We also mark a node as congested when a zone is congested. This causes weird problems that are fixed later but is easier to review. In the event that there is excessive overhead on 32-bit systems due to the nodes being on LRU then there are two potential solutions 1. Long-term isolation of highmem pages when reclaim is lowmem When pages are skipped, they are immediately added back onto the LRU list. If lowmem reclaim persisted for long periods of time, the same highmem pages get continually scanned. The idea would be that lowmem keeps those pages on a separate list until a reclaim for highmem pages arrives that splices the highmem pages back onto the LRU. It potentially could be implemented similar to the UNEVICTABLE list. That would reduce the skip rate with the potential corner case is that highmem pages have to be scanned and reclaimed to free lowmem slab pages. 2. Linear scan lowmem pages if the initial LRU shrink fails This will break LRU ordering but may be preferable and faster during memory pressure than skipping LRU pages. Link: http://lkml.kernel.org/r/1467970510-21195-4-git-send-email-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-07-28 16:07:41 -07:00
Mel Gorman	75ef718405	mm, vmstat: add infrastructure for per-node vmstats Patchset: "Move LRU page reclaim from zones to nodes v9" This series moves LRUs from the zones to the node. While this is a current rebase, the test results were based on mmotm as of June 23rd. Conceptually, this series is simple but there are a lot of details. Some of the broad motivations for this are; 1. The residency of a page partially depends on what zone the page was allocated from. This is partially combatted by the fair zone allocation policy but that is a partial solution that introduces overhead in the page allocator paths. 2. Currently, reclaim on node 0 behaves slightly different to node 1. For example, direct reclaim scans in zonelist order and reclaims even if the zone is over the high watermark regardless of the age of pages in that LRU. Kswapd on the other hand starts reclaim on the highest unbalanced zone. A difference in distribution of file/anon pages due to when they were allocated results can result in a difference in again. While the fair zone allocation policy mitigates some of the problems here, the page reclaim results on a multi-zone node will always be different to a single-zone node. it was scheduled on as a result. 3. kswapd and the page allocator scan zones in the opposite order to avoid interfering with each other but it's sensitive to timing. This mitigates the page allocator using pages that were allocated very recently in the ideal case but it's sensitive to timing. When kswapd is allocating from lower zones then it's great but during the rebalancing of the highest zone, the page allocator and kswapd interfere with each other. It's worse if the highest zone is small and difficult to balance. 4. slab shrinkers are node-based which makes it harder to identify the exact relationship between slab reclaim and LRU reclaim. The reason we have zone-based reclaim is that we used to have large highmem zones in common configurations and it was necessary to quickly find ZONE_NORMAL pages for reclaim. Today, this is much less of a concern as machines with lots of memory will (or should) use 64-bit kernels. Combinations of 32-bit hardware and 64-bit hardware are rare. Machines that do use highmem should have relatively low highmem:lowmem ratios than we worried about in the past. Conceptually, moving to node LRUs should be easier to understand. The page allocator plays fewer tricks to game reclaim and reclaim behaves similarly on all nodes. The series has been tested on a 16 core UMA machine and a 2-socket 48 core NUMA machine. The UMA results are presented in most cases as the NUMA machine behaved similarly. pagealloc --------- This is a microbenchmark that shows the benefit of removing the fair zone allocation policy. It was tested uip to order-4 but only orders 0 and 1 are shown as the other orders were comparable. 4.7.0-rc4 4.7.0-rc4 mmotm-20160623 nodelru-v9 Min total-odr0-1 490.00 ( 0.00%) 457.00 ( 6.73%) Min total-odr0-2 347.00 ( 0.00%) 329.00 ( 5.19%) Min total-odr0-4 288.00 ( 0.00%) 273.00 ( 5.21%) Min total-odr0-8 251.00 ( 0.00%) 239.00 ( 4.78%) Min total-odr0-16 234.00 ( 0.00%) 222.00 ( 5.13%) Min total-odr0-32 223.00 ( 0.00%) 211.00 ( 5.38%) Min total-odr0-64 217.00 ( 0.00%) 208.00 ( 4.15%) Min total-odr0-128 214.00 ( 0.00%) 204.00 ( 4.67%) Min total-odr0-256 250.00 ( 0.00%) 230.00 ( 8.00%) Min total-odr0-512 271.00 ( 0.00%) 269.00 ( 0.74%) Min total-odr0-1024 291.00 ( 0.00%) 282.00 ( 3.09%) Min total-odr0-2048 303.00 ( 0.00%) 296.00 ( 2.31%) Min total-odr0-4096 311.00 ( 0.00%) 309.00 ( 0.64%) Min total-odr0-8192 316.00 ( 0.00%) 314.00 ( 0.63%) Min total-odr0-16384 317.00 ( 0.00%) 315.00 ( 0.63%) Min total-odr1-1 742.00 ( 0.00%) 712.00 ( 4.04%) Min total-odr1-2 562.00 ( 0.00%) 530.00 ( 5.69%) Min total-odr1-4 457.00 ( 0.00%) 433.00 ( 5.25%) Min total-odr1-8 411.00 ( 0.00%) 381.00 ( 7.30%) Min total-odr1-16 381.00 ( 0.00%) 356.00 ( 6.56%) Min total-odr1-32 372.00 ( 0.00%) 346.00 ( 6.99%) Min total-odr1-64 372.00 ( 0.00%) 343.00 ( 7.80%) Min total-odr1-128 375.00 ( 0.00%) 351.00 ( 6.40%) Min total-odr1-256 379.00 ( 0.00%) 351.00 ( 7.39%) Min total-odr1-512 385.00 ( 0.00%) 355.00 ( 7.79%) Min total-odr1-1024 386.00 ( 0.00%) 358.00 ( 7.25%) Min total-odr1-2048 390.00 ( 0.00%) 362.00 ( 7.18%) Min total-odr1-4096 390.00 ( 0.00%) 362.00 ( 7.18%) Min total-odr1-8192 388.00 ( 0.00%) 363.00 ( 6.44%) This shows a steady improvement throughout. The primary benefit is from reduced system CPU usage which is obvious from the overall times; 4.7.0-rc4 4.7.0-rc4 mmotm-20160623nodelru-v8 User 189.19 191.80 System 2604.45 2533.56 Elapsed 2855.30 2786.39 The vmstats also showed that the fair zone allocation policy was definitely removed as can be seen here; 4.7.0-rc3 4.7.0-rc3 mmotm-20160623 nodelru-v8 DMA32 allocs 28794729769 0 Normal allocs 48432501431 77227309877 Movable allocs 0 0 tiobench on ext4 ---------------- tiobench is a benchmark that artifically benefits if old pages remain resident while new pages get reclaimed. The fair zone allocation policy mitigates this problem so pages age fairly. While the benchmark has problems, it is important that tiobench performance remains constant as it implies that page aging problems that the fair zone allocation policy fixes are not re-introduced. 4.7.0-rc4 4.7.0-rc4 mmotm-20160623 nodelru-v9 Min PotentialReadSpeed 89.65 ( 0.00%) 90.21 ( 0.62%) Min SeqRead-MB/sec-1 82.68 ( 0.00%) 82.01 ( -0.81%) Min SeqRead-MB/sec-2 72.76 ( 0.00%) 72.07 ( -0.95%) Min SeqRead-MB/sec-4 75.13 ( 0.00%) 74.92 ( -0.28%) Min SeqRead-MB/sec-8 64.91 ( 0.00%) 65.19 ( 0.43%) Min SeqRead-MB/sec-16 62.24 ( 0.00%) 62.22 ( -0.03%) Min RandRead-MB/sec-1 0.88 ( 0.00%) 0.88 ( 0.00%) Min RandRead-MB/sec-2 0.95 ( 0.00%) 0.92 ( -3.16%) Min RandRead-MB/sec-4 1.43 ( 0.00%) 1.34 ( -6.29%) Min RandRead-MB/sec-8 1.61 ( 0.00%) 1.60 ( -0.62%) Min RandRead-MB/sec-16 1.80 ( 0.00%) 1.90 ( 5.56%) Min SeqWrite-MB/sec-1 76.41 ( 0.00%) 76.85 ( 0.58%) Min SeqWrite-MB/sec-2 74.11 ( 0.00%) 73.54 ( -0.77%) Min SeqWrite-MB/sec-4 80.05 ( 0.00%) 80.13 ( 0.10%) Min SeqWrite-MB/sec-8 72.88 ( 0.00%) 73.20 ( 0.44%) Min SeqWrite-MB/sec-16 75.91 ( 0.00%) 76.44 ( 0.70%) Min RandWrite-MB/sec-1 1.18 ( 0.00%) 1.14 ( -3.39%) Min RandWrite-MB/sec-2 1.02 ( 0.00%) 1.03 ( 0.98%) Min RandWrite-MB/sec-4 1.05 ( 0.00%) 0.98 ( -6.67%) Min RandWrite-MB/sec-8 0.89 ( 0.00%) 0.92 ( 3.37%) Min RandWrite-MB/sec-16 0.92 ( 0.00%) 0.93 ( 1.09%) 4.7.0-rc4 4.7.0-rc4 mmotm-20160623 approx-v9 User 645.72 525.90 System 403.85 331.75 Elapsed 6795.36 6783.67 This shows that the series has little or not impact on tiobench which is desirable and a reduction in system CPU usage. It indicates that the fair zone allocation policy was removed in a manner that didn't reintroduce one class of page aging bug. There were only minor differences in overall reclaim activity 4.7.0-rc4 4.7.0-rc4 mmotm-20160623nodelru-v8 Minor Faults 645838 647465 Major Faults 573 640 Swap Ins 0 0 Swap Outs 0 0 DMA allocs 0 0 DMA32 allocs 46041453 44190646 Normal allocs 78053072 79887245 Movable allocs 0 0 Allocation stalls 24 67 Stall zone DMA 0 0 Stall zone DMA32 0 0 Stall zone Normal 0 2 Stall zone HighMem 0 0 Stall zone Movable 0 65 Direct pages scanned 10969 30609 Kswapd pages scanned 93375144 93492094 Kswapd pages reclaimed 93372243 93489370 Direct pages reclaimed 10969 30609 Kswapd efficiency 99% 99% Kswapd velocity 13741.015 13781.934 Direct efficiency 100% 100% Direct velocity 1.614 4.512 Percentage direct scans 0% 0% kswapd activity was roughly comparable. There were differences in direct reclaim activity but negligible in the context of the overall workload (velocity of 4 pages per second with the patches applied, 1.6 pages per second in the baseline kernel). pgbench read-only large configuration on ext4 --------------------------------------------- pgbench is a database benchmark that can be sensitive to page reclaim decisions. This also checks if removing the fair zone allocation policy is safe pgbench Transactions 4.7.0-rc4 4.7.0-rc4 mmotm-20160623 nodelru-v8 Hmean 1 188.26 ( 0.00%) 189.78 ( 0.81%) Hmean 5 330.66 ( 0.00%) 328.69 ( -0.59%) Hmean 12 370.32 ( 0.00%) 380.72 ( 2.81%) Hmean 21 368.89 ( 0.00%) 369.00 ( 0.03%) Hmean 30 382.14 ( 0.00%) 360.89 ( -5.56%) Hmean 32 428.87 ( 0.00%) 432.96 ( 0.95%) Negligible differences again. As with tiobench, overall reclaim activity was comparable. bonnie++ on ext4 ---------------- No interesting performance difference, negligible differences on reclaim stats. paralleldd on ext4 ------------------ This workload uses varying numbers of dd instances to read large amounts of data from disk. 4.7.0-rc3 4.7.0-rc3 mmotm-20160623 nodelru-v9 Amean Elapsd-1 186.04 ( 0.00%) 189.41 ( -1.82%) Amean Elapsd-3 192.27 ( 0.00%) 191.38 ( 0.46%) Amean Elapsd-5 185.21 ( 0.00%) 182.75 ( 1.33%) Amean Elapsd-7 183.71 ( 0.00%) 182.11 ( 0.87%) Amean Elapsd-12 180.96 ( 0.00%) 181.58 ( -0.35%) Amean Elapsd-16 181.36 ( 0.00%) 183.72 ( -1.30%) 4.7.0-rc4 4.7.0-rc4 mmotm-20160623 nodelru-v9 User 1548.01 1552.44 System 8609.71 8515.08 Elapsed 3587.10 3594.54 There is little or no change in performance but some drop in system CPU usage. 4.7.0-rc3 4.7.0-rc3 mmotm-20160623 nodelru-v9 Minor Faults 362662 367360 Major Faults 1204 1143 Swap Ins 22 0 Swap Outs 2855 1029 DMA allocs 0 0 DMA32 allocs 31409797 28837521 Normal allocs 46611853 49231282 Movable allocs 0 0 Direct pages scanned 0 0 Kswapd pages scanned 40845270 40869088 Kswapd pages reclaimed 40830976 40855294 Direct pages reclaimed 0 0 Kswapd efficiency 99% 99% Kswapd velocity 11386.711 11369.769 Direct efficiency 100% 100% Direct velocity 0.000 0.000 Percentage direct scans 0% 0% Page writes by reclaim 2855 1029 Page writes file 0 0 Page writes anon 2855 1029 Page reclaim immediate 771 1628 Sector Reads 293312636 293536360 Sector Writes 18213568 18186480 Page rescued immediate 0 0 Slabs scanned 128257 132747 Direct inode steals 181 56 Kswapd inode steals 59 1131 It basically shows that kswapd was active at roughly the same rate in both kernels. There was also comparable slab scanning activity and direct reclaim was avoided in both cases. There appears to be a large difference in numbers of inodes reclaimed but the workload has few active inodes and is likely a timing artifact. stutter ------- stutter simulates a simple workload. One part uses a lot of anonymous memory, a second measures mmap latency and a third copies a large file. The primary metric is checking for mmap latency. stutter 4.7.0-rc4 4.7.0-rc4 mmotm-20160623 nodelru-v8 Min mmap 16.6283 ( 0.00%) 13.4258 ( 19.26%) 1st-qrtle mmap 54.7570 ( 0.00%) 34.9121 ( 36.24%) 2nd-qrtle mmap 57.3163 ( 0.00%) 46.1147 ( 19.54%) 3rd-qrtle mmap 58.9976 ( 0.00%) 47.1882 ( 20.02%) Max-90% mmap 59.7433 ( 0.00%) 47.4453 ( 20.58%) Max-93% mmap 60.1298 ( 0.00%) 47.6037 ( 20.83%) Max-95% mmap 73.4112 ( 0.00%) 82.8719 (-12.89%) Max-99% mmap 92.8542 ( 0.00%) 88.8870 ( 4.27%) Max mmap 1440.6569 ( 0.00%) 121.4201 ( 91.57%) Mean mmap 59.3493 ( 0.00%) 42.2991 ( 28.73%) Best99%Mean mmap 57.2121 ( 0.00%) 41.8207 ( 26.90%) Best95%Mean mmap 55.9113 ( 0.00%) 39.9620 ( 28.53%) Best90%Mean mmap 55.6199 ( 0.00%) 39.3124 ( 29.32%) Best50%Mean mmap 53.2183 ( 0.00%) 33.1307 ( 37.75%) Best10%Mean mmap 45.9842 ( 0.00%) 20.4040 ( 55.63%) Best5%Mean mmap 43.2256 ( 0.00%) 17.9654 ( 58.44%) Best1%Mean mmap 32.9388 ( 0.00%) 16.6875 ( 49.34%) This shows a number of improvements with the worst-case outlier greatly improved. Some of the vmstats are interesting 4.7.0-rc4 4.7.0-rc4 mmotm-20160623nodelru-v8 Swap Ins 163 502 Swap Outs 0 0 DMA allocs 0 0 DMA32 allocs 618719206 1381662383 Normal allocs 891235743 564138421 Movable allocs 0 0 Allocation stalls 2603 1 Direct pages scanned 216787 2 Kswapd pages scanned 50719775 41778378 Kswapd pages reclaimed 41541765 41777639 Direct pages reclaimed 209159 0 Kswapd efficiency 81% 99% Kswapd velocity 16859.554 14329.059 Direct efficiency 96% 0% Direct velocity 72.061 0.001 Percentage direct scans 0% 0% Page writes by reclaim 6215049 0 Page writes file 6215049 0 Page writes anon 0 0 Page reclaim immediate 70673 90 Sector Reads 81940800 81680456 Sector Writes 100158984 98816036 Page rescued immediate 0 0 Slabs scanned 1366954 22683 While this is not guaranteed in all cases, this particular test showed a large reduction in direct reclaim activity. It's also worth noting that no page writes were issued from reclaim context. This series is not without its hazards. There are at least three areas that I'm concerned with even though I could not reproduce any problems in that area. 1. Reclaim/compaction is going to be affected because the amount of reclaim is no longer targetted at a specific zone. Compaction works on a per-zone basis so there is no guarantee that reclaiming a few THP's worth page pages will have a positive impact on compaction success rates. 2. The Slab/LRU reclaim ratio is affected because the frequency the shrinkers are called is now different. This may or may not be a problem but if it is, it'll be because shrinkers are not called enough and some balancing is required. 3. The anon/file reclaim ratio may be affected. Pages about to be dirtied are distributed between zones and the fair zone allocation policy used to do something very similar for anon. The distribution is now different but not necessarily in any way that matters but it's still worth bearing in mind. VM statistic counters for reclaim decisions are zone-based. If the kernel is to reclaim on a per-node basis then we need to track per-node statistics but there is no infrastructure for that. The most notable change is that the old node_page_state is renamed to sum_zone_node_page_state. The new node_page_state takes a pglist_data and uses per-node stats but none exist yet. There is some renaming such as vm_stat to vm_zone_stat and the addition of vm_node_stat and the renaming of mod_state to mod_zone_state. Otherwise, this is mostly a mechanical patch with no functional change. There is a lot of similarity between the node and zone helpers which is unfortunate but there was no obvious way of reusing the code and maintaining type safety. Link: http://lkml.kernel.org/r/1467970510-21195-2-git-send-email-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Rik van Riel <riel@surriel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-07-28 16:07:41 -07:00
Jisheng Zhang	067b7ce083	PM / OPP: optimize dev_pm_opp_set_rate() performance a bit In dev_pm_opp_set_rate(), _find_opp_table() is called 4 times: once by _get_opp_clk(), once by dev_pm_opp_set_rate() itself, and twice by dev_pm_opp_find_freq_ceil(). If there are several opp_tables in the system, three times of opp table finding is a big waste. This patch reduced the call of _find_opp_table() to twice. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-28 23:54:16 +02:00
Markus Elfring	4f48ec8a1c	PM-wakeup: Delete unnecessary checks before three function calls The following functions test whether their argument is NULL and then return immediately. * dev_pm_arm_wake_irq * dev_pm_disarm_wake_irq * wakeup_source_unregister Thus the test around the calls is not needed. This issue was detected by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Pavel Machek <pavel@ucw.cz> [ rjw: Minor whitespace adjustments ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-07-28 23:51:04 +02:00
Linus Torvalds	0e06f5c0de	Merge branch 'akpm' (patches from Andrew) Merge updates from Andrew Morton: - a few misc bits - ocfs2 - most(?) of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (125 commits) thp: fix comments of __pmd_trans_huge_lock() cgroup: remove unnecessary 0 check from css_from_id() cgroup: fix idr leak for the first cgroup root mm: memcontrol: fix documentation for compound parameter mm: memcontrol: remove BUG_ON in uncharge_list mm: fix build warnings in <linux/compaction.h> mm, thp: convert from optimistic swapin collapsing to conservative mm, thp: fix comment inconsistency for swapin readahead functions thp: update Documentation/{vm/transhuge,filesystems/proc}.txt shmem: split huge pages beyond i_size under memory pressure thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE khugepaged: add support of collapse for tmpfs/shmem pages shmem: make shmem_inode_info::lock irq-safe khugepaged: move up_read(mmap_sem) out of khugepaged_alloc_page() thp: extract khugepaged from mm/huge_memory.c shmem, thp: respect MADV_{NO,}HUGEPAGE for file mappings shmem: add huge pages support shmem: get_unmapped_area align huge page shmem: prepare huge= mount option and sysfs knob mm, rmap: account shmem thp pages ...	2016-07-26 19:55:54 -07:00
Linus Torvalds	ae9799975c	regmap: Updates for v4.8 Several small updates and API enhancements: - Provide transparent unrolling of bulk writes into individual writes so they can be used with devices without raw formatting. - Fix compatibility between I2C controllers supporting block commands and devices with more than 8 bit wide registers. - Add some helpers for iopoll-like functionality and workarounds for weird interrupt controllers. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJXljTLAAoJECTWi3JdVIfQMSMH/1MRirJkwIds0SRbWcnG9hOe nMfUhkuFuCZJN82GwOh7YbAxLB1pGXKsGZq1XGMYFvwJawEizR3o83fgcRy1LQmG KOQHO1O/CWQqRo1ntGl33H2thhq6e67y6lpeMBhdMvp3VjqdyzlILNSva6iWbW/R N0iFSSKd4VdHlg/o+R3k7EzWCLZ4FZI7B9/5OjxWkoUR+tloNI8rR8Ecwl6+Q37K 8+yo6R0Ntx/JXB7JvA5Y8vqPdUGi68+XpcnaAcpM5DaL95+lqzduWCjKfjG5oatZ 3o/ORRVi+WyD95/NLXwKFMvSDSGokVDrU87HNgKph+f4sNF5TIoC/E+JhPUQBSU= =7BlM -----END PGP SIGNATURE----- Merge tag 'regmap-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap updates from Mark Brown: "Several small updates and API enhancements: - provide transparent unrolling of bulk writes into individual writes so they can be used with devices without raw formatting. - fix compatibility between I2C controllers supporting block commands and devices with more than 8 bit wide registers. - add some helpers for iopoll-like functionality and workarounds for weird interrupt controllers" * tag 'regmap-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: add iopoll-like polling macro regmap: Support bulk writes for devices without raw formatting regmap-i2c: Use i2c block command only if register value width is 8 bit regmap: irq: Add support to call client specific pre/post interrupt service regmap: Add file patterns for regmap device tree bindings	2016-07-26 19:24:33 -07:00
Linus Torvalds	6453dbdda3	Power management material for v4.8-rc1 - Rework the cpufreq governor interface to make it more straightforward and modify the conservative governor to avoid using transition notifications (Rafael Wysocki). - Rework the handling of frequency tables by the cpufreq core to make it more efficient (Viresh Kumar). - Modify the schedutil governor to reduce the number of wakeups it causes to occur in cases when the CPU frequency doesn't need to be changed (Steve Muckle, Viresh Kumar). - Fix some minor issues and clean up code in the cpufreq core and governors (Rafael Wysocki, Viresh Kumar). - Add Intel Broxton support to the intel_pstate driver (Srinivas Pandruvada). - Fix problems related to the config TDP feature and to the validity of the MSR_HWP_INTERRUPT register in intel_pstate (Jan Kiszka, Srinivas Pandruvada). - Make intel_pstate update the cpu_frequency tracepoint even if the frequency doesn't change to avoid confusing powertop (Rafael Wysocki). - Clean up the usage of __init/__initdata in intel_pstate, mark some of its internal variables as __read_mostly and drop an unused structure element from it (Jisheng Zhang, Carsten Emde). - Clean up the usage of some duplicate MSR symbols in intel_pstate and turbostat (Srinivas Pandruvada). - Update/fix the powernv, s3c24xx and mvebu cpufreq drivers (Akshay Adiga, Viresh Kumar, Ben Dooks). - Fix a regression (introduced during the 4.5 cycle) in the pcc-cpufreq driver by reverting the problematic commit (Andreas Herrmann). - Add support for Intel Denverton to intel_idle, clean up Broxton support in it and make it explicitly non-modular (Jacob Pan, Jan Beulich, Paul Gortmaker). - Add support for Denverton and Ivy Bridge server to the Intel RAPL power capping driver and make it more careful about the handing of MSRs that may not be present (Jacob Pan, Xiaolong Wang). - Fix resume from hibernation on x86-64 by making the CPU offline during resume avoid using MONITOR/MWAIT in the "play dead" loop which may lead to an inadvertent "revival" of a "dead" CPU and a page fault leading to a kernel crash from it (Rafael Wysocki). - Make memory management during resume from hibernation more straightforward (Rafael Wysocki). - Add debug features that should help to detect problems related to hibernation and resume from it (Rafael Wysocki, Chen Yu). - Clean up hibernation core somewhat (Rafael Wysocki). - Prevent KASAN from instrumenting the hibernation core which leads to large numbers of false-positives from it (James Morse). - Prevent PM (hibernate and suspend) notifiers from being called during the cleanup phase if they have not been called during the corresponding preparation phase which is possible if one of the other notifiers returns an error at that time (Lianwei Wang). - Improve suspend-related debug printout in the tasks freezer and clean up suspend-related console handling (Roger Lu, Borislav Petkov). - Update the AnalyzeSuspend script in the kernel sources to version 4.2 (Todd Brandt). - Modify the generic power domains framework to make it handle system suspend/resume better (Ulf Hansson). - Make the runtime PM framework avoid resuming devices synchronously when user space changes the runtime PM settings for them and improve its error reporting (Rafael Wysocki, Linus Walleij). - Fix error paths in devfreq drivers (exynos, exynos-ppmu, exynos-bus) and in the core, make some devfreq code explicitly non-modular and change some of it into tristate (Bartlomiej Zolnierkiewicz, Peter Chen, Paul Gortmaker). - Add DT support to the generic PM clocks management code and make it export some more symbols (Jon Hunter, Paul Gortmaker). - Make the PCI PM core code slightly more robust against possible driver errors (Andy Shevchenko). - Make it possible to change DESTDIR and PREFIX in turbostat (Andy Shevchenko). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJXl7/dAAoJEILEb/54YlRx+VgQAIQJOWvxKew3Yl02c/sdj9OT 5VNnFrzGzdcAPofvvG9qGq8B0Es1vYehJpwwOB21ri8EvYv0riIiU1yrqslObojQ oaZOkSBpbIoKjGR4CpYA/A+feE+8EqIBdPGd+lx5a6oRdUi7tRVHBG9lyLO3FB/i jan1q8dMpZsmu+Y+rVVHGnCVuIlIEqr2ZnZfCwDAulO2Arp/QFAh4kH08ELATvrl bkPa25vq7/VMP/vCDzrfZKD5mUuKogIRu/J5wx4py1nE+FB35cKKyqBOgklLwAeY UI8vjDhr/myNUs54AZlktOkq47TCYvjvhX9kmOxBjuWqFbRusU012IRek1fYPRIV ZqbkqNX7UEVQwunAEg9AyFwyzEtOht93dQDT5RLEd4QzKuM76gmHpLeTGGMzE+nu FnmF9JGl4DVwqpZl9yU2+hR2Mt3bP8OF8qYmNiGUB3KO4emPslhSd+6y8liA5Bx2 SJf0Gb//vaHCh3/uMnwAonYPqRkZvBLOMwuL1VUjNQfRMnQtDdgHMYB1aT/EglPA 8ww6j4J8rVRLAxvYQ3UEmNA/vBNclKXblRR18+JddEZP9/oX0ATfwnCCUpr839uk xxyQhrm4/AI60+PHWCX4GG80YrKdOGTkF7LXCQZanVWjjuyF17rufegZ2YWLT07v JU1Cmumfdy2jJluT8xsR =uVGz -----END PGP SIGNATURE----- Merge tag 'pm-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "Again, the majority of changes go into the cpufreq subsystem, but there are no big features this time. The cpufreq changes that stand out somewhat are the governor interface rework and improvements related to the handling of frequency tables. Apart from those, there are fixes and new device/CPU IDs in drivers, cleanups and an improvement of the new schedutil governor. Next, there are some changes in the hibernation core, including a fix for a nasty problem related to the MONITOR/MWAIT usage by CPU offline during resume from hibernation, a few core improvements related to memory management during resume, a couple of additional debug features and cleanups. Finally, we have some fixes and cleanups in the devfreq subsystem, generic power domains framework improvements related to system suspend/resume, support for some new chips in intel_idle and in the power capping RAPL driver, a new version of the AnalyzeSuspend utility and some assorted fixes and cleanups. Specifics: - Rework the cpufreq governor interface to make it more straightforward and modify the conservative governor to avoid using transition notifications (Rafael Wysocki). - Rework the handling of frequency tables by the cpufreq core to make it more efficient (Viresh Kumar). - Modify the schedutil governor to reduce the number of wakeups it causes to occur in cases when the CPU frequency doesn't need to be changed (Steve Muckle, Viresh Kumar). - Fix some minor issues and clean up code in the cpufreq core and governors (Rafael Wysocki, Viresh Kumar). - Add Intel Broxton support to the intel_pstate driver (Srinivas Pandruvada). - Fix problems related to the config TDP feature and to the validity of the MSR_HWP_INTERRUPT register in intel_pstate (Jan Kiszka, Srinivas Pandruvada). - Make intel_pstate update the cpu_frequency tracepoint even if the frequency doesn't change to avoid confusing powertop (Rafael Wysocki). - Clean up the usage of __init/__initdata in intel_pstate, mark some of its internal variables as __read_mostly and drop an unused structure element from it (Jisheng Zhang, Carsten Emde). - Clean up the usage of some duplicate MSR symbols in intel_pstate and turbostat (Srinivas Pandruvada). - Update/fix the powernv, s3c24xx and mvebu cpufreq drivers (Akshay Adiga, Viresh Kumar, Ben Dooks). - Fix a regression (introduced during the 4.5 cycle) in the pcc-cpufreq driver by reverting the problematic commit (Andreas Herrmann). - Add support for Intel Denverton to intel_idle, clean up Broxton support in it and make it explicitly non-modular (Jacob Pan, Jan Beulich, Paul Gortmaker). - Add support for Denverton and Ivy Bridge server to the Intel RAPL power capping driver and make it more careful about the handing of MSRs that may not be present (Jacob Pan, Xiaolong Wang). - Fix resume from hibernation on x86-64 by making the CPU offline during resume avoid using MONITOR/MWAIT in the "play dead" loop which may lead to an inadvertent "revival" of a "dead" CPU and a page fault leading to a kernel crash from it (Rafael Wysocki). - Make memory management during resume from hibernation more straightforward (Rafael Wysocki). - Add debug features that should help to detect problems related to hibernation and resume from it (Rafael Wysocki, Chen Yu). - Clean up hibernation core somewhat (Rafael Wysocki). - Prevent KASAN from instrumenting the hibernation core which leads to large numbers of false-positives from it (James Morse). - Prevent PM (hibernate and suspend) notifiers from being called during the cleanup phase if they have not been called during the corresponding preparation phase which is possible if one of the other notifiers returns an error at that time (Lianwei Wang). - Improve suspend-related debug printout in the tasks freezer and clean up suspend-related console handling (Roger Lu, Borislav Petkov). - Update the AnalyzeSuspend script in the kernel sources to version 4.2 (Todd Brandt). - Modify the generic power domains framework to make it handle system suspend/resume better (Ulf Hansson). - Make the runtime PM framework avoid resuming devices synchronously when user space changes the runtime PM settings for them and improve its error reporting (Rafael Wysocki, Linus Walleij). - Fix error paths in devfreq drivers (exynos, exynos-ppmu, exynos-bus) and in the core, make some devfreq code explicitly non-modular and change some of it into tristate (Bartlomiej Zolnierkiewicz, Peter Chen, Paul Gortmaker). - Add DT support to the generic PM clocks management code and make it export some more symbols (Jon Hunter, Paul Gortmaker). - Make the PCI PM core code slightly more robust against possible driver errors (Andy Shevchenko). - Make it possible to change DESTDIR and PREFIX in turbostat (Andy Shevchenko)" * tag 'pm-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (89 commits) Revert "cpufreq: pcc-cpufreq: update default value of cpuinfo_transition_latency" PM / hibernate: Introduce test_resume mode for hibernation cpufreq: export cpufreq_driver_resolve_freq() cpufreq: Disallow ->resolve_freq() for drivers providing ->target_index() PCI / PM: check all fields in pci_set_platform_pm() cpufreq: acpi-cpufreq: use cached frequency mapping when possible cpufreq: schedutil: map raw required frequency to driver frequency cpufreq: add cpufreq_driver_resolve_freq() cpufreq: intel_pstate: Check cpuid for MSR_HWP_INTERRUPT intel_pstate: Update cpu_frequency tracepoint every time cpufreq: intel_pstate: clean remnant struct element PM / tools: scripts: AnalyzeSuspend v4.2 x86 / hibernate: Use hlt_play_dead() when resuming from hibernation cpufreq: powernv: Replacing pstate_id with frequency table index intel_pstate: Fix MSR_CONFIG_TDP_x addressing in core_get_max_pstate() PM / hibernate: Image data protection during restoration PM / hibernate: Add missing braces in __register_nosave_region() PM / hibernate: Clean up comments in snapshot.c PM / hibernate: Clean up function headers in snapshot.c PM / hibernate: Add missing braces in hibernate_setup() ...	2016-07-26 17:29:07 -07:00
Kirill A. Shutemov	65c453778a	mm, rmap: account shmem thp pages Let's add ShmemHugePages and ShmemPmdMapped fields into meminfo and smaps. It indicates how many times we allocate and map shmem THP. NR_ANON_TRANSPARENT_HUGEPAGES is renamed to NR_ANON_THPS. Link: http://lkml.kernel.org/r/1466021202-61880-27-git-send-email-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-07-26 16:19:19 -07:00
Reza Arbab	a371d9f1cc	memory-hotplug: use zone_can_shift() for sysfs valid_zones attribute Since zone_can_shift() is being used to validate the target zone during onlining, it should also be used to determine the content of valid_zones. Link: http://lkml.kernel.org/r/1462816419-4479-4-git-send-email-arbab@linux.vnet.ibm.com Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> Reviewd-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Daniel Kiper <daniel.kiper@oracle.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Andrew Banman <abanman@sgi.com> Cc: Chen Yucong <slaoub@gmail.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Zhang Zhen <zhenzhang.zhang@huawei.com> Cc: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-07-26 16:19:19 -07:00
Linus Torvalds	015cd867e5	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "There are a couple of new things for s390 with this merge request: - a new scheduling domain "drawer" is added to reflect the unusual topology found on z13 machines. Performance tests showed up to 8 percent gain with the additional domain. - the new crc-32 checksum crypto module uses the vector-galois-field multiply and sum SIMD instruction to speed up crc-32 and crc-32c. - proper __ro_after_init support, this requires RO_AFTER_INIT_DATA in the generic vmlinux.lds linker script definitions. - kcov instrumentation support. A prerequisite for that is the inline assembly basic block cleanup, which is the reason for the net/iucv/iucv.c change. - support for 2GB pages is added to the hugetlbfs backend. Then there are two removals: - the oprofile hardware sampling support is dead code and is removed. The oprofile user space uses the perf interface nowadays. - the ETR clock synchronization is removed, this has been superseeded be the STP clock synchronization. And it always has been "interesting" code.. And the usual bug fixes and cleanups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (82 commits) s390/pci: Delete an unnecessary check before the function call "pci_dev_put" s390/smp: clean up a condition s390/cio/chp : Remove deprecated create_singlethread_workqueue s390/chsc: improve channel path descriptor determination s390/chsc: sanitize fmt check for chp_desc determination s390/cio: make fmt1 channel path descriptor optional s390/chsc: fix ioctl CHSC_INFO_CU command s390/cio/device_ops: fix kernel doc s390/cio: allow to reset channel measurement block s390/console: Make preferred console handling more consistent s390/mm: fix gmap tlb flush issues s390/mm: add support for 2GB hugepages s390: have unique symbol for __switch_to address s390/cpuinfo: show maximum thread id s390/ptrace: clarify bits in the per_struct s390: stack address vs thread_info s390: remove pointless load within __switch_to s390: enable kcov support s390/cpumf: use basic block for ecctr inline assembly s390/hypfs: use basic block for diag inline assembly ...	2016-07-26 12:22:51 -07:00
Rafael J. Wysocki	fa70db3f19	Merge branches 'pm-core', 'pm-clk', 'pm-domains' and 'pm-pci' * pm-core: PM / runtime: Asynchronous "idle" in pm_runtime_allow() PM / runtime: print error when activating a child to unactive parent * pm-clk: PM / clk: Add support for adding a specific clock from device-tree PM / clk: export symbols for existing pm_clk_<...> API fcns * pm-domains: PM / Domains: Convert pm_genpd_init() to return an error code PM / Domains: Stop/start devices during system PM suspend/resume in genpd PM / Domains: Allow runtime PM during system PM phases PM / Runtime: Avoid resuming devices again in pm_runtime_force_resume() PM / Domains: Remove redundant pm_request_idle() call in genpd PM / Domains: Remove redundant wrapper functions for system PM PM / Domains: Allow genpd to power on during system PM phases * pm-pci: PCI / PM: check all fields in pci_set_platform_pm()	2016-07-25 13:45:27 +02:00
Mark Brown	3ceeda1cbe	Merge remote-tracking branches 'asoc/topic/cs53l30', 'asoc/topic/cygnus', 'asoc/topic/da7219' and 'asoc/topic/davinci' into asoc-next	2016-07-24 22:07:31 +01:00
Mark Brown	efeb1a3ab9	Merge remote-tracking branches 'regmap/topic/bulk', 'regmap/topic/i2c', 'regmap/topic/iopoll', 'regmap/topic/irq' and 'regmap/topic/maintainers' into regmap-next	2016-07-15 13:44:47 +01:00
Rafael J. Wysocki	fe7450b05f	PM / runtime: Asynchronous "idle" in pm_runtime_allow() Arjan reports that it takes a relatively long time to enable runtime PM for multiple devices at system startup, because all writes to the "control" attribute in sysfs are handled synchronously and if the device is suspended as a result of the write, it will block until that operation is complete. That may be avoided by passing the RPM_ASYNC flag to rpm_idle() in pm_runtime_allow() which will make it execute the device's "idle" callback asynchronously, so writes to "control" changing it from "on" to "auto" will return without waiting. Reported-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com>	2016-07-02 01:50:39 +02:00
Chen-Yu Tsai	5bf75b4497	regmap: Support bulk writes for devices without raw formatting When doing a bulk writes from a device which lacks raw I/O support we fall back to doing register at a time reads but we still use the raw formatters in order to render the data into the word size used by the device (since bulk reads still operate on the device word size rather than unsigned ints). This means that devices without raw formatting such as those that provide reg_read() are not supported. Provide handling for them by copying the values read into native endian values of the appropriate size. This complements commit `d5b98eb124` ("regmap: Support bulk reads for devices without raw formatting"). Signed-off-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Mark Brown <broonie@kernel.org>	2016-06-29 19:48:00 +01:00
Ulf Hansson	7eb231c337	PM / Domains: Convert pm_genpd_init() to return an error code The are already cases when pm_genpd_init() can fail. Currently we hide the failures instead of propagating an error code, which is a better method. Moreover, to prepare for future changes like moving away from using a fixed array-size of the struct genpd_power_state, to instead dynamically allocate data for it, the pm_genpd_init() API needs to be able to return an error code, as allocation can fail. Current users of the pm_genpd_init() is thus requested to start dealing with error codes. In the transition phase, users will have to live with only error messages being printed to log. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Acked-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-29 02:15:19 +02:00
Jon Hunter	498b5fdd40	PM / clk: Add support for adding a specific clock from device-tree Some drivers using the PM clocks framework need to add specific clocks from device-tree using a name by calling the functions of_clk_get_by_name() and then pm_clk_add_clk(). Rather than having drivers call both functions, add a helper function of_pm_clk_add_clk() that will call these functions so drivers can call a single function to add a specific clock from device-tree. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-28 00:42:10 +02:00
Linus Walleij	71723f9546	PM / runtime: print error when activating a child to unactive parent The code currently silently bails out with -EBUSY if you try to activate a child to an inactive parent. This typically happens when you have a runtime suspended parent and runtime resume your child, but forgot to set .ignore_children on the parent to true with pm_suspend_ignore_children(dev). Silently ignoring this error is not good as it gives rise to other strange behaviour like double-resume of devices after silently bailing out of the .runtime_resume() callback. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-28 00:40:30 +02:00
Adam Thomson	613e97218c	device property: Add function to search for named child of device For device nodes in both DT and ACPI, it possible to have named child nodes which contain properties (an existing example being gpio-leds). This adds a function to find a named child node for a device which can be used by drivers for property retrieval. For DT data node name matching, of_node_cmp() and similar functions are made available outside of CONFIG_OF block so the new function can reference these for DT and non-DT builds. For ACPI data node name matching, a helper function is also added which returns false if CONFIG_ACPI is not set, otherwise it performs a string comparison on the data node name. This avoids using the acpi_data_node struct for non CONFIG_ACPI builds, which would otherwise cause a build failure. Signed-off-by: Adam Thomson <Adam.Thomson.Opensource@diasemi.com> Acked-by: Sathyanarayana Nujella <sathyanarayana.nujella@intel.com> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Mark Brown <broonie@kernel.org>	2016-06-26 12:39:03 +01:00
Guenter Roeck	d4ef930638	regmap-i2c: Use i2c block command only if register value width is 8 bit Chips with 16-bit registers don't usually work well with I2C block commands. For example, neither the LM75 datasheet nor the TMP102 datasheet mentions block command support, and in fact it does not work for any of those chips. Also, it is not clear how the block command would handle 16-bit SMBus operations in the fist place, since the data format associated with those commands is either little endian or big endian, which requires some kind of conversion to or from host byte order. Only use i2c block commands if both register and value width is 8 bit. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Mark Brown <broonie@kernel.org>	2016-06-22 13:50:51 +01:00
Linus Torvalds	607117a153	Driver core fixes for 4.7-rc4 Here are a small number of debugfs, ISA, and one driver core fix for 4.7-rc4. All of these resolve reported issues. The ISA ones have spent the least amount of time in linux-next, sorry about that, I didn't realize they were regressions that needed to get in now (thanks to Thorsten for the prodding!) but they do all pass the 0-day bot tests. The others have been in linux-next for a while now. Full details about them are in the shortlog below. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAldlbeQACgkQMUfUDdst+ymnFACfaWhEKA/84jwNNHiim92diJrY zYsAoLOmpBw68yL6qTSZbcWJF4Flb6Xk =N8M2 -----END PGP SIGNATURE----- Merge tag 'driver-core-4.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are a small number of debugfs, ISA, and one driver core fix for 4.7-rc4. All of these resolve reported issues. The ISA ones have spent the least amount of time in linux-next, sorry about that, I didn't realize they were regressions that needed to get in now (thanks to Thorsten for the prodding!) but they do all pass the 0-day bot tests. The others have been in linux-next for a while now. Full details about them are in the shortlog below" * tag 'driver-core-4.7-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: isa: Dummy isa_register_driver should return error code isa: Call isa_bus_init before dependent ISA bus drivers register watchdog: ebc-c384_wdt: Allow build for X86_64 iio: stx104: Allow build for X86_64 gpio: Allow PC/104 devices on X86_64 isa: Allow ISA-style drivers on modern systems base: make module_create_drivers_dir race-free debugfs: open_proxy_open(): avoid double fops release debugfs: full_proxy_open(): free proxy on ->open() failure kernel/kcov: unproxify debugfs file's fops	2016-06-18 06:04:01 -10:00
William Breathitt Gray	32a5a0c047	isa: Call isa_bus_init before dependent ISA bus drivers register The isa_bus_init function must be called before drivers which utilize the ISA bus driver are registered. A race condition for initilization exists if device_initcall is used (the isa_bus_init callback is placed in the same initcall level as dependent drivers which use module_init). This patch ensures that isa_bus_init is called first by utilizing postcore_initcall in favor of device_initcall. Fixes: `a5117ba7da` ("[PATCH] Driver model: add ISA bus") Cc: Rene Herman <rene.herman@keyaccess.nl> Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-06-17 20:47:11 -07:00
William Breathitt Gray	3a4955111a	isa: Allow ISA-style drivers on modern systems Several modern devices, such as PC/104 cards, are expected to run on modern systems via an ISA bus interface. Since ISA is a legacy interface for most modern architectures, ISA support should remain disabled in general. Support for ISA-style drivers should be enabled on a per driver basis. To allow ISA-style drivers on modern systems, this patch introduces the ISA_BUS_API and ISA_BUS Kconfig options. The ISA bus driver will now build conditionally on the ISA_BUS_API Kconfig option, which defaults to the legacy ISA Kconfig option. The ISA_BUS Kconfig option allows the ISA_BUS_API Kconfig option to be selected on architectures which do not enable ISA (e.g. X86_64). The ISA_BUS Kconfig option is currently only implemented for X86 architectures. Other architectures may have their own ISA_BUS Kconfig options added as required. Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-06-17 20:21:12 -07:00
Rafael J. Wysocki	9d066a2527	Merge branches 'pm-opp' and 'pm-cpufreq-fixes' * pm-opp: PM / OPP: Add 'UNKNOWN' status for shared_opp in struct opp_table * pm-cpufreq-fixes: cpufreq: intel_pstate: Adjust _PSS[0] freqeuency if needed	2016-06-18 01:55:13 +02:00
Viresh Kumar	79ee2e8f73	PM / OPP: Add 'UNKNOWN' status for shared_opp in struct opp_table dev_pm_opp_get_sharing_cpus() returns 0 even in the case when the OPP core doesn't know whether or not the table is shared. It works on the majority of platforms, where the OPP table is never created before invoking the function and then -ENODEV is returned by it. But in the case of one platform (Jetson TK1) at least, the situation is a bit different. The OPP table has been created (somehow) before dev_pm_opp_get_sharing_cpus() is called and it returns 0. Its caller treats that as 'the CPUs don't share OPPs' and that leads to degraded performance. Fix this by converting 'shared_opp' in struct opp_table to an enum and making dev_pm_opp_get_sharing_cpus() return -EINVAL in case when the value of that field is "access unknown", so that the caller can handle it accordingly (cpufreq-dt considers that as 'all CPUs share the table', for example). Fixes: `6f707daa38` "PM / OPP: Add dev_pm_opp_get_sharing_cpus()" Reported-and-tested-by: Alexandre Courbot <acourbot@nvidia.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw : Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-16 15:50:36 +02:00
Ulf Hansson	122a22377a	PM / Domains: Stop/start devices during system PM suspend/resume in genpd Not all subsystems/drivers that manages devices attached to a genpd makes use of the pm_runtime_force_suspend\|resume() helper functions to deal with system PM suspend/resume. In cases like these and when genpd's ->stop\|start() callbacks are used for the device, invoke the pm_runtime_force_suspend\|resume() helper functions from genpd's "noirq" system PM callbacks. In this way we make sure to "stop" the device on suspend and to "start" it on resume. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-16 15:14:37 +02:00
Ulf Hansson	4d23a5e848	PM / Domains: Allow runtime PM during system PM phases In cases when a PM domain isn't powered off when genpd's ->prepare() callback is invoked, genpd runtime resumes and disables runtime PM for the device. This behaviour was needed when genpd managed intermediate states during the power off sequence, as to maintain proper low power states of devices during system PM suspend/resume. Commit `ba2bbfbf63` (PM / Domains: Remove intermediate states from the power off sequence), enables genpd to improve its behaviour in that respect. The PM core disables runtime PM at __device_suspend_late() before it calls a system PM "late" callback for a device. When resuming a device, after a corresponding "early" callback has been invoked, the PM core re-enables runtime PM. By changing genpd to allow runtime PM according to the same system PM phases as the PM core, devices can be runtime resumed by their corresponding subsystem/driver when really needed. In this way, genpd no longer need to runtime resume the device from its ->prepare() callback. In most cases that avoids unnecessary and energy- wasting operations of runtime resuming devices that have nothing to do, only to runtime suspend them shortly after. Although, because of changing this behaviour in genpd and due to that genpd powers on the PM domain unconditionally in the system PM resume "noirq" phase, it could potentially cause a PM domain to stay powered on even if it's unused after the system has resumed. To avoid this, schedule a power off work when genpd's system PM ->complete() callback has been invoked for the last device in the PM domain. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-16 15:14:36 +02:00
Ulf Hansson	9f5b52747d	PM / Runtime: Avoid resuming devices again in pm_runtime_force_resume() If the runtime PM status of the device isn't RPM_SUSPENDED, prevent the pm_runtime_force_resume() from invoking the ->runtime_resume() callback for the device, as it's not the expected behaviour from the subsystem/driver. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-16 15:14:36 +02:00
Ulf Hansson	9b002b8f0e	PM / Domains: Remove redundant pm_request_idle() call in genpd The PM core increases the runtime PM usage count at the system PM prepare phase. Later when the system resumes, it does a pm_runtime_put() in the complete phase, which in addition to decrementing the usage count, does the equivalent of a pm_request_idle(). Therefore the call to pm_request_idle() from within genpd's ->complete() callback is redundant, so remove it. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-16 15:14:34 +02:00
Ulf Hansson	8001885389	PM / Domains: Remove redundant wrapper functions for system PM Due to the previous changes in genpd, which removed the suspend_power_off flag, several of the system PM callbacks no longer do any additional checks but only invoke corresponding pm_generic_* helper functions. To clean up the code, drop these wrapper functions as they have become redundant. Instead, assign the system PM callbacks directly to the pm_generic_*() helper functions. While changing this, it has bocame clear that some of the current system PM callbacks in genpd invoke wrong driver callbacks. For example, the genpd's ->restore() callback invokes pm_generic_resume(), while that should be pm_generic_restore(). Fix that as well. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-16 15:01:43 +02:00
Ulf Hansson	39dd0f234f	PM / Domains: Allow genpd to power on during system PM phases If a PM domain is powered off when the first device starts its system PM prepare phase, genpd prevents any further attempts to power on the PM domain during the following system PM phases. Not until the system PM complete phase is finalized for all devices in the PM domain, genpd again allows it to be powered on. This behaviour needs to be changed, as a subsystem/driver for a device in the same PM domain may still need to be able to serve requests in some of the system PM phases. Accordingly, it may need to runtime resume its device and thus also request the corresponding PM domain to be powered on. To deal with these scenarios, let's make the device operational in the system PM prepare phase by runtime resuming it, no matter if the PM domain is powered on or off. Changing this also enables us to remove genpd's suspend_power_off flag, as it's being used to track this condition. Additionally, we must allow the PM domain to be powered on via runtime PM during the system PM phases. This change also requires a fix in the AMD ACP (Audio CoProcessor) drm driver. It registers a genpd to model the ACP as a PM domain, but unfortunately it's also abuses genpd's "internal" suspend_power_off flag to deal with a corner case at system PM resume. More precisely, the so called SMU block powers on the ACP at system PM resume, unconditionally if it's being used or not. This may lead to that genpd's internal status of the power state, may not correctly reflect the power state of the HW after a system PM resume. Because of changing the behaviour of genpd, by runtime resuming devices in the prepare phase, the AMD ACP drm driver no longer have to deal with this corner case. So let's just drop the related code in this driver. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Acked-by: Maruthi Bayyavarapu <maruthi.bayyavarapu@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-16 15:01:43 +02:00
Jiri Slaby	7e1b1fc4da	base: make module_create_drivers_dir race-free Modules which register drivers via standard path (driver_register) in parallel can cause a warning: WARNING: CPU: 2 PID: 3492 at ../fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80 sysfs: cannot create duplicate filename '/module/saa7146/drivers' Modules linked in: hexium_gemini(+) mxb(+) ... ... Call Trace: ... [<ffffffff812e63a2>] sysfs_warn_dup+0x62/0x80 [<ffffffff812e6487>] sysfs_create_dir_ns+0x77/0x90 [<ffffffff8140f2c4>] kobject_add_internal+0xb4/0x340 [<ffffffff8140f5b8>] kobject_add+0x68/0xb0 [<ffffffff8140f631>] kobject_create_and_add+0x31/0x70 [<ffffffff8157a703>] module_add_driver+0xc3/0xd0 [<ffffffff8155e5d4>] bus_add_driver+0x154/0x280 [<ffffffff815604c0>] driver_register+0x60/0xe0 [<ffffffff8145bed0>] __pci_register_driver+0x60/0x70 [<ffffffffa0273e14>] saa7146_register_extension+0x64/0x90 [saa7146] [<ffffffffa0033011>] hexium_init_module+0x11/0x1000 [hexium_gemini] ... As can be (mostly) seen, driver_register causes this call sequence: -> bus_add_driver -> module_add_driver -> module_create_drivers_dir The last one creates "drivers" directory in /sys/module/<...>. When this is done in parallel, the directory is attempted to be created twice at the same time. This can be easily reproduced by loading mxb and hexium_gemini in parallel: while :; do modprobe mxb & modprobe hexium_gemini wait rmmod mxb hexium_gemini saa7146_vv saa7146 done saa7146 calls pci_register_driver for both mxb and hexium_gemini, which means /sys/module/saa7146/drivers is to be created for both of them. Fix this by a new mutex in module_create_drivers_dir which makes the test-and-create "drivers" dir atomic. I inverted the condition and removed 'return' to avoid multiple unlocks or a goto. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Fixes: `fe480a2675` (Modules: only add drivers/ direcory if needed) Cc: v2.6.21+ <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-06-15 19:21:31 -07:00
Paul Gortmaker	29b968b2af	PM / clk: export symbols for existing pm_clk_<...> API fcns While trying to convert a DMA driver from bool to tristate, we encountered the following: ERROR: "pm_clk_add_clk" [drivers/dma/tegra210-adma.ko] undefined! ERROR: "pm_clk_create" [drivers/dma/tegra210-adma.ko] undefined! ERROR: "pm_clk_destroy" [drivers/dma/tegra210-adma.ko] undefined! ERROR: "pm_clk_suspend" [drivers/dma/tegra210-adma.ko] undefined! ERROR: "pm_clk_resume" [drivers/dma/tegra210-adma.ko] undefined! Since in principle there is nothing preventing these functions from being used in modular code as well as builtin, we add the export of them. We expand the scope to also include: pm_clk_add of_pm_clk_add_clks pm_clk_remove pm_clk_remove_clk pm_clk_init pm_clk_runtime_suspend pm_clk_runtime_resume pm_clk_add_notifier ...since these functions are also non-static and presumably form part of the existing API used by other drivers that may become modular in the future. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-06-15 01:42:15 +02:00
Heiko Carstens	a62247e1f5	topology/sysfs: provide drawer id and siblings attributes The s390 cpu topology gained another hierarchy level. The top level is now called drawer and contains several books. A book used to be the top level. In order to expose the cpu topology to user space allow to create new sysfs attributes dependent on CONFIG_SCHED_DRAWER which an architecture may define and select. These additional attributes will be available: /sys/devices/system/cpu/cpuX/topology/drawer_id /sys/devices/system/cpu/cpuX/topology/drawer_siblings /sys/devices/system/cpu/cpuX/topology/drawer_siblings_list Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2016-06-13 15:58:27 +02:00
Arnd Bergmann	463a86304c	char/genrtc: x86: remove remnants of asm/rtc.h Commit `3195ef59cb` ("x86: Do full rtc synchronization with ntp") had the side-effect of unconditionally enabling the RTC_LIB symbol on x86, which in turn disables the selection of the CONFIG_RTC and CONFIG_GEN_RTC drivers that contain a two older implementations of the CONFIG_RTC_DRV_CMOS driver. This removes x86 from the list for genrtc, and changes all references to the asm/rtc.h header to instead point to the interfaces from linux/mc146818rtc.h. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>	2016-06-04 00:20:07 +02:00
Laxman Dewangan	ccc1256192	regmap: irq: Add support to call client specific pre/post interrupt service Regmap irq implements the generic interrupt service routine which is common for most of devices. Some devices, like MAX77620, MAX20024 needs the special handling before and after servicing the interrupt as generic. For the example, MAX77620 programming guidelines for interrupt servicing says: 1. When interrupt occurs from PMIC, mask the PMIC interrupt by setting GLBLM. 2. Read IRQTOP and service the interrupt accordingly. 3. Once all interrupts has been checked and serviced, the interrupt service routine un-masks the hardware interrupt line by clearing GLBLM. The step (2) is implemented in regmap irq as generic routine. For step (1) and (3), add callbacks from regmap irq to client driver to handle chip specific configurations. Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Signed-off-by: Mark Brown <broonie@kernel.org>	2016-06-03 00:41:15 +01:00
Linus Torvalds	877c057d2b	More power management updates for v4.7-rc1 - Stable-candidate cpuidle fix to make it check the right variable when deciding whether or not to enable interrupts on the local CPU so as to avoid enabling iterrupts too early in some cases if the system has both coupled and per-core idle states (Daniel Lezcano). - Stable-candidate PM core fix to make it handle failures at the "late suspend" stage of device suspend consistently for all devices regardless of whether or not async suspend/resume is enabled for them (Rafael Wysocki). - Cleanups in the cpufreq core, the schedutil governor and the intel_pstate driver (Rafael Wysocki, Pankaj Gupta, Viresh Kumar). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJXRglcAAoJEILEb/54YlRxCwEQAKQNLO5wrNnOQmVrMHaOk1XH 4zT+AdggLRVgBld4d4Vcli2zJPZfpnZuzjJdwTB5zLgQ4WIcBb6meOfH87XGqRyJ o4ksyEUhpvDk8AdlmTA5CvvjFuydPJG5ZUSiM035XRT9heebvhgyaMBnT3ucXbq9 7LhNhCQ+a8arndt9ePO7tZnFfQQUbwNJ2BDVuH5DJBqMIFOo2/Kpag43CdFWlWZT jnWaleDCjSmanuJ/45bFJHJeSZ7PK2etnArfzKtb9QLSGnuEfFPdHuUzJYo5dkP7 UBeYA94hhfR3f5FJIqNlF3N+eLEX1idpwxc8+CJLLDKDd1ZCBrbLoz5fwM+fVn0h AfmyR+J1czcbiphsmpViOYDRrKdiQVkbP6SpBswvgMCZAcNDF2bxhzOlcuTUc+u0 8xsjWOtArL6uvzsAHa1HY6hhgUn9FB8m20HX+DmS2/zzqyzoRefenoyVcuLsAhXC fm+sARQ7tvy3OoGRQ9mloWgv2X5iQUY5IVjOG2amIhbUvVmKQutPjVTGTwHqmmcb 2nNYptLsTA6crvnexPcPHY+OFjkQl/omtfaMx+OJl63yhln5ibveGOfZ6F8sPdoB bRqHuHoK/xh9hSNwj117ZFzq1nm54mLjh0Yhw3EXFcV4I9vdsTp8/WeNThGvT17j M+6PDXyjlwh3HZpGm+HW =3vtL -----END PGP SIGNATURE----- Merge tag 'pm-4.7-rc1-more' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull more power management updates from Rafael Wysocki: "These are two stable-candidate fixes (PM core, cpuidle) and a bunch of cpufreq cleanups. Specifics: - Stable-candidate cpuidle fix to make it check the right variable when deciding whether or not to enable interrupts on the local CPU so as to avoid enabling iterrupts too early in some cases if the system has both coupled and per-core idle states (Daniel Lezcano). - Stable-candidate PM core fix to make it handle failures at the "late suspend" stage of device suspend consistently for all devices regardless of whether or not async suspend/resume is enabled for them (Rafael Wysocki). - Cleanups in the cpufreq core, the schedutil governor and the intel_pstate driver (Rafael Wysocki, Pankaj Gupta, Viresh Kumar)" * tag 'pm-4.7-rc1-more' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM / sleep: Handle failures in device_suspend_late() consistently cpufreq: schedutil: Improve prints messages with pr_fmt cpuidle: Fix cpuidle_state_is_coupled() argument in cpuidle_enter() cpufreq: simplified goto out in cpufreq_register_driver() cpufreq: governor: CPUFREQ_GOV_STOP never fails cpufreq: governor: CPUFREQ_GOV_POLICY_EXIT never fails intel_pstate: Simplify conditional in intel_pstate_set_policy()	2016-05-25 15:29:21 -07:00
Rafael J. Wysocki	4c2628cd75	Merge branches 'pm-cpufreq', 'pm-cpuidle' and 'pm-core' * pm-cpufreq: cpufreq: schedutil: Improve prints messages with pr_fmt cpufreq: simplified goto out in cpufreq_register_driver() cpufreq: governor: CPUFREQ_GOV_STOP never fails cpufreq: governor: CPUFREQ_GOV_POLICY_EXIT never fails intel_pstate: Simplify conditional in intel_pstate_set_policy() * pm-cpuidle: cpuidle: Fix cpuidle_state_is_coupled() argument in cpuidle_enter() * pm-core: PM / sleep: Handle failures in device_suspend_late() consistently	2016-05-25 21:54:45 +02:00
Linus Torvalds	3aa2fc1667	driver core update for 4.7-rc1 Here's the "big" driver core update for 4.7-rc1. Mostly just debugfs changes, the long-known and messy races with removing debugfs files should be fixed thanks to the great work of Nicolai Stange. We also have some isa updates in here (the x86 maintainers told me to take it through this tree), a new warning when we run out of dynamic char major numbers, and a few other assorted changes, details in the shortlog. All have been in linux-next for some time with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlc/0mwACgkQMUfUDdst+ynjXACgjNxR5nMUiM8ZuuD0i4Xj7VXd hnIAoM08+XDCv41noGdAcKv+2WZVZWMC =i+0H -----END PGP SIGNATURE----- Merge tag 'driver-core-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here's the "big" driver core update for 4.7-rc1. Mostly just debugfs changes, the long-known and messy races with removing debugfs files should be fixed thanks to the great work of Nicolai Stange. We also have some isa updates in here (the x86 maintainers told me to take it through this tree), a new warning when we run out of dynamic char major numbers, and a few other assorted changes, details in the shortlog. All have been in linux-next for some time with no reported issues" * tag 'driver-core-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits) Revert "base: dd: don't remove driver_data in -EPROBE_DEFER case" gpio: ws16c48: Utilize the ISA bus driver gpio: 104-idio-16: Utilize the ISA bus driver gpio: 104-idi-48: Utilize the ISA bus driver gpio: 104-dio-48e: Utilize the ISA bus driver watchdog: ebc-c384_wdt: Utilize the ISA bus driver iio: stx104: Utilize the module_isa_driver and max_num_isa_dev macros iio: stx104: Add X86 dependency to STX104 Kconfig option Documentation: Add ISA bus driver documentation isa: Implement the max_num_isa_dev macro isa: Implement the module_isa_driver macro pnp: pnpbios: Add explicit X86_32 dependency to PNPBIOS isa: Decouple X86_32 dependency from the ISA Kconfig option driver-core: use 'dev' argument in dev_dbg_ratelimited stub base: dd: don't remove driver_data in -EPROBE_DEFER case kernfs: Move faulting copy_user operations outside of the mutex devcoredump: add scatterlist support debugfs: unproxify files created through debugfs_create_u32_array() debugfs: unproxify files created through debugfs_create_blob() debugfs: unproxify files created through debugfs_create_bool() ...	2016-05-20 21:26:15 -07:00
Rafael J. Wysocki	3a17fb329d	PM / sleep: Handle failures in device_suspend_late() consistently Grygorii Strashko reports: The PM runtime will be left disabled for the device if its .suspend_late() callback fails and async suspend is not allowed for this device. In this case device will not be added in dpm_late_early_list and dpm_resume_early() will ignore this device, as result PM runtime will be disabled for it forever (side effect: after 8 subsequent failures for the same device the PM runtime will be reenabled due to disable_depth overflow). To fix this problem, add devices to dpm_late_early_list regardless of whether or not device_suspend_late() returns errors for them. That will ensure failures in there to be handled consistently for all devices regardless of their async suspend/resume status. Reported-by: Grygorii Strashko <grygorii.strashko@ti.com> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: All applicable <stable@vger.kernel.org>	2016-05-20 23:09:49 +02:00
Linus Torvalds	16490980e3	Generic device properties framework update for v4.7-rc1 Just one commit reworking the handling of built-in properties initialization and updating a few drivers in accordance with the core framework changes. / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJXOjO2AAoJEILEb/54YlRxiksP/jJjZZ5B7r7iBCLhtornA9L4 I54t0X4HcxmkBd2k5VxBE8k0bzMo/Tx52Bm/YnKxKQS7CSVPr7Lhrj3pH2NoQO5i S5T4Wdi8AYK/0LXcCwEfyMCAr7M9mstc+v2mQu1ZPtGKSI/c9+3r3zCrtSbxjwK8 RU4RSa3Y27drL8wbyc2PvjytYcbmwM4BsLOL8v4noUe2WussN4LEjiN3q4Y+s6O8 gmSPCZDY5Qg1Nc1YBse+lPsHHUng9cgIHzAnQkN5CnM36KLoNG3KaYGu+0ZxSnMe Ed+NaCZZC+u43r7vaJaOpnE+1jxYLOw4w7ail6GHveaI1EYKUtaRF8KcUDn1wwkU MxYslt5i8LS0ar2DPJ/RgHZ2ajaZAPjdie/+PJQwO6e5PqQsCDOLia8zWTS6HEMf DTirJrRfy2o1QR693I8MSwm5hsFiOPTeYXexWxLAe+LtUN277qaZrFAp7DSSEKm1 p53K9xaWmd2gJUTsmnim0TlTA2j9PO9dCJeXJDBa2B6QR11oQ160BvZ6/kk5U/FZ /XSbixwvRs2i6H/lVNgqyGmJdIAt75HcCUwNw4QI76nHrxx1BirA2DNbrGAlsNAF eBYaZ7pYtUN4/nZh+jYJWki7pY46F/7OgGoQQzM2LR6W8dz7Cwt/OQyt0+ekLhN4 prNP1Br++rFeCIc7kFOk =O8T9 -----END PGP SIGNATURE----- Merge tag 'device-properties-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull device properties update from Rafael Wysocki: "Generic device properties framework update. Just one commit reworking the handling of built-in properties initialization and updating a few drivers in accordance with the core framework changes" * tag 'device-properties-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: device property: don't bother the drivers with struct property_set	2016-05-16 19:51:04 -07:00
Linus Torvalds	d57d394319	Power management material for v4.7-rc1 - New cpufreq "schedutil" governor (making decisions based on CPU utilization information provided by the scheduler and capable of switching CPU frequencies right away if the underlying driver supports that) and support for fast frequency switching in the acpi-cpufreq driver (Rafael Wysocki). - Consolidation of CPU frequency management on ARM platforms allowing them to get rid of some platform-specific boilerplate code if they are going to use the cpufreq-dt driver (Viresh Kumar, Finley Xiao, Marc Gonzalez). - Support for ACPI _PPC and CPU frequency limits in the intel_pstate driver (Srinivas Pandruvada). - Fixes and cleanups in the cpufreq core and generic governor code (Rafael Wysocki, Sai Gurrappadi). - intel_pstate driver optimizations and cleanups (Rafael Wysocki, Philippe Longepe, Chen Yu, Joe Perches). - cpufreq powernv driver fixes and cleanups (Akshay Adiga, Shilpasri Bhat). - cpufreq qoriq driver fixes and cleanups (Jia Hongtao). - ACPI cpufreq driver cleanups (Viresh Kumar). - Assorted cpufreq driver updates (Ashwin Chaugule, Geliang Tang, Javier Martinez Canillas, Paul Gortmaker, Sudeep Holla). - Assorted cpufreq fixes and cleanups (Joe Perches, Arnd Bergmann). - Fixes and cleanups in the OPP (Operating Performance Points) framework, mostly related to OPP sharing, and reorganization of OF-dependent code in it (Viresh Kumar, Arnd Bergmann, Sudeep Holla). - New "passive" governor for devfreq (for SoC subsystems that will rely on someone else for the management of their power resources) and consolidation of devfreq support for Exynos platforms, coding style and typo fixes for devfreq (Chanwoo Choi, MyungJoo Ham). - PM core fixes and cleanups, mostly to make it work better with the generic power domains (genpd) framework, and updates for that framework (Ulf Hansson, Thierry Reding, Colin Ian King). - Intel Broxton support for the intel_idle driver (Len Brown). - cpuidle core optimization and fix (Daniel Lezcano, Dave Gerlach). - ARM cpuidle cleanups (Jisheng Zhang). - Intel Kabylake support for the RAPL power capping driver (Jacob Pan). - AVS (Adaptive Voltage Switching) rockchip-io driver update (Heiko Stuebner). - Updates for the cpupower tool (Arjun Sreedharan, Colin Ian King, Mattia Dongili, Thomas Renninger). / -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABCAAGBQJXOjLgAAoJEILEb/54YlRxfn0P/RbSPpNlUNBIE8DFrdD9jRdJ TIpZ7uiHi9tU1ZF17UBbb/SwuWfYVnVmiorZGRfFOtGaoqh0HFZ/nplDz99rK0ku vW2OnbojMQEUMU3IcUT1y4BsSl0H23f7ZOKrdprALeWxDQmbgnYjrE6vkX6hRtld A8eeZvIEJ5CzV8S+9aOOOpojW2yXk5dYGdZ7gpQdoM0n7zVLyPnNucJoha3BYmOG FwKEIe05RpIhfLfGT0CXIRcOzwAZ6ZWKgOrXUrx/AadPbvu/TP9zkI0djYI8ukyv z2oiO/GExoeGVuUzvy8vY5SiH4NQvViftFzMZepcsmjxmVglohMPRL8VLjZIBckk DDcqH9e0OQI20jjYT1vIf5+JWBvLxuQfGtyzI0S+sE/elB1zI/3O8p+8N2CuF5n+ my2dawIewnHI/0AdSpJ+K7DVrfwPHAX19axtPX3dJSLh2OuHCPNlAtbxRGAriBfH Zv9NETxlrch69o2AD4K54DErWV1FsYLznzK5Zms6MC2Ispbb+oiYpacTlZblznvb H5U2SSNlA5Niir3vVJ01nKRtzxlWoi67CQxbYrGhlaR0nTTxf9HqWgcSiTZrn7Pv hs+LA2aUfMf3JGjStdORS7S8biQSid5vypfkglpWLZBKHNC9BqqZd9gSM+jF3FVh ps4mMM4UXY4hnoFDkMBI =WM89 -----END PGP SIGNATURE----- Merge tag 'pm-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "The majority of changes go into the cpufreq subsystem this time. To me, quite obviously, the biggest ticket item is the new "schedutil" governor. Interestingly enough, it's the first new cpufreq governor since the beginning of the git era (except for some out-of-the-tree ones). There are two main differences between it and the existing governors. First, it uses the information provided by the scheduler directly for making its decisions, so it doesn't have to track anything by itself. Second, it can invoke drivers (supporting that feature) to adjust CPU performance right away without having to spawn work items to be executed in process context or similar. Currently, the acpi-cpufreq driver is the only one supporting that mode of operation, but then it is used on a large number of systems. The "schedutil" governor as included here is very simple and mostly regarded as a foundation for future work on the integration of the scheduler with CPU power management (in fact, there is work in progress on top of it already). Nevertheless it works and the preliminary results obtained with it are encouraging. There also is some consolidation of CPU frequency management for ARM platforms that can add their machine IDs the the new stub dt-platdev driver now and that will take care of creating the requisite platform device for cpufreq-dt, so it is not necessary to do that in platform code any more. Several ARM platforms are switched over to using this generic mechanism. In addition to that, the intel_pstate driver is now going to respect CPU frequency limits set by the platform firmware (or a BMC) and provided via the ACPI _PPC object. The devfreq subsystem is getting a new "passive" governor for SoCs subsystems that will depend on somebody else to manage their voltage rails and its support for Samsung Exynos SoCs is consolidated. The rest is support for new hardware (Intel Broxton support in intel_idle for one example), bug fixes, optimizations and cleanups in a number of places. Specifics: - New cpufreq "schedutil" governor (making decisions based on CPU utilization information provided by the scheduler and capable of switching CPU frequencies right away if the underlying driver supports that) and support for fast frequency switching in the acpi-cpufreq driver (Rafael Wysocki) - Consolidation of CPU frequency management on ARM platforms allowing them to get rid of some platform-specific boilerplate code if they are going to use the cpufreq-dt driver (Viresh Kumar, Finley Xiao, Marc Gonzalez) - Support for ACPI _PPC and CPU frequency limits in the intel_pstate driver (Srinivas Pandruvada) - Fixes and cleanups in the cpufreq core and generic governor code (Rafael Wysocki, Sai Gurrappadi) - intel_pstate driver optimizations and cleanups (Rafael Wysocki, Philippe Longepe, Chen Yu, Joe Perches) - cpufreq powernv driver fixes and cleanups (Akshay Adiga, Shilpasri Bhat) - cpufreq qoriq driver fixes and cleanups (Jia Hongtao) - ACPI cpufreq driver cleanups (Viresh Kumar) - Assorted cpufreq driver updates (Ashwin Chaugule, Geliang Tang, Javier Martinez Canillas, Paul Gortmaker, Sudeep Holla) - Assorted cpufreq fixes and cleanups (Joe Perches, Arnd Bergmann) - Fixes and cleanups in the OPP (Operating Performance Points) framework, mostly related to OPP sharing, and reorganization of OF-dependent code in it (Viresh Kumar, Arnd Bergmann, Sudeep Holla) - New "passive" governor for devfreq (for SoC subsystems that will rely on someone else for the management of their power resources) and consolidation of devfreq support for Exynos platforms, coding style and typo fixes for devfreq (Chanwoo Choi, MyungJoo Ham) - PM core fixes and cleanups, mostly to make it work better with the generic power domains (genpd) framework, and updates for that framework (Ulf Hansson, Thierry Reding, Colin Ian King) - Intel Broxton support for the intel_idle driver (Len Brown) - cpuidle core optimization and fix (Daniel Lezcano, Dave Gerlach) - ARM cpuidle cleanups (Jisheng Zhang) - Intel Kabylake support for the RAPL power capping driver (Jacob Pan) - AVS (Adaptive Voltage Switching) rockchip-io driver update (Heiko Stuebner) - Updates for the cpupower tool (Arjun Sreedharan, Colin Ian King, Mattia Dongili, Thomas Renninger)" * tag 'pm-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (112 commits) intel_pstate: Clean up get_target_pstate_use_performance() intel_pstate: Use sample.core_avg_perf in get_avg_pstate() intel_pstate: Clarify average performance computation intel_pstate: Avoid unnecessary synchronize_sched() during initialization cpufreq: schedutil: Make default depend on CONFIG_SMP cpufreq: powernv: del_timer_sync when global and local pstate are equal cpufreq: powernv: Move smp_call_function_any() out of irq safe block intel_pstate: Clean up intel_pstate_get() cpufreq: schedutil: Make it depend on CONFIG_SMP cpufreq: governor: Fix handling of special cases in dbs_update() PM / OPP: Move CONFIG_OF dependent code in a separate file cpufreq: intel_pstate: Ignore _PPC processing under HWP cpufreq: arm_big_little: use generic OPP functions for {init, free}_opp_table PM / OPP: add non-OF versions of dev_pm_opp_{cpumask_, }remove_table cpufreq: tango: Use generic platdev driver PM / OPP: pass cpumask by reference cpufreq: Fix GOV_LIMITS handling for the userspace governor cpupower: fix potential memory leak PM / devfreq: style/typo fixes PM / devfreq: exynos: Add the detailed correlation for Exynos5422 bus ..	2016-05-16 19:17:22 -07:00
Rafael J. Wysocki	27c4a1c5ef	Merge branches 'pm-avs', 'pm-clk', 'powercap' and 'pm-tools' * pm-avs: PM / AVS: rockchip-io: make io-domains a child of the GRF * pm-clk: PM / clk: ensure we don't allocate a -ve size of count clks * powercap: powercap/intel_rapl: Add support for Kabylake * pm-tools: cpupower: fix potential memory leak cpupower: Add cpuidle parts into library cpupowerutils: bench: trivial fix of spelling mistake on "average" Fix cpupower manpages "NAME" section cpupower: bench: parse.c: fix several resource leaks Honour user's LDFLAGS	2016-05-16 14:31:56 +02:00
Rafael J. Wysocki	aa24781b1c	Merge branches 'pm-core' and 'pm-domains' * pm-core: PM / sleep: Drop unused `info' variable PM / Runtime: Move ignore_children flag under CONFIG_PM PM / Runtime: Fix error path in pm_runtime_force_resume() * pm-domains: PM / Domains: Drop unnecessary wakeup code from pm_genpd_prepare() PM / Domains: Remove redundant pm_runtime_get\|put*() in pm_genpd_prepare() PM / Domains: Remove ->save\|restore_state() callbacks PM / Domains: Rename pm_genpd_runtime_suspend\|resume() PM / Domains: Rename stop_ok to suspend_ok for the genpd governor	2016-05-16 14:31:29 +02:00
Rafael J. Wysocki	29cff1844a	Merge branch 'pm-opp' * pm-opp: PM / OPP: Move CONFIG_OF dependent code in a separate file PM / OPP: add non-OF versions of dev_pm_opp_{cpumask_, }remove_table PM / OPP: pass cpumask by reference PM / OPP: Add dev_pm_opp_get_sharing_cpus() PM / OPP: Mark cpumask as const in dev_pm_opp_set_sharing_cpus() PM / OPP: -ENOSYS is applicable only to syscalls PM / OPP: Mark shared-opp for non-dt case PM / OPP: Relocate dev_pm_opp_set_sharing_cpus() PM / OPP: dev_pm_opp_set_sharing_cpus() doesn't depend on CONFIG_OF PM / OPP: Add missing doc style comments PM / OPP: Propagate the error returned by _find_opp_table()	2016-05-16 14:30:14 +02:00
Mark Brown	d4ab78d707	Merge remote-tracking branches 'regmap/topic/doc' and 'regmap/topic/flat' into regmap-next	2016-05-13 10:36:14 +01:00
Mark Brown	2a2cd52190	Merge remote-tracking branches 'regmap/fix/be', 'regmap/fix/doc' and 'regmap/fix/spmi' into regmap-linus	2016-05-13 10:36:10 +01:00
Mark Brown	066a0e0b49	Merge remote-tracking branch 'regmap/fix/mmio' into regmap-linus	2016-05-13 10:36:09 +01:00
Rafael J. Wysocki	dab2e29402	Merge back new device properties material for v4.7.	2016-05-06 22:07:33 +02:00
Rafael J. Wysocki	c8541203a6	Merge back new material for v4.7.	2016-05-06 22:05:16 +02:00
Viresh Kumar	f47b72a15a	PM / OPP: Move CONFIG_OF dependent code in a separate file Recently, a few issues were noticed in the code where CONFIG_OF wasn't consistently used for many routines. The core file is big enough now and ifdef hackery makes it less readable. Move OF-specific code to another file and compile that only if CONFIG_OF is enabled. Compile-tested: - For ARM (exynos) with CONFIG_OF enabled - For X86 with CONFIG_OF disabled (have to enable CONFIG_PM_OPP separately) No functional changes. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-05-06 13:22:49 +02:00
Rafael J. Wysocki	5f2f88e330	Merge branches 'pm-opp-fixes', 'pm-cpufreq-fixes' and 'pm-cpuidle-fixes' * pm-opp-fixes: PM / OPP: Remove useless check * pm-cpufreq-fixes: intel_pstate: Fix intel_pstate_get() cpufreq: intel_pstate: Fix HWP on boot CPU after system resume cpufreq: st: enable selective initialization based on the platform * pm-cpuidle-fixes: ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value	2016-05-06 13:16:22 +02:00
Viresh Kumar	21f8a99ce6	PM / OPP: Remove useless check Regulators are optional for devices using OPPs and the OPP core shouldn't be printing any errors for such missing regulators. It was fine before the commit `0c717d0f9c`, but that failed to update this part of the code to remove an 'always true' check and an extra unwanted print message. Fix that now. Fixes: `0c717d0f9c` (PM / OPP: Initialize regulator pointer to an error value) Reported-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-05-05 01:42:19 +02:00
Sudeep Holla	411466c508	PM / OPP: add non-OF versions of dev_pm_opp_{cpumask_, }remove_table Functions dev_pm_opp_of_{cpumask_,}remove_table removes/frees all the static OPP entries associated with the device and/or all cpus(in case of cpumask) that are created from DT. However the OPP entries are populated reading from the firmware or some different method using dev_pm_opp_add are marked dynamic and can't be removed using above functions. This patch adds non DT/OF versions of dev_pm_opp_{cpumask_,}remove_table to support the above mentioned usecase. This is in preparation to make use of the same in scpi-cpufreq.c Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-05-05 01:38:44 +02:00
Arnd Bergmann	ddbb74bc70	PM / OPP: pass cpumask by reference The new use of dev_pm_opp_set_sharing_cpus resulted in a harmless compiler warning with CONFIG_CPUMASK_OFFSTACK=y: drivers/cpufreq/mvebu-cpufreq.c: In function 'armada_xp_pmsu_cpufreq_init': include/linux/cpumask.h:550:25: error: passing argument 2 of 'dev_pm_opp_set_sharing_cpus' discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers] The problem here is that cpumask_var_t gets passed by reference, but by declaring a 'const cpumask_var_t' argument, only the pointer is constant, not the actual mask. This is harmless because the function does not actually modify the mask. This patch changes the function prototypes for all of the related functions to pass a 'struct cpumask *' instead of 'cpumask_var_t', matching what most other such functions do in the kernel. This lets us mark all the other similar functions as taking a 'const' mask where possible, and it avoids the warning without any change in object code. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: `947bd567f7` (mvebu: Use dev_pm_opp_set_sharing_cpus() to mark OPP tables as shared) Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-05-05 01:34:20 +02:00
Greg Kroah-Hartman	c6e360a0d9	Revert "base: dd: don't remove driver_data in -EPROBE_DEFER case" This reverts commit `ded9db380d`. Thierry Reding writes: This causes a boot regression on at least one board, caused by one of the drivers looking at driver data to check whether or not the driver has properly loaded. If the code encounters a non-NULL pointer it tries to dereference it, but because it's already been freed there is no memory backing it and things crash. I don't think keeping stale pointers around is a good idea. The whole point of setting this to NULL in the core is so that probe failures result in the same starting conditions no matter what. Can we please get this reverted? Reported-by: Thierry Reding <thierry.reding@gmail.com> Cc: Yi Zhang <yizhang_hust@163.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-05-03 08:06:06 -07:00
William Breathitt Gray	8ac0fba2da	isa: Decouple X86_32 dependency from the ISA Kconfig option The introduction of the ISA_BUS option blocks the compilation of ISA drivers on non-x86 platforms. The ISA_BUS configuration option should not be necessary if the X86_32 dependency can be decoupled from the ISA configuration option. This patch both removes the ISA_BUS configuration option entirely and removes the X86_32 dependency from the ISA configuration option. Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-05-01 20:21:02 -07:00
Yi Zhang	ded9db380d	base: dd: don't remove driver_data in -EPROBE_DEFER case the driver_data may be used for sanity check, it fails the probe() if driver_data is NULL after it is re-triggered. for example, soc_probe() in sound/soc/soc-core.c Signed-off-by: Yi Zhang <yizhang_hust@163.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-05-01 14:22:17 -07:00
Viresh Kumar	6f707daa38	PM / OPP: Add dev_pm_opp_get_sharing_cpus() OPP core allows a platform to mark OPP table as shared, when the platform isn't using operating-points-v2 bindings. And, so there should be a non DT way of finding out if the OPP table is shared or not. This patch adds dev_pm_opp_get_sharing_cpus(), which first tries to get OPP sharing information from the opp-table (in case it is already marked as shared), otherwise it uses the existing DT way of finding sharing information. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-28 15:18:18 +02:00
Viresh Kumar	dde370b23c	PM / OPP: Mark cpumask as const in dev_pm_opp_set_sharing_cpus() dev_pm_opp_set_sharing_cpus() isn't supposed to update the cpumask passed as its parameter, and so it should always have been marked 'const'. Do it now. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-28 15:18:18 +02:00
Thierry Reding	fba1fbf563	PM / sleep: Drop unused `info' variable Commit `32e8d689dc` (PM / sleep: trace_device_pm_callback coverage in dpm_prepare/complete) removed all users of this variable but forgot to remove the variable itself. Signed-off-by: Thierry Reding <treding@nvidia.com> [ rjw: Subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-28 15:13:01 +02:00
Heikki Krogerus	0224a4a30b	device property: Avoid potential dereferences of invalid pointers Since fwnode may hold ERR_PTR(-ENODEV) or it may be NULL, the fwnode type checks is_of_node(), is_acpi_node() and is is_pset_node() need to consider it. Using IS_ERR_OR_NULL() to check it. Fixes: `0d67e0fa16` (device property: fix for a case of use-after-free) Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-27 23:40:02 +02:00
Ulf Hansson	164a2159a2	PM / Domains: Drop unnecessary wakeup code from pm_genpd_prepare() As the PM core already have wakeup management during the system PM phase, it seems reasonable that genpd and its users should be able to rely on that. Therefore let's remove this from pm_genpd_prepare(). Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-26 22:28:39 +02:00
Ulf Hansson	624c8df7d2	PM / Domains: Remove redundant pm_runtime_get\|put() in pm_genpd_prepare() The PM core increases and decreases the runtime PM usage count in the system PM prepare phase. This makes some of the pm_runtime_get\|put() calls in pm_genpd_prepare() redundant, so let's remove them. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-26 22:27:07 +02:00
Colin Ian King	0b26985a7d	PM / clk: ensure we don't allocate a -ve size of count clks It is entirely possible for of_count_phandle_wit_args to return a -ve error return value so we need to check for this otherwise we end up allocating a negative number of clk objects. Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-26 21:11:35 +02:00
Viresh Kumar	46e7a4e183	PM / OPP: Mark shared-opp for non-dt case opp core allows OPPs to be explicitly marked as shared from platform code, in case of operating-point v1 bindings. Though we do everything fine in that case, we don't set the flag in the opp-table to indicate that the OPPs are shared. It works fine today as the flag isn't used anywhere else in the core, but we should be doing the right thing by marking it set. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:10:18 +02:00
Viresh Kumar	2c93104ff2	PM / OPP: Relocate dev_pm_opp_set_sharing_cpus() Move dev_pm_opp_set_sharing_cpus() towards the end of the file. This is required for better readability after the next patch is applied, which adds dev_pm_opp_get_sharing_cpus(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:10:16 +02:00
Viresh Kumar	45ca36ad7c	PM / OPP: Add missing doc style comments Few of the routines in cpu.c were missing these, add them. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:10:14 +02:00
Viresh Kumar	2d74c6d565	PM / OPP: Propagate the error returned by _find_opp_table() Don't send -EINVAL and propagate what's received from _find_opp_table(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-25 16:10:11 +02:00
Ulf Hansson	54eeddbf92	PM / Domains: Remove ->save\|restore_state() callbacks As a part of the ongoing consolidation of genpd, it's become questionable whether clients actually needs to be able to assign their own set of ->save\|restore_state() callbacks. Currently all users copes fine with the default callbacks, so let's remove the configuration option and stick to the default ones. This enables further clarifications of the related code and let's also rename pm_genpd_default_save\|restore_state() into __genpd_runtime_suspend\|resume() to apply the rule of static functionnames in genpd. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-22 02:29:17 +02:00
Ulf Hansson	795bd2e7e8	PM / Domains: Rename pm_genpd_runtime_suspend\|resume() Follow genpd's rule for names of static functions, by renaming pm_genpd_runtime_suspend\|resume() to genpd_runtime_suspend\|resume(). Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-22 02:29:17 +02:00
Ulf Hansson	9df3921e02	PM / Domains: Rename stop_ok to suspend_ok for the genpd governor The genpd governor validates the latency constraints to find out whether it's acceptable to runtime suspend a device. Earlier this validation was made to know whether it was okay to invoke the ->stop() callback for the device, hence the governor used the name "stop_ok" for the related variables. To clarify the code around this, let's rename these variables from "stop_ok" to "suspend_ok". Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-22 02:29:17 +02:00
Ulf Hansson	0ae3aeefab	PM / Runtime: Fix error path in pm_runtime_force_resume() As pm_runtime_set_active() may fail because the device's parent isn't active, we can end up executing the ->runtime_resume() callback for the device when it isn't allowed. Fix this by invoking pm_runtime_set_active() before running the callback and let's also deal with the error code. Fixes: `37f204164d` (PM: Add pm_runtime_suspend\|resume_force functions) Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Cc: 3.15+ <stable@vger.kernel.org> # 3.15+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2016-04-21 19:31:11 +02:00
Greg Kroah-Hartman	5614e77258	Merge 4.6-rc4 into driver-core-next We want those fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-19 04:28:28 +09:00
Aviya Erenfeld	522566376a	devcoredump: add scatterlist support Add scatterlist support (dev_coredumpsg) to allow drivers to avoid vmalloc() like dev_coredumpm(), while also avoiding the module reference that the latter function requires. This internally uses dev_coredumpm() with function inside the devcoredump module, requiring removing the const (which touches the driver using it.) Signed-off-by: Aviya Erenfeld <aviya.erenfeld@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-15 11:20:32 -07:00

1 2 3 4 5 ...

3201 Commits