linux

Author	SHA1	Message	Date
Xin Long	e90ce2fc27	dccp: fix a memleak for dccp_feat_init err process In dccp_feat_init, when ccid_get_builtin_ccids fails to alloc memory for rx.val, it should free tx.val before returning an error. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-27 00:01:05 -07:00
Xin Long	b7953d3c0e	dccp: fix a memleak that dccp_ipv4 doesn't put reqsk properly The patch "dccp: fix a memleak that dccp_ipv6 doesn't put reqsk properly" fixed reqsk refcnt leak for dccp_ipv6. The same issue exists on dccp_ipv4. This patch is to fix it for dccp_ipv4. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-27 00:01:05 -07:00
Xin Long	0c2232b0a7	dccp: fix a memleak that dccp_ipv6 doesn't put reqsk properly In dccp_v6_conn_request, after reqsk gets alloced and hashed into ehash table, reqsk's refcnt is set 3. one is for req->rsk_timer, one is for hlist, and the other one is for current using. The problem is when dccp_v6_conn_request returns and finishes using reqsk, it doesn't put reqsk. This will cause reqsk refcnt leaks and reqsk obj never gets freed. Jianlin found this issue when running dccp_memleak.c in a loop, the system memory would run out. dccp_memleak.c: int s1 = socket(PF_INET6, 6, IPPROTO_IP); bind(s1, &sa1, 0x20); listen(s1, 0x9); int s2 = socket(PF_INET6, 6, IPPROTO_IP); connect(s2, &sa1, 0x20); close(s1); close(s2); This patch is to put the reqsk before dccp_v6_conn_request returns, just as what tcp_conn_request does. Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-27 00:01:05 -07:00
Aneesh Kumar K.V	0da12a7a81	powerpc/mm/hash: Free the subpage_prot_table correctly Fixes: `dad6f37c26` ("powerpc: subpage_protect: Increase the array size to take care of 64TB") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Tested-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-07-27 13:05:50 +10:00
Dan Carpenter	e6fd916a62	scsi: aacraid: reading out of bounds "qd.id" comes directly from the copy_from_user() on the line before so we should verify that it's within bounds. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-07-26 22:09:21 -04:00
Thomas Bogendoerfer	722477c4f2	scsi: qedf: Limit number of CQs FCOE offloading failed with: [qed_sp_fcoe_func_start:150(sp-0-3b:00.02)]Cannot satisfy CQ amount. CQs requested 8, CQs available 6. Aborting function start [qed_fcoe_start:821()]Failed to start fcoe [__qedf_probe:3041]:6: Cannot start FCoE function. The reason is a newly introduced check in the qed main part. This change also provides the information about how many CQs are available, so we simply limit the number of requested CQs. Fixes: `3c5da94278` ("qed: Share additional information with qedf") Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-07-26 22:00:39 -04:00
Thomas Gleixner	f9f22a8691	scsi: bnx2i: Simplify cpu hotplug code The CPU hotplug related code of this driver can be simplified by: 1) Consolidating the callbacks into a single state. The CPU thread can be torn down on the CPU which goes offline. There is no point in delaying that to the CPU dead state 2) Let the core code invoke the online/offline callbacks and remove the extra for_each_online_cpu() loops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-07-26 21:51:25 -04:00
Thomas Gleixner	1937f8a29f	scsi: bnx2fc: Simplify CPU hotplug code The CPU hotplug related code of this driver can be simplified by: 1) Consolidating the callbacks into a single state. The CPU thread can be torn down on the CPU which goes offline. There is no point in delaying that to the CPU dead state 2) Let the core code invoke the online/offline callbacks and remove the extra for_each_online_cpu() loops. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-07-26 21:51:25 -04:00
Thomas Gleixner	2fa2fa1ae6	scsi: bnx2i: Prevent recursive cpuhotplug locking The BNX2I module init/exit code installs/removes the hotplug callbacks with the cpu hotplug lock held. This worked with the old CPU locking implementation which allowed recursive locking, but with the new percpu rwsem based mechanism this is no longer allowed. Use the _cpuslocked() variants to fix this. Reported-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-07-26 21:51:24 -04:00
Thomas Gleixner	2c2b66ae9d	scsi: bnx2fc: Prevent recursive cpuhotplug locking The BNX2FC module init/exit code installs/removes the hotplug callbacks with the cpu hotplug lock held. This worked with the old CPU locking implementation which allowed recursive locking, but with the new percpu rwsem based mechanism this is no longer allowed. Use the _cpuslocked() variants to fix this. Reported-by: kernel test robot <fengguang.wu@intel.com> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-07-26 21:51:24 -04:00
Thomas Gleixner	8addebc14a	scsi: bnx2fc: Plug CPU hotplug race bnx2fc_process_new_cqes() has protection against CPU hotplug, which relies on the per cpu thread pointer. This protection is racy because it happens only partially with the per cpu fp_work_lock held. If the CPU is unplugged after the lock is dropped, the wakeup code can dereference a NULL pointer or access freed and potentially reused memory. Restructure the code so the thread check and wakeup happens with the fp_work_lock held. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2017-07-26 21:51:24 -04:00
Arnd Bergmann	7e17510018	drm: exynos: mark pm functions as __maybe_unused The rework of the exynos DRM clock handling introduced warnings for configurations that have CONFIG_PM disabled: drivers/gpu/drm/exynos/exynos_hdmi.c:736:13: error: 'hdmi_clk_disable_gates' defined but not used [-Werror=unused-function] static void hdmi_clk_disable_gates(struct hdmi_context *hdata) ^~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/exynos/exynos_hdmi.c:717:12: error: 'hdmi_clk_enable_gates' defined but not used [-Werror=unused-function] static int hdmi_clk_enable_gates(struct hdmi_context *hdata) The problem is that the PM functions themselves are inside of an #ifdef, but some functions they call are not. This patch removes the #ifdef and instead marks the PM functions as __maybe_unused, which is a more reliable way to get it right. Link: https://patchwork.kernel.org/patch/8436281/ Fixes: `9be7e98984` ("drm/exynos/hdmi: clock code re-factoring") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Inki Dae <inki.dae@samsung.com>	2017-07-27 09:24:03 +09:00
Hans Verkuil	8f4e01f9f0	drm/exynos: select CEC_CORE if CEC_NOTIFIER If the s5p-cec driver is a module and the drm exynos driver is built-in, then the CEC core will be a module also, causing the CEC notifier to fail (will be compiled as empty functions). To prevent this select CEC_CORE if CEC_NOTIFIER is set to ensure the CEC core is also built into the kernel. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>	2017-07-27 09:24:03 +09:00
Andrzej Hajda	861b27eca7	drm/exynos/hdmi: fix disable sequence The "Fixes" patch was incorrectly merged, as a result PHY is prematurely powered off and for example Odroid-U3 cannot disable TV power domain when HDMI cable is unplugged. Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Fixes: `625e63e2` ("drm/exynos/hdmi: fix pipeline disable order") Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>	2017-07-27 09:24:02 +09:00
Inki Dae	576d72fbfb	drm/exynos: mic: add a bridge at probe This patch moves drm_bridge_add call into probe. It doesn't need to call drm_bridge_add call every time bind callback is called. Changelog v2 - moved drm_bridge_remove call into remove callback. - corrected description. Suggested-by: Andrzej Hajda <a.hajda@samsung.com> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Reviewed-by: Hoegeun Kwon <hoegeun.kwon@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>	2017-07-27 09:24:02 +09:00
Hoegeun Kwon	0d51a0a534	drm/exynos/dsi: Remove error handling for bridge_node DT parsing Remove the error handling of bridge_node because the bridge_node is optional. For example, In case of Exynos SoC, a bridge device such as mDNIe and MIC could be placed between Display Controller and MIPI DSI device but the bridge device is optional. Signed-off-by: Hoegeun Kwon <hoegeun.kwon@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>	2017-07-27 09:24:02 +09:00
Inki Dae	c9948920cf	drm/exynos: dsi: do not try to find bridge It doesn't need to try to find a bridge if bridge node doesn't exist. Reviewed-by: Shuah Khan <shuahkh@osg.samsung.com> Tested-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>	2017-07-27 09:24:01 +09:00
Arvind Yadav	e3cc51ea0b	drm: exynos: hdmi: make of_device_ids const. of_device_ids are not supposed to change at runtime. All functions working with of_device_ids provided by <linux/of.h> work with const of_device_ids. So mark the non-const structs as const. File size before: text data bss dec hex filename 12294 1192 0 13486 34ae drivers/gpu/drm/exynos/exynos_hdmi.o File size after constify hdmi_match_types. text data bss dec hex filename 13318 176 0 13494 34b6 drivers/gpu/drm/exynos/exynos_hdmi.o Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>	2017-07-27 09:24:01 +09:00
Arvind Yadav	5e6cc1c588	drm: exynos: constify mixer_match_types and *_mxr_drv_data. File size before: text data bss dec hex filename 9983 1424 0 11407 2c8f drivers/gpu/drm/exynos/exynos_mixer.o File size after constify: text data bss dec hex filename 11231 176 0 11407 2c8f drivers/gpu/drm/exynos/exynos_mixer.o Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>	2017-07-27 09:24:01 +09:00
Gabriel Krisman Bertazi	1d6bb0f9b4	exynos_drm: Clean up duplicated assignment in exynos_drm_driver num_ioctls is already assigned when declaring the exynos_drm_driver structure. No need to duplicate it here. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>	2017-07-27 09:24:01 +09:00
Doug Ledford	f55c1e6608	Merge branches 'rxe' and 'mlx' into k.o/for-next	2017-07-26 20:13:33 -04:00
Jakub Kicinski	d777b2ddbe	bpf: don't zero out the info struct in bpf_obj_get_info_by_fd() The buffer passed to bpf_obj_get_info_by_fd() should be initialized to zeros. Kernel will enforce that to guarantee we can safely extend info structures in the future. Making the bpf_obj_get_info_by_fd() call in libbpf perform the zeroing is problematic, however, since some members of the info structures may need to be initialized by the callers (for instance pointers to buffers to which kernel is to dump translated and jited images). Remove the zeroing and fix up the in-tree callers before any kernel has been released with this code. As Daniel points out this seems to be the intended operation anyway, since commit `95b9afd398` ("bpf: Test for bpf ID") is itself setting the buffer pointers before calling bpf_obj_get_info_by_fd(). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-26 17:02:52 -07:00
Matthias Kaehlcke	0c3a8f8b8f	netpoll: Fix device name check in netpoll_setup() Apparently netpoll_setup() assumes that netpoll.dev_name is a pointer when checking if the device name is set: if (np->dev_name) { ... However the field is a character array, therefore the condition always yields true. Check instead whether the first byte of the array has a non-zero value. Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-26 17:01:43 -07:00
David S. Miller	e27a879271	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-07-25 This series contains updates to i40e and i40evf only. Gustavo Silva fixes a variable assignment, where the incorrect variable was being used to store the error parameter. Carolyn provides a fix for a problem found in systems when entering S4 state, by ensuring that the misc vector's IRQ is disabled as well. Jake removes the single-threaded restriction on the module workqueue, which was causing issues with events such as CORER. Does some future proofing, by changing how the driver displays the UDP tunnel type. Paul adds a retry in releasing resources if the admin queue times out during the first attempt to release the resources. Jesse fixes up references to 32bit timespec, since there are a small set of errors on 32 bit, so we need to be using the right calls for dealing with timespec64 variables. Cleaned up code indentation and corrected an "if" conditional check, as well as making the code flow more clear. Cast or changed the types to remove warnings for comparing signed and unsigned types. Adds missing includes in i40evf, which were being used but were not being directly included. Daniel Borkmann fixes i40e to fill the XDP prog_id with the id just like other XDP enabled drivers, so that on dump we can retrieve the attached program based on the id and dump BPF insns, opcodes, etc back to user space. Tushar Dave adds le32_to_cpu while evaluating the hardware descriptor fields, since they are in little-endian format. Also removed unnecessary "__packed" to a couple of i40evf structures. Stefan Assmann fixes an issue when an administratively set MAC was set and should now be switched back to 00:00:00:00:00:00, the pf_set_mac flag is not being toggled back to false. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-26 16:58:45 -07:00
Hanjun Guo	512bb03f49	ACPI: processor: use dev_dbg() instead of dev_warn() when CPPC probe failed _CPC is an optional object for processor devices, so it's fine for processor devices in DSDT to lack CPPC data, but when booting the system with CPPC enabled in the kernel but without its support in the firmware, I got lots of warnings on a 64 core system: [ 6.346016] acpi ACPI0007:00: CPPC data invalid or not present [ 6.346028] acpi ACPI0007:01: CPPC data invalid or not present [ 6.346039] acpi ACPI0007:02: CPPC data invalid or not present [ 6.346050] acpi ACPI0007:03: CPPC data invalid or not present [ 6.346063] acpi ACPI0007:04: CPPC data invalid or not present ... [ 6.346737] acpi ACPI0007:3f: CPPC data invalid or not present This isn't very useful and is a bit noisy, so switch the dev_warn() to dev_dbg(). Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-07-27 01:51:06 +02:00
David S. Miller	143f0cf963	Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 10GbE Intel Wired LAN Driver Updates 2017-07-25 This series contains updates to ixgbe only. Tony provides all of the changes in the series, starting with adding a check to ensure that adding a MAC filter was successful, before setting the MACVLAN. In order to receive notifications of link configurations of the external PHY and support the configuration of the internal iXFI link on X552 devices, Tony enables LASI interrupts. Update the iXFI driver code flow, since the MAC register NW_MNG_IF_SEL fields have been redefined for X553 devices, so add MAC checks for iXFI flows. Added additional checks for flow control autonegotiation, since it is not supported for X553 fiber and XFI devices. v2: removed unnecessary parens noticed by David Miller in patch 6 of the series. v3: dropped patch 6 of the original series, while we work out a more generic solution for malicious driver detection (MDD) support. v4: updated patch 1 of the series with the comments from Joe Perches which were: - switched logic to return on error - return 0 on success - declare retval as an integer ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-07-26 16:36:29 -07:00
Dave Airlie	517069ff6e	Merge branch 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux into drm-fixes Three misc amd fixes. * 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux: drm/amd/powerplay: fix AVFS voltage offset for Vega10 drm/amdgpu/gfx9: simplify and fix GRBM index selection drm/amdgpu: Fix blocking in RCU critical section(v2)	2017-07-27 08:49:48 +10:00
Sean Paul	78acea381d	Merge airlied/drm-next into drm-misc-next Backmerge drm-next with -rc2 in it to pull in a couple stm patches that were previously incorrectly applied to -misc-next. By picking them up in the correct manner, git will hopefully fix any errant trees that are out in the wild. Signed-off-by: Sean Paul <seanpaul@chromium.org>	2017-07-26 18:39:07 -04:00
Stephen Rothwell	e6742e1021	drm: linux-next: build failure after merge of the drm-misc tree Hi all, After merging the drm-misc tree, today's linux-next build (x86_64 allmodconfig) failed like this: drivers/staging/vboxvideo/vbox_drv.c:235:2: error: unknown field 'set_busid' specified in initializer .set_busid = drm_pci_set_busid, ^ drivers/staging/vboxvideo/vbox_drv.c:235:15: error: 'drm_pci_set_busid' undeclared here (not in a function) .set_busid = drm_pci_set_busid, ^ drivers/staging/vboxvideo/vbox_drv.c: In function 'vbox_init': drivers/staging/vboxvideo/vbox_drv.c:273:9: error: implicit declaration of function 'drm_pci_init' [-Werror=implicit-function-declaration] return drm_pci_init(&driver, &vbox_pci_driver); ^ drivers/staging/vboxvideo/vbox_drv.c: In function 'vbox_exit': drivers/staging/vboxvideo/vbox_drv.c:278:2: error: implicit declaration of function 'drm_pci_exit' [-Werror=implicit-function-declaration] drm_pci_exit(&driver, &vbox_pci_driver); ^ Caused by commits `5c484cee7e` ("drm: Remove drm_driver->set_busid hook") `10631d724d` ("drm/pci: Deprecate drm_pci_init/exit completely") interacting with commit `dd55d44f40` ("staging: vboxvideo: Add vboxvideo to drivers/staging") from the staging.current tree. I have applied the following merge fix patch - please check that it is correct. From: Stephen Rothwell <sfr@canb.auug.org.au> Date: Wed, 19 Jul 2017 11:41:01 +1000 Subject: [PATCH] drm: fixes for staging due to API changes in the drm core Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-27 08:27:11 +10:00
Dave Airlie	0eb2c0ae57	Backmerge tag 'v4.13-rc2' into drm-next Linux 4.13-rc2 This is required for drm-misc fixing.	2017-07-27 08:15:43 +10:00
Masami Hiramatsu	97e4936851	selftests: ftrace: Check given string is not zero-length Use [ ! -z "$VAR" ] instead of [ "$VAR" ] to check whether the given string variable is not zero-length since it obviously shows what it means. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <srostedt@goodmis.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>	2017-07-26 15:41:43 -06:00
Masami Hiramatsu	97bece60ef	selftests: ftrace: Output only to console with "--logdir -" Output logs only to console if "-" is given to --logdir option. In this case, ftracetest doesn't record any log on the disk, and all logs immediately shown (including all command logs.) Since there is no "tee" in the middle of command and console, it outputs the log really soon. This option is useful only when the console is logged. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <srostedt@goodmis.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>	2017-07-26 15:41:36 -06:00
Masami Hiramatsu	dab24fb1f2	selftests: ftrace: Add more verbosity for immediate log Add 3-level verbosity for showing traced command log on console immediately. Since some test cases can cause kernel panic if there is a problem (like regression etc.), we can not know which command caused the problem without traced command log. This verbosity (-vvv) solves that because it shows the log on console immediately. User can get continuous command/error log. Note that this is a kind of kernel debug mode, if you don't see any kernel related issue, you don't need this verbosity. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <srostedt@goodmis.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>	2017-07-26 15:41:30 -06:00
Masami Hiramatsu	9aa9413912	selftests: ftrace: Add --fail-unsupported option Add --fail-unsupported option to fail the test result if ftracetest gets UNSUPPORTED result. UNSUPPORTED usually happens when the kernel is old (e.g. stable tree) or some kernel feature is disabled. However, if newer kernel has any bug or regression, it can make test results in UNSUPPORTED too. This option can detect such kernel regression. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <srostedt@goodmis.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>	2017-07-26 15:41:22 -06:00
Masami Hiramatsu	9b682cd4af	selftests: ftrace: Do not failure if there is unsupported tests Do not return failure exit code (1) for unsupported testcases, since it is expected for stable kernels. Previously, ftracetest is expected to run only on current release for avoiding regressions. However, nowadays we run it on stable kernels. This means some test cases must return unsupported result. In such case, we should NOT exit ftracetest with error status for unsupported results so that kselftest (upper tests wrapper) shows it passed correctly. Note that we continue to treat unresolved results as failure, if test writers would like to notice user that the test result should be reviewed, they can use exit_unresolved. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <srostedt@goodmis.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>	2017-07-26 15:41:13 -06:00
Dennis Zhou (Facebook)	5e81ee3e6a	percpu: update header to contain bitmap allocator explanation. The other patches contain a lot of information, so adding this information in a separate patch. It adds my copyright and a brief explanation of how the bitmap allocator works. There is a minor typo as well in the prior explanation so that is fixed. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-07-26 17:41:06 -04:00
Dennis Zhou (Facebook)	b4c2116cfa	percpu: update pcpu_find_block_fit to use an iterator The simple, and expensive, way to find a free area is to iterate over the entire bitmap until an area is found that fits the allocation size and alignment. This patch makes use of an iterator that finds an area to check by using the block level contig hints. It will only return an area that can fit the size and alignment request. If the request can fit inside a block, it returns the first_free bit to start checking from to see if it can be fulfilled prior to the contig hint. The pcpu_alloc_area check has a bound of a block size added in case it is wrong. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-07-26 17:41:06 -04:00
Dennis Zhou (Facebook)	525ca84dae	percpu: use metadata blocks to update the chunk contig hint The largest free region will either be a block level contig hint or an aggregate over the left_free and right_free areas of blocks. This is a much smaller set of free areas that need to be checked than a full traverse. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-07-26 17:41:06 -04:00
Dennis Zhou (Facebook)	b185cd0dc6	percpu: update free path to take advantage of contig hints The bitmap allocator must keep metadata consistent. The easiest way is to scan after every allocation for each affected block and the entire chunk. This is rather expensive. The free path can take advantage of current contig hints to prevent scanning within the start and end block. If a scan is needed, it can be done by scanning backwards from the start and forwards from the end to identify the entire free area this can be combined with. The blocks can then be updated by some basic checks rather than complete block scans. A chunk scan happens when the freed area makes a page free, a block free, or spans across blocks. This is necessary as the contig hint at this point could span across blocks. The check uses the minimum of page size and the block size to allow for variable sized blocks. There is a tradeoff here with not updating after every free. It is possible a contig hint in one block can be merged with the contig hint in the next block. This means the contig hint can be off by up to a page. However, if the chunk's contig hint is contained in one block, the contig hint will be accurate. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-07-26 17:41:06 -04:00
Dennis Zhou (Facebook)	fc3043345a	percpu: update alloc path to only scan if contig hints are broken Metadata is kept per block to keep track of where the contig hints are. Scanning can be avoided when the contig hints are not broken. In that case, left and right contigs have to be managed manually. This patch changes the allocation path hint updating to only scan when contig hints are broken. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-07-26 17:41:06 -04:00
Dennis Zhou (Facebook)	268625a6f9	percpu: keep track of the best offset for contig hints This patch makes the contig hint starting offset optimization from the previous patch as honest as it can be. For both chunk and block starting offsets, make sure it keeps the starting offset with the best alignment. The block skip optimization is added in a later patch when the pcpu_find_block_fit iterator is swapped in. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-07-26 17:41:05 -04:00
Dennis Zhou (Facebook)	13f966373f	percpu: skip chunks if the alloc does not fit in the contig hint This patch adds chunk->contig_bits_start to keep track of the contig hint's offset and the check to skip the chunk if it does not fit. If the chunk's contig hint starting offset cannot satisfy an allocation, the allocator assumes there is enough memory pressure in this chunk to either use a different chunk or create a new one. This accepts a less tight packing for a smoother latency curve. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-07-26 17:41:05 -04:00
Dennis Zhou (Facebook)	86b442fbce	percpu: add first_bit to keep track of the first free in the bitmap This patch adds first_bit to keep track of the first free bit in the bitmap. This hint helps prevent scanning of fully allocated blocks. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-07-26 17:41:05 -04:00
Dennis Zhou (Facebook)	ca460b3c96	percpu: introduce bitmap metadata blocks This patch introduces the bitmap metadata blocks and adds the skeleton of the code that will be used to maintain these blocks. Each chunk's bitmap is made up of full metadata blocks. These blocks maintain basic metadata to help prevent scanning unnecessarily to update hints. Full scanning methods are used for the skeleton and will be replaced in the coming patches. A number of helper functions are added as well to do conversion of pages to blocks and manage offsets. Comments will be updated as the final version of each function is added. There exists a relationship between PAGE_SIZE, PCPU_BITMAP_BLOCK_SIZE, the region size, and unit_size. Every chunk's region (including offsets) is page aligned at the beginning to preserve alignment. The end is aligned to LCM(PAGE_SIZE, PCPU_BITMAP_BLOCK_SIZE) to ensure that the end can fit with the populated page map which is by page and every metadata block is fully accounted for. The unit_size is already page aligned, but must also be aligned with PCPU_BITMAP_BLOCK_SIZE to ensure full metadata blocks. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-07-26 17:41:05 -04:00
Dennis Zhou (Facebook)	40064aeca3	percpu: replace area map allocator with bitmap The percpu memory allocator is experiencing scalability issues when allocating and freeing large numbers of counters as in BPF. Additionally, there is a corner case where iteration is triggered over all chunks if the contig_hint is the right size, but wrong alignment. This patch replaces the area map allocator with a basic bitmap allocator implementation. Each subsequent patch will introduce new features and replace full scanning functions with faster non-scanning options when possible. Implementation: This patchset removes the area map allocator in favor of a bitmap allocator backed by metadata blocks. The primary goal is to provide consistency in performance and memory footprint with a focus on small allocations (< 64 bytes). The bitmap removes the heavy memmove from the freeing critical path and provides a consistent memory footprint. The metadata blocks provide a bound on the amount of scanning required by maintaining a set of hints. In an effort to make freeing fast, the metadata is updated on the free path if the new free area makes a page free, a block free, or spans across blocks. This causes the chunk's contig hint to potentially be smaller than what it could allocate by up to the smaller of a page or a block. If the chunk's contig hint is contained within a block, a check occurs and the hint is kept accurate. Metadata is always kept accurate on allocation, so there will not be a situation where a chunk has a later contig hint than available. Evaluation: I have primarily done testing against a simple workload of allocation of 1 million objects (2^20) of varying size. Deallocation was done in order, alternating, and in reverse. These numbers were collected after rebasing on top of `a80099a152`.
I present the worst-case numbers here: Area Map Allocator: Object Size \| Alloc Time (ms) \| Free Time (ms) ---------------------------------------------- 4B \| 310 \| 4770 16B \| 557 \| 1325 64B \| 436 \| 273 256B \| 776 \| 131 1024B \| 3280 \| 122 Bitmap Allocator: Object Size \| Alloc Time (ms) \| Free Time (ms) ---------------------------------------------- 4B \| 490 \| 70 16B \| 515 \| 75 64B \| 610 \| 80 256B \| 950 \| 100 1024B \| 3520 \| 200 This data demonstrates the inability for the area map allocator to handle less than ideal situations. In the best case of reverse deallocation, the area map allocator was able to perform within range of the bitmap allocator. In the worst case situation, freeing took nearly 5 seconds for 1 million 4-byte objects. The bitmap allocator dramatically improves the consistency of the free path. The small allocations performed nearly identical regardless of the freeing pattern. While it does add to the allocation latency, the allocation scenario here is optimal for the area map allocator. The area map allocator runs into trouble when it is allocating in chunks where the latter half is full. It is difficult to replicate this, so I present a variant where the pages are second half filled. Freeing was done sequentially. Below are the numbers for this scenario: Area Map Allocator: Object Size \| Alloc Time (ms) \| Free Time (ms) ---------------------------------------------- 4B \| 4118 \| 4892 16B \| 1651 \| 1163 64B \| 598 \| 285 256B \| 771 \| 158 1024B \| 3034 \| 160 Bitmap Allocator: Object Size \| Alloc Time (ms) \| Free Time (ms) ---------------------------------------------- 4B \| 481 \| 67 16B \| 506 \| 69 64B \| 636 \| 75 256B \| 892 \| 90 1024B \| 3262 \| 147 The data shows a parabolic curve of performance for the area map allocator. 
This is due to the memmove operation being the dominant cost with the lower object sizes, as more objects are packed in a chunk, and at higher object sizes, the traversal of the chunk slots is the dominating cost. The bitmap allocator suffers this problem as well. The above data shows the inability to scale for the allocation path with the area map allocator and that the bitmap allocator demonstrates consistent performance in general. The second problem of additional scanning can result in the area map allocator completing in 52 minutes when trying to allocate 1 million 4-byte objects with 8-byte alignment. The same workload takes approximately 16 seconds to complete for the bitmap allocator. V2: Fixed a bug in pcpu_alloc_first_chunk where end_offset was setting the bitmap using bytes instead of bits. Added a comment to pcpu_cnt_pop_pages to explain bitmap_weight. Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2017-07-26 17:41:05 -04:00
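The central idea in the percpu commit above, tracking allocation units in a bitmap so that a free is a bit clear rather than a memmove over an area map, can be illustrated with a small user-space sketch. This is not the kernel implementation: the names `toy_chunk`, `toy_mask`, `toy_alloc`, and `toy_free`, and the 64-unit chunk size, are hypothetical, and the sketch does a plain first-fit scan without the metadata blocks or contig hints the commit describes.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_UNITS 64u	/* hypothetical: allocation units tracked per chunk */

/* Hypothetical chunk: bit i of alloc_map is set when unit i is in use. */
struct toy_chunk {
	uint64_t alloc_map;
};

/* Build a mask of len consecutive bits starting at bit start. */
static uint64_t toy_mask(unsigned int start, unsigned int len)
{
	uint64_t m;

	assert(len >= 1 && len <= TOY_UNITS && start <= TOY_UNITS - len);
	m = (len == TOY_UNITS) ? ~UINT64_C(0) : ((UINT64_C(1) << len) - 1);
	return m << start;
}

/*
 * First-fit scan over candidate offsets that are multiples of align.
 * Returns the starting unit on success, or -1 if no free run fits.
 */
static int toy_alloc(struct toy_chunk *c, unsigned int len, unsigned int align)
{
	unsigned int start;

	if (len == 0 || len > TOY_UNITS || align == 0)
		return -1;

	for (start = 0; start <= TOY_UNITS - len; start += align) {
		uint64_t m = toy_mask(start, len);

		if ((c->alloc_map & m) == 0) {
			c->alloc_map |= m;	/* mark the run as in use */
			return (int)start;
		}
	}
	return -1;
}

/* Freeing only clears bits; nothing has to be shifted or compacted. */
static void toy_free(struct toy_chunk *c, unsigned int start, unsigned int len)
{
	if (len == 0 || start >= TOY_UNITS || len > TOY_UNITS - start)
		return;
	c->alloc_map &= ~toy_mask(start, len);
}
```

In this sketch the cost of a free is independent of how many other objects live in the chunk, which is the property the free-time columns in the tables point at. The scan in `toy_alloc` is still linear; the commit message says later patches replace full scans with hint-based lookups.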
Kees Cook	c7fea48876	lkdtm: Provide timing tests for atomic_t vs refcount_t While not a crash test, this does provide two tight atomic_t and refcount_t loops for performance comparisons: cd /sys/kernel/debug/provoke-crash; perf stat -B -- cat <(echo ATOMIC_TIMING) > DIRECT; perf stat -B -- cat <(echo REFCOUNT_TIMING) > DIRECT. Looking at CPU cycles is the best way to examine the fast-path (rather than instruction counts, since conditional jumps will be executed but will be negligible due to branch-prediction). Signed-off-by: Kees Cook <keescook@chromium.org>	2017-07-26 14:38:04 -07:00
Kees Cook	95925c99b9	lkdtm: Provide more complete coverage for REFCOUNT tests The existing REFCOUNT_* LKDTM tests were designed only for testing a narrow portion of CONFIG_REFCOUNT_FULL. This moves the tests to their own file and expands their testing to poke each boundary condition. Since the protections (CONFIG_REFCOUNT_FULL and x86-fast) use different saturation values and reach-zero behavior, those have to be build-time set so the tests can actually validate things are happening at the right places. Notably, the x86-fast protection will fail REFCOUNT_INC_ZERO and REFCOUNT_ADD_ZERO since those conditions are not checked (only overflow is critical to protecting refcount_t). CONFIG_REFCOUNT_FULL will warn for each REFCOUNT_*_NEGATIVE test since it provides zero-pinning behaviors (which allows it to pass REFCOUNT_INC_ZERO and REFCOUNT_ADD_ZERO). Signed-off-by: Kees Cook <keescook@chromium.org>	2017-07-26 14:38:03 -07:00
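The saturation and zero-pinning behaviors that the REFCOUNT tests above probe can be sketched in user-space C. This is an illustration under stated assumptions, not the kernel's refcount_t code: `toy_refcount`, `toy_inc_not_zero`, and `toy_dec_and_test` are hypothetical names, `TOY_SATURATED` is an assumed saturation value chosen for the sketch, and the counter is not atomic.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_SATURATED UINT_MAX	/* assumed saturation value for this sketch */

/* Hypothetical, non-atomic stand-in for a reference counter. */
struct toy_refcount {
	unsigned int val;
};

/*
 * Increment unless the counter is zero.
 * Returns false when the counter was zero (the object is treated as
 * already released and is not resurrected), true otherwise.
 */
static bool toy_inc_not_zero(struct toy_refcount *r)
{
	assert(r != NULL);
	if (r->val == 0)
		return false;		/* zero-pinned */
	if (r->val == TOY_SATURATED)
		return true;		/* saturated: stay pinned, never wrap */
	r->val++;
	return true;
}

/*
 * Decrement and report whether the counter reached zero.
 * A saturated or already-zero counter is left untouched.
 */
static bool toy_dec_and_test(struct toy_refcount *r)
{
	assert(r != NULL);
	if (r->val == TOY_SATURATED || r->val == 0)
		return false;
	r->val--;
	return r->val == 0;
}
```

The design choice shown here is that an overflowing counter leaks the object instead of wrapping to a small value, since a wrapped counter could later reach zero while references are still held.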
Julia Lawall	38c1c6a9f3	cpufreq: s5pv210: add missing of_node_put() for_each_compatible_node performs an of_node_get on each iteration, so a return from the loop requires an of_node_put. The semantic patch that fixes this problem is as follows (http://coccinelle.lip6.fr):

// <smpl>
@@
local idexpression n;
expression e,e1,e2;
statement S;
iterator i1;
iterator name for_each_compatible_node;
@@

 for_each_compatible_node(n,e1,e2) {
   ...
(
   of_node_put(n);
|
   e = n
|
   return n;
|
   i1(...,n,...) S
|
+  of_node_put(n);
?  return ...;
)
   ...
 }
// </smpl>

Additionally, call of_node_put on the previous value of np, obtained from of_find_compatible_node, that is no longer accessible at the point of the for_each_compatible_node. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-07-26 22:54:01 +02:00
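The rule stated in the commit above, that an iterator which takes a reference on each node requires a matching put on any early exit from the loop, can be shown with a user-space model. Everything here is hypothetical and only imitates the reference hand-off: `toy_node`, `toy_node_get`, `toy_node_put`, `toy_next`, `toy_for_each`, and `toy_has_node` are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical refcounted list node standing in for a device tree node. */
struct toy_node {
	const char *name;
	int refcount;
	struct toy_node *next;
};

static struct toy_node *toy_node_get(struct toy_node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void toy_node_put(struct toy_node *n)
{
	if (n) {
		assert(n->refcount > 0);
		n->refcount--;
	}
}

/*
 * Advance the iteration: take a reference on the next node and drop the
 * one held on the node we are leaving. The loop body therefore always
 * runs with one extra reference held on the current node.
 */
static struct toy_node *toy_next(struct toy_node *head, struct toy_node *from)
{
	struct toy_node *next = from ? from->next : head;

	toy_node_get(next);
	toy_node_put(from);
	return next;
}

#define toy_for_each(n, head) \
	for ((n) = toy_next((head), NULL); (n) != NULL; \
	     (n) = toy_next((head), (n)))

/* Returns 1 if a node with the given name exists, 0 otherwise. */
static int toy_has_node(struct toy_node *head, const char *name)
{
	struct toy_node *n;

	toy_for_each(n, head) {
		if (strcmp(n->name, name) == 0) {
			/* Early return: balance the iterator's get. */
			toy_node_put(n);
			return 1;
		}
	}
	return 0;	/* normal loop exit already dropped every reference */
}
```

Removing the `toy_node_put(n)` call before `return 1` leaves the matched node's count one higher than it started, which is the kind of leak the semantic patch looks for.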
Anna Schumaker	1e6f209515	NFS: Use raw NFS access mask in nfs4_opendata_access() Commit `bd8b244174` ("NFS: Store the raw NFS access mask in the inode's access cache") changed how the access results are stored after an access() call. An NFS v4 OPEN might have access bits returned with the opendata, so we should use the NFS4_ACCESS values when determining the return value in nfs4_opendata_access(). Fixes: `bd8b244174` ("NFS: Store the raw NFS access mask in the inode's access cache") Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Tested-by: Takashi Iwai <tiwai@suse.de>	2017-07-26 16:53:57 -04:00
Joel Fernandes	251accf985	cpufreq: schedutil: Use unsigned int for iowait boost Make iowait_boost and iowait_boost_max unsigned int, since their unit is kHz and this is consistent with struct cpufreq_policy. Also change the local variables in sugov_iowait_boost() to match this. Signed-off-by: Joel Fernandes <joelaf@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-07-26 22:52:13 +02:00


704772 Commits