Commit Graph

966273 Commits

Christoph Hellwig
250eec9e39 Documentation/hdio: fix up obscure bd_contains references
bd_contains is an implementation detail and should not be mentioned in
userspace API documentation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-25 08:18:57 -06:00
Jeffle Xu
3aab91774b block: remove unused BLK_QC_T_EAGAIN flag
commit 7b6620d7db ("block: remove REQ_NOWAIT_INLINE") removed the
REQ_NOWAIT_INLINE related code, but the diff wasn't applied to
blk_types.h somehow.

Then commit 2771cefeac ("block: remove the REQ_NOWAIT_INLINE flag")
removed the REQ_NOWAIT_INLINE flag while the BLK_QC_T_EAGAIN flag still
remains.

Fixes: 7b6620d7db ("block: remove REQ_NOWAIT_INLINE")
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-25 07:54:50 -06:00
Jens Axboe
163090c14a Merge branch 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-5.10/drivers
Pull MD updates from Song.

* 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
  md/raid10: improve discard request for far layout
  md/raid10: improve raid10 discard request
  md/raid10: pull codes that wait for blocked dev into one function
  md/raid10: extend r10bio devs to raid disks
  md: add md_submit_discard_bio() for submitting discard bio
  md: Simplify code with existing definition RESYNC_SECTORS in raid10.c
  md/raid5: reallocate page array after setting new stripe_size
  md/raid5: resize stripe_head when reshape array
  md/raid5: let multiple devices of stripe_head share page
  md/raid6: let async recovery function support different page offset
  md/raid6: let syndrome computor support different page offset
  md/raid5: convert to new xor compution interface
  md/raid5: add new xor function to support different page offset
  md/raid5: make async_copy_data() to support different page offset
  md/raid5: add a new member of offset into r5dev
  md: only calculate blocksize once and use i_blocksize()
2020-09-25 07:48:20 -06:00
Jens Axboe
f3cd485050 io_uring: ensure open/openat2 name is cleaned on cancelation
If we cancel these requests, we'll leak the memory associated with the
filename. Add them to the table of ops that need cleaning, if
REQ_F_NEED_CLEANUP is set.
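
A minimal sketch of the idea (helper name assumed, not the actual
fs/io_uring.c code):

  /* Drop the filename grabbed at prep time when an open/openat2
   * request is torn down without executing. */
  static void io_cleanup_open(struct io_kiocb *req)
  {
      if ((req->flags & REQ_F_NEED_CLEANUP) && req->open.filename)
          putname(req->open.filename);
  }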

Cc: stable@vger.kernel.org
Fixes: e62753e4e2 ("io_uring: call statx directly")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-25 07:41:46 -06:00
Jon Hunter
61f764c307 eeprom: at24: Support custom device names for AT24 EEPROMs
By using the label property, a more descriptive name can be populated
for the AT24 EEPROM's NVMEM device. Update the AT24 driver to check
whether the label property is present and, if so, use it as the name
for the NVMEM device. Note that when the 'label' property is present,
we do not want the NVMEM driver to append the 'devid' to the name, so
nvmem_config.id is initialised to NVMEM_DEVID_NONE.
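
A minimal sketch of the probe-time logic, using the generic device
property helpers (surrounding code assumed):

  if (device_property_present(dev, "label")) {
      /* A labelled EEPROM gets its own name, without a devid suffix. */
      nvmem_config.id = NVMEM_DEVID_NONE;
      err = device_property_read_string(dev, "label",
                                        &nvmem_config.name);
      if (err)
          return err;
  } else {
      nvmem_config.id = NVMEM_DEVID_AUTO;
      nvmem_config.name = dev_name(dev);
  }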

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2020-09-25 15:33:04 +02:00
Sean Christopherson
8d214c4816 KVM: x86: Reset MMU context if guest toggles CR4.SMAP or CR4.PKE
Reset the MMU context during kvm_set_cr4() if SMAP or PKE is toggled.
Recent commits to (correctly) not reload PDPTRs when SMAP/PKE are
toggled inadvertently skipped the MMU context reset due to the mask
of bits that triggers PDPTR loads also being used to trigger MMU context
resets.
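
A hedged sketch of the decoupling in kvm_set_cr4(), where pdptr_bits
stands for the existing mask of bits that force a PDPTR reload:

  unsigned long mmu_role_bits = pdptr_bits | X86_CR4_SMAP | X86_CR4_PKE;

  if ((cr4 ^ old_cr4) & mmu_role_bits)
      kvm_mmu_reset_context(vcpu);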

Fixes: 427890aff8 ("kvm: x86: Toggling CR4.SMAP does not load PDPTEs in PAE mode")
Fixes: cb957adb4e ("kvm: x86: Toggling CR4.PKE does not load PDPTEs in PAE mode")
Cc: Jim Mattson <jmattson@google.com>
Cc: Peter Shier <pshier@google.com>
Cc: Oliver Upton <oupton@google.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200923215352.17756-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-09-25 08:56:35 -04:00
Srinivas Kandagatla
709ec3f7fc slimbus: qcom-ngd-ctrl: disable ngd in qmi server down callback
The QMI new-server notification enables the NGD, but the delete-server
notification does not disable it.

This can lead to multiple instances of the NGD being enabled, so make
sure that we disable the NGD in the delete-server callback.
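
A sketch of the delete-server callback (struct layout and helper names
assumed from the driver, not verbatim):

  static void qcom_slim_ngd_qmi_del_server(struct qmi_handle *hdl,
                                           struct qmi_service *service)
  {
      struct qcom_slim_ngd_qmi *qmi =
          container_of(hdl, struct qcom_slim_ngd_qmi, svc_event_hdl);
      struct qcom_slim_ngd_ctrl *ctrl =
          container_of(qmi, struct qcom_slim_ngd_ctrl, qmi);

      qmi->svc_info.sq_node = 0;
      qmi->svc_info.sq_port = 0;

      /* Mirror the enable done in the new-server notification. */
      qcom_slim_ngd_enable(ctrl, false);
  }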

Fixes: 917809e228 ("slimbus: ngd: Add qcom SLIMBus NGD driver")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200925095520.27316-4-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-25 14:41:51 +02:00
Srinivas Kandagatla
df2c471c4a slimbus: core: do not enter to clock pause mode in core
Let the controller logic decide when to enter clock pause mode.
Entering pause mode during unregistration does not really make sense,
as the controller is going down completely at that point.

Fixes: 4b14e62ad3 ("slimbus: Add support for 'clock-pause' feature")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200925095520.27316-3-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-25 14:41:50 +02:00
Srinivas Kandagatla
f97769fde6 slimbus: core: check get_addr before removing laddr ida
A logical address can be assigned either by the SLIMBus controller or
by the core. The core uses an IDA in cases where the get_addr callback
is not provided by the controller.
The core already has this check while allocating from the IDA, but the
check is missing during absence reporting. This patch fixes that.
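
A sketch of the absence-reporting fix (field names assumed):

  /* Only core-assigned logical addresses live in the IDA; skip the
   * remove when the controller supplied its own get_laddr callback. */
  if (!ctrl->get_laddr)
      ida_simple_remove(&ctrl->laddr_ida, sbdev->laddr);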

Fixes: 46a2bb5a7f ("slimbus: core: Add slim controllers support")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20200925095520.27316-2-srinivas.kandagatla@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-25 14:41:50 +02:00
Greg Kroah-Hartman
5a487cf7ef Merge tag 'misc-habanalabs-next-2020-09-25' of git://people.freedesktop.org/~gabbayo/linux into char-misc-next

Oded writes:

This tag contains the following changes for kernel 5.10-rc1:

- release the kernel context object after we reset the device. This is
  needed to prevent a race where the firmware still has some in-flight
  transactions that require the kernel context (and its memory mappings) to
  be alive.

- replace constant numbers with defines in QMAN initialization of GAUDI

- correct an error message text and add a few debug messages to help debug
  issues that happen during context open and close.

* tag 'misc-habanalabs-next-2020-09-25' of git://people.freedesktop.org/~gabbayo/linux:
  habanalabs/gaudi: configure QMAN LDMA registers properly
  habanalabs: add notice of device not idle
  habanalabs: add debug messages for opening/closing context
  habanalabs: release kernel context after hw_fini
  habanalabs: correct an error message
2020-09-25 14:36:48 +02:00
Arnd Bergmann
1c954540c0 staging: vchiq: avoid mixing kernel and user pointers
As found earlier, there is a problem in the create_pagelist() function
that takes a pointer argument that either points into vmalloc space or
into user space, with the pointer value controlled by user space allowing
a malicious user to trick the driver into accessing the kernel instead.

Avoid this problem by adding another function argument and passing
kernel pointers separately from user pointers. This makes it possible
to rely on sparse to point out invalid conversions, and it prevents
user space from faking a kernel pointer.
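
A minimal sketch of the split-argument idea (hypothetical helper, not
the driver's actual code); exactly one of the two source pointers is
set, so sparse can flag any __user misuse:

  static int fill_from_either(void *dst, const void *kbuf,
                              const void __user *ubuf, size_t count)
  {
      if (kbuf) {
          memcpy(dst, kbuf, count);             /* kernel source */
          return 0;
      }
      if (copy_from_user(dst, ubuf, count))     /* user source */
          return -EFAULT;
      return 0;
  }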

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20200925114424.2647144-2-arnd@arndb.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-25 14:34:03 +02:00
Arnd Bergmann
4184da4f31 staging: vchiq: fix __user annotations
My earlier patches caused some new sparse warnings, but it turns out
that a number of those are actual bugs, or at least suspicious code.

Adding __user annotations to the data structures that are defined in
uapi headers helps avoid the new warnings, but that causes a different
set of warnings to show up, as some of these structures are used both
inside the kernel and at the user interface, but store pointers to
different things in each case.

Duplicating the vchiq_service_params and vchiq_completion_data structures
in turn takes care of most of those, and then it turns out that there
is a 'data' pointer that can be any of a __user address, a dma_addr_t
and a kernel pointer in vmalloc space at times.

I'm trying to annotate these as best I can without changing behavior,
but there still seems to be a serious bug when user space passes
a valid vmalloc space address instead of a user pointer. Adding
comments in the code there, and leaving the warnings in place that
seem to correspond to actual bugs.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20200925114424.2647144-1-arnd@arndb.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-25 14:34:03 +02:00
Peter Oskolkov
f166b111e0 rseq/selftests: Test MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ
Based on Google-internal RSEQ work done by Paul Turner and Andrew
Hunter.

This patch adds a selftest for MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ.
The test quite often fails without the previous patch in this
patchset, but consistently passes with it.

Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lkml.kernel.org/r/20200923233618.2572849-3-posk@google.com
2020-09-25 14:23:27 +02:00
Peter Oskolkov
ea366dd79c rseq/selftests,x86_64: Add rseq_offset_deref_addv()
This patch adds rseq_offset_deref_addv() function to
tools/testing/selftests/rseq/rseq-x86.h, to be used in a selftest in
the next patch in the patchset.

Once an architecture adds support for this function, it should define
"RSEQ_ARCH_HAS_OFFSET_DEREF_ADDV".

Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lkml.kernel.org/r/20200923233618.2572849-2-posk@google.com
2020-09-25 14:23:27 +02:00
Peter Oskolkov
2a36ab717e rseq/membarrier: Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ
This patchset is based on Google-internal RSEQ work done by Paul
Turner and Andrew Hunter.

When working with per-CPU RSEQ-based memory allocations, it is
sometimes important to make sure that a global memory location is no
longer accessed from RSEQ critical sections. For example, there can be
two per-CPU lists, one is "active" and accessed per-CPU, while another
one is inactive and worked on asynchronously "off CPU" (e.g.  garbage
collection is performed). Then at some point the two lists are
swapped, and a fast RCU-like mechanism is required to make sure that
the previously active list is no longer accessed.

This patch introduces such a mechanism: in short, membarrier() syscall
issues an IPI to a CPU, restarting a potentially active RSEQ critical
section on the CPU.
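
A hedged userspace usage sketch, assuming a kernel with this command
(registration first, as with the other private expedited commands):

  #include <linux/membarrier.h>
  #include <sys/syscall.h>
  #include <unistd.h>

  static int membarrier(int cmd, unsigned int flags, int cpu_id)
  {
      return syscall(__NR_membarrier, cmd, flags, cpu_id);
  }

  /* Call once at startup, then fence before reclaiming the old list. */
  int rseq_fence_setup(void)
  {
      return membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ,
                        0, 0);
  }

  int rseq_fence(void)
  {
      return membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ, 0, 0);
  }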

Signed-off-by: Peter Oskolkov <posk@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lkml.kernel.org/r/20200923233618.2572849-1-posk@google.com
2020-09-25 14:23:27 +02:00
Barry Song
233e7aca4c sched/fair: Use dst group while checking imbalance for NUMA balancer
Barry Song noted the following

	Something is wrong. In find_busiest_group(), we are checking if
	src has higher load, however, in task_numa_find_cpu(), we are
	checking if dst will have higher load after balancing. It seems
	it is not sensible to check src.

	It may cause a wrong imbalance value, for example,

	if dst_running = env->dst_stats.nr_running + 1 results in 3 or
	above, and src_running = env->src_stats.nr_running - 1 results
	in 1;

	The current code computes the imbalance as 0 since src_running
	is smaller than 2. This is inconsistent with the load balancer.

Basically, in find_busiest_group(), the NUMA imbalance is ignored if moving
a task "from an almost idle domain" to a "domain with spare capacity". This
patch forbids movement "from a misplaced domain" to "an almost idle domain"
as that is closer to what the CPU load balancer expects.

This patch is not a universal win. The old behaviour was intended to allow
a task from an almost idle NUMA node to migrate to its preferred node if
the destination had capacity but there are corner cases.  For example,
a NAS compute load could be parallelised to use 1/3rd of available CPUs
but not all those potential tasks are active at all times, allowing
this logic to trigger. An obvious example is specjbb 2005 running
various numbers of warehouses on a 2-socket box with 80 CPUs.

specjbb
                               5.9.0-rc4              5.9.0-rc4
                                 vanilla        dstbalance-v1r1
Hmean     tput-1     46425.00 (   0.00%)    43394.00 *  -6.53%*
Hmean     tput-2     98416.00 (   0.00%)    96031.00 *  -2.42%*
Hmean     tput-3    150184.00 (   0.00%)   148783.00 *  -0.93%*
Hmean     tput-4    200683.00 (   0.00%)   197906.00 *  -1.38%*
Hmean     tput-5    236305.00 (   0.00%)   245549.00 *   3.91%*
Hmean     tput-6    281559.00 (   0.00%)   285692.00 *   1.47%*
Hmean     tput-7    338558.00 (   0.00%)   334467.00 *  -1.21%*
Hmean     tput-8    340745.00 (   0.00%)   372501.00 *   9.32%*
Hmean     tput-9    424343.00 (   0.00%)   413006.00 *  -2.67%*
Hmean     tput-10   421854.00 (   0.00%)   434261.00 *   2.94%*
Hmean     tput-11   493256.00 (   0.00%)   485330.00 *  -1.61%*
Hmean     tput-12   549573.00 (   0.00%)   529959.00 *  -3.57%*
Hmean     tput-13   593183.00 (   0.00%)   555010.00 *  -6.44%*
Hmean     tput-14   588252.00 (   0.00%)   599166.00 *   1.86%*
Hmean     tput-15   623065.00 (   0.00%)   642713.00 *   3.15%*
Hmean     tput-16   703924.00 (   0.00%)   660758.00 *  -6.13%*
Hmean     tput-17   666023.00 (   0.00%)   697675.00 *   4.75%*
Hmean     tput-18   761502.00 (   0.00%)   758360.00 *  -0.41%*
Hmean     tput-19   796088.00 (   0.00%)   798368.00 *   0.29%*
Hmean     tput-20   733564.00 (   0.00%)   823086.00 *  12.20%*
Hmean     tput-21   840980.00 (   0.00%)   856711.00 *   1.87%*
Hmean     tput-22   804285.00 (   0.00%)   872238.00 *   8.45%*
Hmean     tput-23   795208.00 (   0.00%)   889374.00 *  11.84%*
Hmean     tput-24   848619.00 (   0.00%)   966783.00 *  13.92%*
Hmean     tput-25   750848.00 (   0.00%)   903790.00 *  20.37%*
Hmean     tput-26   780523.00 (   0.00%)   962254.00 *  23.28%*
Hmean     tput-27  1042245.00 (   0.00%)   991544.00 *  -4.86%*
Hmean     tput-28  1090580.00 (   0.00%)  1035926.00 *  -5.01%*
Hmean     tput-29   999483.00 (   0.00%)  1082948.00 *   8.35%*
Hmean     tput-30  1098663.00 (   0.00%)  1113427.00 *   1.34%*
Hmean     tput-31  1125671.00 (   0.00%)  1134175.00 *   0.76%*
Hmean     tput-32   968167.00 (   0.00%)  1250286.00 *  29.14%*
Hmean     tput-33  1077676.00 (   0.00%)  1060893.00 *  -1.56%*
Hmean     tput-34  1090538.00 (   0.00%)  1090933.00 *   0.04%*
Hmean     tput-35   967058.00 (   0.00%)  1107421.00 *  14.51%*
Hmean     tput-36  1051745.00 (   0.00%)  1210663.00 *  15.11%*
Hmean     tput-37  1019465.00 (   0.00%)  1351446.00 *  32.56%*
Hmean     tput-38  1083102.00 (   0.00%)  1064541.00 *  -1.71%*
Hmean     tput-39  1232990.00 (   0.00%)  1303623.00 *   5.73%*
Hmean     tput-40  1175542.00 (   0.00%)  1340943.00 *  14.07%*
Hmean     tput-41  1127826.00 (   0.00%)  1339492.00 *  18.77%*
Hmean     tput-42  1198313.00 (   0.00%)  1411023.00 *  17.75%*
Hmean     tput-43  1163733.00 (   0.00%)  1228253.00 *   5.54%*
Hmean     tput-44  1305562.00 (   0.00%)  1357886.00 *   4.01%*
Hmean     tput-45  1326752.00 (   0.00%)  1406061.00 *   5.98%*
Hmean     tput-46  1339424.00 (   0.00%)  1418451.00 *   5.90%*
Hmean     tput-47  1415057.00 (   0.00%)  1381570.00 *  -2.37%*
Hmean     tput-48  1392003.00 (   0.00%)  1421167.00 *   2.10%*
Hmean     tput-49  1408374.00 (   0.00%)  1418659.00 *   0.73%*
Hmean     tput-50  1359822.00 (   0.00%)  1391070.00 *   2.30%*
Hmean     tput-51  1414246.00 (   0.00%)  1392679.00 *  -1.52%*
Hmean     tput-52  1432352.00 (   0.00%)  1354020.00 *  -5.47%*
Hmean     tput-53  1387563.00 (   0.00%)  1409563.00 *   1.59%*
Hmean     tput-54  1406420.00 (   0.00%)  1388711.00 *  -1.26%*
Hmean     tput-55  1438804.00 (   0.00%)  1387472.00 *  -3.57%*
Hmean     tput-56  1399465.00 (   0.00%)  1400296.00 *   0.06%*
Hmean     tput-57  1428132.00 (   0.00%)  1396399.00 *  -2.22%*
Hmean     tput-58  1432385.00 (   0.00%)  1386253.00 *  -3.22%*
Hmean     tput-59  1421612.00 (   0.00%)  1371416.00 *  -3.53%*
Hmean     tput-60  1429423.00 (   0.00%)  1389412.00 *  -2.80%*
Hmean     tput-61  1396230.00 (   0.00%)  1351122.00 *  -3.23%*
Hmean     tput-62  1418396.00 (   0.00%)  1383098.00 *  -2.49%*
Hmean     tput-63  1409918.00 (   0.00%)  1374662.00 *  -2.50%*
Hmean     tput-64  1410236.00 (   0.00%)  1376216.00 *  -2.41%*
Hmean     tput-65  1396405.00 (   0.00%)  1364418.00 *  -2.29%*
Hmean     tput-66  1395975.00 (   0.00%)  1357326.00 *  -2.77%*
Hmean     tput-67  1392986.00 (   0.00%)  1349642.00 *  -3.11%*
Hmean     tput-68  1386541.00 (   0.00%)  1343261.00 *  -3.12%*
Hmean     tput-69  1374407.00 (   0.00%)  1342588.00 *  -2.32%*
Hmean     tput-70  1377513.00 (   0.00%)  1334654.00 *  -3.11%*
Hmean     tput-71  1369319.00 (   0.00%)  1334952.00 *  -2.51%*
Hmean     tput-72  1354635.00 (   0.00%)  1329005.00 *  -1.89%*
Hmean     tput-73  1350933.00 (   0.00%)  1318942.00 *  -2.37%*
Hmean     tput-74  1351714.00 (   0.00%)  1316347.00 *  -2.62%*
Hmean     tput-75  1352198.00 (   0.00%)  1309974.00 *  -3.12%*
Hmean     tput-76  1349490.00 (   0.00%)  1286064.00 *  -4.70%*
Hmean     tput-77  1336131.00 (   0.00%)  1303684.00 *  -2.43%*
Hmean     tput-78  1308896.00 (   0.00%)  1271024.00 *  -2.89%*
Hmean     tput-79  1326703.00 (   0.00%)  1290862.00 *  -2.70%*
Hmean     tput-80  1336199.00 (   0.00%)  1291629.00 *  -3.34%*

The performance at the mid-point is better but not universally better. The
patch is a mixed bag depending on the workload, machine and overall
levels of utilisation. Sometimes it's better (sometimes much better),
other times it is worse (sometimes much worse). Given that there isn't
a universally good decision in this area and more people seem to prefer
the patch, it may be best to keep the LB decisions consistent and
revisit imbalance handling when the load balancer code changes settle down.

Jirka Hladky added the following observation.

	Our results are mostly in line with what you see. We observe
	big gains (20-50%) when the system is loaded to 1/3 of the
	maximum capacity and mixed results at the full load - some
	workloads benefit from the patch at the full load, others not,
	but performance changes at the full load are mostly within the
	noise of results (+/-5%). Overall, we think this patch is helpful.

[mgorman@techsingularity.net: Rewrote changelog]
Fixes: fb86f5b211 ("sched/numa: Use similar logic to the load balancer for moving between domains with spare capacity")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200921221849.GI3179@techsingularity.net
2020-09-25 14:23:26 +02:00
Vincent Guittot
6e7499135d sched/fair: Reduce busy load balance interval
The busy_factor, which increases the load balance interval when a CPU
is busy, is set to 32 by default. This value generates some huge LB
intervals on a large system like the THX2, made of 2 nodes x 28 cores
x 4 threads. For such a system, the interval increases from 112ms to
3584ms at MC level, and from 228ms to 7168ms at NUMA level.

Even on smaller systems, a lower busy factor has shown improvement in
the fair distribution of the running time, so reduce it for all.
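
For reference, a sketch of the default change in sd_init()
(kernel/sched/topology.c; surrounding initializer elided):

  .busy_factor        = 16,    /* previously 32 */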

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Phil Auld <pauld@redhat.com>
Link: https://lkml.kernel.org/r/20200921072424.14813-5-vincent.guittot@linaro.org
2020-09-25 14:23:26 +02:00
Vincent Guittot
e4d32e4d54 sched/fair: Minimize concurrent LBs between domain level
Sched domains tend to trigger the load balance loop simultaneously,
but the larger domains often need more time to collect statistics. This
slowness makes the larger domain try to detach tasks from a rq whereas
tasks have already migrated somewhere else at a sub-domain level. This
is not a real problem for idle LB because the period of smaller domains
will increase while their CPUs are busy, which leaves time for the
higher levels to pull tasks. But it becomes a problem when all CPUs are
already busy, because then all domains stay synced when they trigger
their LB.

A simple way to minimize simultaneous LB of all domains is to decrement
the busy interval by 1 jiffy. Because of the busy_factor, the interval
of a larger domain will no longer be a multiple of the smaller ones.
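
A hedged sketch of get_sd_balance_interval() with the offset applied:

  static inline unsigned long
  get_sd_balance_interval(struct sched_domain *sd, int cpu_busy)
  {
      unsigned long interval = sd->balance_interval;

      if (cpu_busy)
          interval *= sd->busy_factor;

      /* scale ms to jiffies */
      interval = msecs_to_jiffies(interval);

      /* Keep busy periods of different levels from being exact
       * multiples of each other. */
      if (cpu_busy)
          interval -= 1;

      return clamp(interval, 1UL, max_load_balance_interval);
  }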

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Phil Auld <pauld@redhat.com>
Link: https://lkml.kernel.org/r/20200921072424.14813-4-vincent.guittot@linaro.org
2020-09-25 14:23:26 +02:00
Vincent Guittot
2208cdaa56 sched/fair: Reduce minimal imbalance threshold
The 25% default imbalance threshold for the DIE and NUMA domains is
large enough to generate significant unfairness between threads. A
typical example is the case of 11 threads running on 2x4 CPUs. The
imbalance of 20% between the 2 groups of 4 cores is just low enough not
to trigger the load balance between the 2 groups. We will always have
the same 6 threads on one group of 4 CPUs and the other 5 threads on
the other group of CPUs. With fair time sharing in each group, we end
up with +20% running time for the group of 5 threads.

Decrease the imbalance threshold for the overloaded case, where we use
the load to balance tasks and to ensure fair time sharing.
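
A sketch of the default change in sd_init() (kernel/sched/topology.c;
surrounding initializer elided):

  .imbalance_pct      = 117,    /* previously 125, i.e. a 25% threshold */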

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Phil Auld <pauld@redhat.com>
Acked-by: Hillf Danton <hdanton@sina.com>
Link: https://lkml.kernel.org/r/20200921072424.14813-3-vincent.guittot@linaro.org
2020-09-25 14:23:26 +02:00
Vincent Guittot
5a7f555904 sched/fair: Relax constraint on task's load during load balance
Some use cases, like 9 always-running tasks on 8 CPUs, can't be
balanced, and the load balancer currently migrates the waiting task
between the CPUs in an almost random manner. The success of a rq
pulling a task depends on the value of nr_balance_failed of its domains
and on its ability to detach the task faster than others. This behavior
results in an unfair distribution of the running time between tasks
because some CPUs will run most of the time, if not always, the same
task whereas others will share their time between several tasks.

Instead of using nr_balance_failed as a boolean to relax the condition
for detaching a task, the LB will use nr_balance_failed to relax the
threshold between the task's load and the imbalance. This mechanism
prevents the same rq or domain from always winning the load balance
fight.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Phil Auld <pauld@redhat.com>
Link: https://lkml.kernel.org/r/20200921072424.14813-2-vincent.guittot@linaro.org
2020-09-25 14:23:25 +02:00
Xianting Tian
fe7491580d sched/fair: Remove the force parameter of update_tg_load_avg()
In fair.c, sometimes update_tg_load_avg(cfs_rq, 0) is used and
sometimes update_tg_load_avg(cfs_rq, false) is used.
update_tg_load_avg() has the parameter force, but the current code
never passes 1 or true for it, so remove the force parameter.

Signed-off-by: Xianting Tian <tian.xianting@h3c.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200924014755.36253-1-tian.xianting@h3c.com
2020-09-25 14:23:25 +02:00
Xunlei Pang
df3cb4ea1f sched/fair: Fix wrong cpu selecting from isolated domain
We've met problems where tasks with a full cpumask (e.g. after being
put into a cpuset or set to full affinity) were occasionally migrated
to our isolated CPUs in the production environment.

After some analysis, we found that it is due to the current
select_idle_smt() not considering the sched_domain mask.

Steps to reproduce on my 31-CPU hyperthreaded machine:
1. with boot parameter: "isolcpus=domain,2-31"
   (thread lists: 0,16 and 1,17)
2. cgcreate -g cpu:test; cgexec -g cpu:test "test_threads"
3. some threads will be migrated to the isolated cpu16~17.

Fix it by checking the valid domain mask in select_idle_smt().
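
A hedged sketch of select_idle_smt() with the added span check:

  static int select_idle_smt(struct task_struct *p,
                             struct sched_domain *sd, int target)
  {
      int cpu;

      if (!static_branch_likely(&sched_smt_present))
          return -1;

      for_each_cpu(cpu, cpu_smt_mask(target)) {
          if (!cpumask_test_cpu(cpu, p->cpus_ptr) ||
              !cpumask_test_cpu(cpu, sched_domain_span(sd)))
              continue;    /* skip siblings outside the domain */
          if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
              return cpu;
      }

      return -1;
  }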

Fixes: 10e2f1acd0 ("sched/core: Rewrite and improve select_idle_siblings()")
Reported-by: Wetp Zhang <wetp.zy@linux.alibaba.com>
Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/1600930127-76857-1-git-send-email-xlpang@linux.alibaba.com
2020-09-25 14:23:25 +02:00
YueHaibing
51bd5121c4 sched: Remove unused inline function uclamp_bucket_base_value()
There are no callers in the tree, so it can be removed.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lkml.kernel.org/r/20200922132410.48440-1-yuehaibing@huawei.com
2020-09-25 14:23:25 +02:00
Daniel Bristot de Oliveira
2586af1ac1 sched/rt: Disable RT_RUNTIME_SHARE by default
The RT_RUNTIME_SHARE sched feature enables the sharing of rt_runtime
between CPUs, allowing a CPU to run a real-time task up to 100% of the
time while leaving more space for non-real-time tasks to run on the CPU
that lent rt_runtime.

The problem is that a CPU can easily borrow enough rt_runtime to allow
a spinning rt-task to run forever, starving per-cpu tasks like kworkers,
which are non-real-time by design.

This patch disables RT_RUNTIME_SHARE by default, avoiding this problem.
The feature will still be present for users that want to enable it,
though.
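
The change itself is a one-line default flip in
kernel/sched/features.h (sketch):

  SCHED_FEAT(RT_RUNTIME_SHARE, false)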

Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Wei Wang <wvw@google.com>
Link: https://lkml.kernel.org/r/b776ab46817e3db5d8ef79175fa0d71073c051c7.1600697903.git.bristot@redhat.com
2020-09-25 14:23:24 +02:00
Lucas Stach
46fcc4b00c sched/deadline: Fix stale throttling on de-/boosted tasks
When a boosted task gets throttled, what normally happens is that it's
immediately enqueued again with ENQUEUE_REPLENISH, which replenishes the
runtime and clears the dl_throttled flag. There is a special case however:
if the throttling happened on sched-out and the task has been deboosted in
the meantime, the replenish is skipped as the task will return to its
normal scheduling class. This leaves the task with the dl_throttled flag
set.

Now if the task gets boosted up to the deadline scheduling class again
while it is sleeping, it's still in the throttled state. The normal wakeup
however will enqueue the task with ENQUEUE_REPLENISH not set, so we don't
actually place it on the rq. Thus we end up with a task that is runnable,
but not actually on the rq; neither an immediate replenishment happens,
nor is the replenishment timer set up, so the task is stuck in
forever-throttled limbo.

Clear the dl_throttled flag before dropping back to the normal scheduling
class to fix this issue.
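
A hedged sketch of the shape of the fix on the deboost path
(surrounding logic elided):

  if (!dl_prio(p->normal_prio)) {
      /* Returning to the original class: skipping the replenish is
       * fine, but the throttled flag must not survive the switch. */
      p->dl.dl_throttled = 0;
      return;
  }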

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20200831110719.2126930-1-l.stach@pengutronix.de
2020-09-25 14:23:24 +02:00
Vincent Guittot
8e0e0eda6a sched/numa: Use runnable_avg to classify node
Use runnable_avg to classify numa node state similarly to what is done for
normal load balancer. This helps to ensure that numa and normal balancers
use the same view of the state of the system.

Large arm64 system: 2 nodes / 224 CPUs:

  hackbench -l (256000/#grp) -g #grp

  grp    tip/sched/core         +patchset              improvement
  1      14.008(+/- 4.99 %)     13.800(+/- 3.88 %)     1.48 %
  4       4.340(+/- 5.35 %)      4.283(+/- 4.85 %)     1.33 %
  16      3.357(+/- 0.55 %)      3.359(+/- 0.54 %)    -0.06 %
  32      3.050(+/- 0.94 %)      3.039(+/- 1.06 %)     0.38 %
  64      2.968(+/- 1.85 %)      3.006(+/- 2.92 %)    -1.27 %
  128     3.290(+/-12.61 %)      3.108(+/- 5.97 %)     5.51 %
  256     3.235(+/- 3.95 %)      3.188(+/- 2.83 %)     1.45 %

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/20200921072959.16317-1-vincent.guittot@linaro.org
2020-09-25 14:23:24 +02:00
Liu Shixin
b942fc0319 RDMA/mlx5: Fix type warning of sizeof in __mlx5_ib_alloc_counters()
sizeof() here should be applied so that it gives the size of the
pointed-to data, even if that data is itself a pointer.
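
An illustration of the pattern (not the driver's code):

  const char **names;

  names = kcalloc(num, sizeof(*names), GFP_KERNEL);    /* element size */
  /* kcalloc(num, sizeof(names), ...) would use the size of the array
   * pointer itself: the same value here, but wrong as soon as the
   * element type changes. */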

Fixes: e1f24a79f4 ("IB/mlx5: Support congestion related counters")
Link: https://lore.kernel.org/r/20200917081354.2083293-1-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-25 09:17:42 -03:00
Krzysztof Kozlowski
197bbae9ed arm64: dts: ti: k3-j721e-common-proc-board: align GPIO hog names with dtschema
The convention for node names is to use hyphens, not underscores.
dtschema for pca95xx expects GPIO hog names to end with a 'hog' suffix.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Nishanth Menon <nm@ti.com>
Link: https://lore.kernel.org/r/20200916155715.21009-7-krzk@kernel.org
2020-09-25 06:59:31 -05:00
Ofir Bitton
25121d9804 habanalabs/gaudi: configure QMAN LDMA registers properly
LDMA registers are configured with fixed values. We add a new set of
defines which gives the configuration a proper meaning.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-25 14:44:21 +03:00
Oded Gabbay
eab1f6e7b0 habanalabs: add notice of device not idle
The device should be idle after a context is closed. If not, print a
notice.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-25 14:44:21 +03:00
Oded Gabbay
3c3aa5dbd6 habanalabs: add debug messages for opening/closing context
During debugging of error we sometimes need to know whether the error
happened when a user context was open. Add debug prints when opening and
closing user contexts.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-25 14:44:20 +03:00
Oded Gabbay
9e2e8fc7d6 habanalabs: release kernel context after hw_fini
Some engines use resources that belong to the kernel context (e.g. MMU
mappings). In case halt-engines doesn't work properly due to a H/W
restriction, we need to make sure the kernel context lives on until
after hw_fini. hw_fini resets the ASIC; after that, no engine is alive
and we can safely close the kernel context.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-25 14:44:20 +03:00
Oded Gabbay
fc6121e961 habanalabs: correct an error message
We don't try to allocate huge pages here, so remove the word 'huge'.

Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2020-09-25 14:44:20 +03:00
Krzysztof Kozlowski
5e7998b801 ARM: dts: am3874: iceboard: fix GPIO expander reset GPIOs
Correct the property for reset GPIOs of the GPIO expander.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2020-09-25 14:38:31 +03:00
Krzysztof Kozlowski
ccd73f07e0 ARM: dts: am335x: t335: align GPIO hog names with dtschema
The convention for node names is to use hyphens, not underscores.
dtschema for pca95xx expects GPIO hog names to end with a 'hog' suffix.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2020-09-25 14:37:30 +03:00
Krzysztof Kozlowski
97b16ed103 ARM: dts: am335x: lxm: fix PCA9539 GPIO expander properties
The PCA9539 GPIO expander requires GPIO controller properties to operate
properly.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2020-09-25 14:36:11 +03:00
Grygorii Strashko
8cbe7afc92 ARM: dts: am437x-l4: drop legacy cpsw dt node
All am437x boards have been converted to use new driver, so drop legacy
cpsw dt node.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2020-09-25 14:31:12 +03:00
Grygorii Strashko
aff7e5038c ARM: dts: am437x: switch to new cpsw switch drv
The dual_mac mode is preserved the same way between the legacy and new
drivers, and a one-port device works the same as one dual_mac port, so
it's safe to switch drivers. Switch all am437x boards to use the new
cpsw switch driver. Those boards have either 2 external ports wired and
configured in dual_mac mode by default, or only 1 external port.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2020-09-25 14:31:05 +03:00
Grygorii Strashko
7bf8f37aea ARM: dts: am437x-l4: add dt node for new cpsw switchdev driver
Add DT node for the new cpsw switchdev based driver.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2020-09-25 14:30:58 +03:00
Krzysztof Kozlowski
94d4c3cffe mmc: sdhci-s3c: hide forward declaration of of_device_id behind CONFIG_OF
The struct of_device_id is not defined with !CONFIG_OF, so its forward
declaration should be hidden as well. This addresses the following
clang compile warning:

  drivers/mmc/host/sdhci-s3c.c:464:34: warning: tentative array definition assumed to have one element
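
A sketch of the fix:

  #ifdef CONFIG_OF
  static const struct of_device_id sdhci_s3c_dt_match[];
  #endif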

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200925072532.10272-1-krzk@kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-25 13:30:52 +02:00
Krzysztof Kozlowski
0cb231f1e0 mmc: sdhci: fix indentation mistakes
Fix inconsistent indenting, reported by Smatch:

  drivers/mmc/host/sdhci-esdhc-imx.c:1380 sdhci_esdhc_imx_hwinit() warn: inconsistent indenting
  drivers/mmc/host/sdhci-sprd.c:390 sdhci_sprd_request_done() warn: inconsistent indenting

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200923153739.30327-2-krzk@kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-25 13:24:02 +02:00
Krzysztof Kozlowski
6b28f2c4da mmc: moxart: remove unneeded check for drvdata
The 'struct mmc_host *mmc' comes from drvdata set at the end of probe,
so it cannot be NULL. The code already dereferences it a few lines
before the check, with mmc_priv(). This also fixes a smatch warning:

  drivers/mmc/host/moxart-mmc.c:692 moxart_remove() warn: variable dereferenced before check 'mmc' (see line 688)

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20200923153739.30327-1-krzk@kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-25 13:24:02 +02:00
Wolfram Sang
fbb31330f9 mmc: renesas_sdhi: drop local flag for tuning
The MMC core has now a generic check if some tuning is in progress. Its
protected area is a bit larger than the custom one in this driver but we
concluded that this works equally well for the intended case. So, drop
the local flag and switch to the generic one.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/20200922172253.4458-1-wsa@kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-25 13:24:02 +02:00
Qinglang Miao
8dae6a249c mmc: rtsx_usb_sdmmc: simplify the return expression of sd_change_phase()
Simplify the return expression.

Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com>
Link: https://lore.kernel.org/r/20200921131042.92340-1-miaoqinglang@huawei.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-25 13:24:02 +02:00
Wolfram Sang
3439c588c2 mmc: core: document mmc_hw_reset()
Add documentation for mmc_hw_reset to make sure the intended use case is
clear.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Link: https://lore.kernel.org/r/20200918215446.65654-1-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2020-09-25 13:24:02 +02:00
Jason Gerecke
d9216d753b HID: wacom: Avoid entering wacom_wac_pen_report for pad / battery
It has recently been reported that the "heartbeat" report from devices
like the 2nd-gen Intuos Pro (PTH-460, PTH-660, PTH-860) or the 2nd-gen
Bluetooth-enabled Intuos tablets (CTL-4100WL, CTL-6100WL) can cause the
driver to send a spurious BTN_TOUCH=0 once per second in the middle of
drawing. This can result in broken lines while drawing on Chrome OS.

The source of the issue has been traced back to a change which modified
the driver to only call `wacom_wac_pad_report()` once per report instead
of once per collection. As part of this change, pad-handling code was
removed from `wacom_wac_collection()` under the assumption that the
`WACOM_PEN_FIELD` and `WACOM_TOUCH_FIELD` checks would not be satisfied
when a pad or battery collection was being processed.

To be clear, the macros `WACOM_PAD_FIELD` and `WACOM_PEN_FIELD` do not
currently check exclusive conditions. In fact, most "pad" fields will
also appear to be "pen" fields simply due to their presence inside of
a Digitizer application collection. Because of this, the removal of
the check from `wacom_wac_collection()` just causes pad / battery
collections to instead trigger a call to `wacom_wac_pen_report()`
instead. The pen report function in turn resets the tip switch state
just prior to exiting, resulting in the observed BTN_TOUCH=0 symptom.

To correct this, we restore a version of the `WACOM_PAD_FIELD` check
in `wacom_wac_collection()` and return early. This effectively prevents
pad / battery collections from being reported until the very end of the
report as originally intended.
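
A hedged sketch of the restored check in wacom_wac_collection():

  /* Leave pad/battery collections for the end-of-report handler
   * instead of falling through to the pen path. */
  if (WACOM_PAD_FIELD(field))
      return 0;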

Fixes: d4b8efeb46 ("HID: wacom: generic: Correct pad syncing")
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
Reviewed-by: Ping Cheng <ping.cheng@wacom.com>
Tested-by: Ping Cheng <ping.cheng@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2020-09-25 13:14:16 +02:00
Alex Hung
b226faab4e ACPI: video: use ACPI backlight for HP 635 Notebook
The default backlight interface is AMD's radeon_bl0 which does not
work on this system, so use the ACPI backlight interface on it
instead.

BugLink: https://bugs.launchpad.net/bugs/1894667
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Alex Hung <alex.hung@canonical.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-25 12:59:24 +02:00
Andy Shevchenko
399e08f1f0 MAINTAINERS: Use my kernel.org address for Intel PMIC work
Use my kernel.org address for Intel PMIC work. While here,
upgrade status to maintainer of PMIC MFD devices.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-25 12:52:33 +02:00
Hanjun Guo
32c6f3ffa0 ACPI: APD: Clean up header file include statements
Make the included header files appear in alphabetical order and
remove the unnecessary header file inclusions.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-25 12:48:11 +02:00
Hanjun Guo
ee2bc5d2c4 ACPI: APD: Remove unnecessary APD_ADDR() macro stub
Since APD_ADDR(desc) is only used when CONFIG_X86_AMD_PLATFORM_DEVICE
or CONFIG_ARM64 is set, no need for the stub APD_ADDR(desc) definition
covering the other cases, so remove it.

Also update the comments for #endif.

Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-25 12:48:11 +02:00