When xfs_defer_capture extracts the deferred ops and transaction state
from a transaction, it should record the remaining block reservations so
that when we continue the dfops chain, we can reserve the same number of
blocks to use. We capture the reservations for both data and realtime
volumes.
This adds the requirement that every log intent item recovery function
must be careful to reserve enough blocks to handle both itself and all
defer ops that it can queue. On the other hand, this enables us to do
away with the handwaving block estimation nonsense that was going on in
xlog_finish_defer_ops.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
When we replay unfinished intent items that have been recovered from the
log, it's possible that the replay will cause the creation of more
deferred work items. As outlined in commit 509955823c ("xfs: log
recovery should replay deferred ops in order"), later work items have an
implicit ordering dependency on earlier work items. Therefore, recovery
must replay the items (both recovered and created) in the same order
that they would have been during normal operation.
For log recovery, we enforce this ordering by using an empty transaction
to collect deferred ops that get created in the process of recovering a
log intent item to prevent them from being committed before the rest of
the recovered intent items. After we finish committing all the
recovered log items, we allocate a transaction with an enormous block
reservation, splice our huge list of created deferred ops into that
transaction, and commit it, thereby finishing all those ops.
This is /really/ hokey -- it's the one place in XFS where we allow
nested transactions; the splicing of the defer ops list is is inelegant
and has to be done twice per recovery function; and the broken way we
handle inode pointers and block reservations cause subtle use-after-free
and allocator problems that will be fixed by this patch and the two
patches after it.
Therefore, replace the hokey empty transaction with a structure designed
to capture each chain of deferred ops that are created as part of
recovering a single unfinished log intent. Finally, refactor the loop
that replays those chains to do so using one transaction per chain.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
The ->iop_recover method of a log intent item removes the recovered
intent item from the AIL by logging an intent done item and committing
the transaction, so it's superfluous to have this flag check. Nothing
else uses it, so get rid of the flag entirely.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Remove this one-line helper since the assert is trivially true in one
call site and the rest obscures a bitmask operation.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
The ASUS D700SA desktop's audio (1043:2390) with ALC887 cannot detect
the headset microphone and another headphone jack until
ALC887_FIXUP_ASUS_HMIC and ALC887_FIXUP_ASUS_AUDIO quirks are applied.
The NID 0x15 maps as the headset microphone and NID 0x19 maps as another
headphone jack. Also need the function like alc887_fixup_asus_jack to
enable the audio jacks.
Signed-off-by: Jian-Hong Pan <jhp@endlessos.org>
Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20201007052224.22611-1-jhp@endlessos.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
There is nothing to prevent the CPU or the compiler from reordering
the writes to stats->reset_time and stats->reset_pending in
store_reset(), in which case the readers of stats->reset_time may see
a stale value. Moreover, on 32-bit arches the write to reset_time
cannot be completed in one go, so the readers of it may see a
partially updated value in that case.
To prevent that from happening, add a write memory barrier between
the writes to stats->reset_time and stats->reset_pending in
store_reset() and corresponding read memory barrier in the
readers of stats->reset_time.
Fixes: 40c3bd4cfa ("cpufreq: stats: Defer stats update to cpufreq_stats_record_transition()")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Drop a redundant local variable definition from sugov_fast_switch()
and rearrange the code in there to avoid the redundant logical
negation.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Pull NVMe fix from Christoph:
"nvme fix for 5.9:
- fix a recently introduced controller leak (Logan Gunthorpe)"
* tag 'nvme-5.9-2020-10-07' of git://git.infradead.org/nvme:
nvme-core: put ctrl ref when module ref get fail
HP Spectre notebooks (and probably other model as well)
support up to 4 thermal policy:
- HP Recommended
- Performance
- Cool
- Quiet
at least on HP Spectre x360 Convertible 15-df0xxx the firmware sets the
thermal policy to default but hardcode the odvp0 variable to 1, this causes
thermald to choose the wrong DPTF profile witch result in low performance
when notebook is on AC, calling thermal policy write command allow firmware
to correctly set the odvp0 variable.
Signed-off-by: Elia Devito <eliadevito@gmail.com>
Link: https://lore.kernel.org/r/20201004211305.11628-1-eliadevito@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Currently with run_kselftest.sh there is no way to choose which test
we could run. All the tests listed in kselftest-list.txt are all run
every time. This patch enhanced the run_kselftest.sh to make the test
collections (or tests) individually selectable. e.g.:
$ ./run_kselftest.sh -c seccomp -t timers:posix_timers -t timers:nanosleep
Additionally adds a way to list all known tests with "-l", usage
with "-h", and perform a dry run without running tests with "-n".
Co-developed-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Instead of building a script on the fly (which just repeats the same
thing for each test collection), move the script out of the Makefile and
into run_kselftest.sh, which reads kselftest-list.txt.
Adjust the emit_tests target to report each test on a separate line so
that test running tools (e.g. LAVA) can easily remove individual
tests (for example, as seen in [1]).
[1] 2e7b62155e
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
pr_*() printing helpers are preferred over using bare printk().
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The copyright notice alarms checkpatch.pl of usin spaces before tabs.
Fix this.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Move local variables of the same type into a single line for better
readability.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The structure containing the storeme field is allocated using kzalloc().
There's no need to set it to 0 again.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
simple_strtoul() is deprecated. Use kstrtoint().
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Aling the assignment of a static structure's field to be consistent with
all other instances.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Checking pointers for NULL value before passing them to container_of()
is pointless because even if we return NULL from the ternary operator,
none of the users checks the returned value but they instead dereference
it unconditionally. AFAICT this cannot really happen either. Simplify
the code by removing the ternary operators from to_childless() et al.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
There's no need for suplemental newlines in the source file - especially
since the examples are well divided with comments already.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Code samples for configfs don't have an explicit maintainer. Add the
samples directory to the existing configfs entry in MAINTAINERS.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Late patches for 5.10: MTE selftests, minor KCSAN preparation and removal
of some unused prototypes.
(Amit Daniel Kachhap and others)
* for-next/late-arrivals:
arm64: random: Remove no longer needed prototypes
arm64: initialize per-cpu offsets earlier
kselftest/arm64: Check mte tagged user address in kernel
kselftest/arm64: Verify KSM page merge for MTE pages
kselftest/arm64: Verify all different mmap MTE options
kselftest/arm64: Check forked child mte memory accessibility
kselftest/arm64: Verify mte tag inclusion via prctl
kselftest/arm64: Add utilities and a test to validate mte memory
Commit 9bceb80b3c ("arm64: kaslr: Use standard early random
function") removed the direct calls of the __arm64_rndr() and
__early_cpu_has_rndr() functions, but left the dummy prototypes in the
#else branch of the #ifdef CONFIG_ARCH_RANDOM guard.
Remove the redundant prototypes, as they have no users outside of
this header file.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20201006194453.36519-1-andre.przywara@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
1. If we xa_cmpxchg() an entry in, it marks the index as not free.
2. If we xa_cmpxchg() NULL in, it marks the index as free.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
If a bitmap needs to be allocated, and then by the time the thread
is scheduled to be run again all the indices which would satisfy the
allocation have been allocated then we would leak the allocation. Almost
impossible to hit in practice, but a trivial fix. Found by Coverity.
Fixes: f32f004cdd ("ida: Convert to XArray")
Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
It was reported that 'perf stat' crashed when using with armv8_pmu (CPU)
events with the task mode. As 'perf stat' uses an empty cpu map for
task mode but armv8_pmu has its own cpu mask, it has confused which map
it should use when accessing file descriptors and this causes segfaults:
(gdb) bt
#0 0x0000000000603fc8 in perf_evsel__close_fd_cpu (evsel=<optimized out>,
cpu=<optimized out>) at evsel.c:122
#1 perf_evsel__close_cpu (evsel=evsel@entry=0x716e950, cpu=7) at evsel.c:156
#2 0x00000000004d4718 in evlist__close (evlist=0x70a7cb0) at util/evlist.c:1242
#3 0x0000000000453404 in __run_perf_stat (argc=3, argc@entry=1, argv=0x30,
argv@entry=0xfffffaea2f90, run_idx=119, run_idx@entry=1701998435)
at builtin-stat.c:929
#4 0x0000000000455058 in run_perf_stat (run_idx=1701998435, argv=0xfffffaea2f90,
argc=1) at builtin-stat.c:947
#5 cmd_stat (argc=1, argv=0xfffffaea2f90) at builtin-stat.c:2357
#6 0x00000000004bb888 in run_builtin (p=p@entry=0x9764b8 <commands+288>,
argc=argc@entry=4, argv=argv@entry=0xfffffaea2f90) at perf.c:312
#7 0x00000000004bbb54 in handle_internal_command (argc=argc@entry=4,
argv=argv@entry=0xfffffaea2f90) at perf.c:364
#8 0x0000000000435378 in run_argv (argcp=<synthetic pointer>,
argv=<synthetic pointer>) at perf.c:408
#9 main (argc=4, argv=0xfffffaea2f90) at perf.c:538
To fix this, I simply used the given cpu map unless the evsel actually
is not a system-wide event (like uncore events).
Fixes: 7736627b86 ("perf stat: Use affinity for closing file descriptors")
Reported-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Barry Song <song.bao.hua@hisilicon.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20201007081311.1831003-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After some unsuccessful attempts to use sysrq over console, figured
out that port->has_sysrq should likely be enabled, as per other
architectures, this when CONFIG_SERIAL_MCF_CONSOLE is also enabled.
Tested some magic sysrq commands (h, p, t, b), they works now
properly. Commands works inside 5 secs after BREAK is sent, as
expected.
Signed-off-by: Angelo Dureghello <angelo.dureghello@timesys.com>
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Patch here adds a cpumask attr to hv_gpci pmu along with ABI documentation.
Primary use to expose the cpumask is for the perf tool which has the
capability to parse the driver sysfs folder and understand the
cpumask file. Having cpumask file will reduce the number of perf command
line parameters (will avoid "-C" option in the perf tool
command line). It can also notify the user which is
the current cpu used to retrieve the counter data.
command:# cat /sys/devices/hv_gpci/cpumask
0
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201003074943.338618-5-kjain@linux.ibm.com
Patch here adds cpu hotplug functions to hv_gpci pmu.
A new cpuhp_state "CPUHP_AP_PERF_POWERPC_HV_GPCI_ONLINE" enum
is added.
The online callback function updates the cpumask only if its
empty. As the primary intention of adding hotplug support
is to designate a CPU to make HCALL to collect the
counter data.
The offline function test and clear corresponding cpu in a cpumask
and update cpumask to any other active cpu.
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201003074943.338618-4-kjain@linux.ibm.com
Commit 9e9f601084 ("powerpc/perf/{hv-gpci, hv-common}: generate
requests with counters annotated") adds a framework for defining
gpci counters.
In this patch, they adds starting_index value as '0xffffffffffffffff'.
which is wrong as starting_index is of size 32 bits.
Because of this, incase we try to run hv-gpci event we get error.
In power9 machine:
command#: perf stat -e hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
-C 0 -I 1000
event syntax error: '..bie_count_and_time_tlbie_instructions_issued/'
\___ value too big for format, maximum is 4294967295
This patch fix this issue and changes starting_index value to '0xffffffff'
After this patch:
command#: perf stat -e hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/ -C 0 -I 1000
1.000085786 1,024 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
2.000287818 1,024 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
2.439113909 17,408 hv_gpci/system_tlbie_count_and_time_tlbie_instructions_issued/
Fixes: 9e9f601084 ("powerpc/perf/{hv-gpci, hv-common}: generate requests with counters annotated")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201003074943.338618-1-kjain@linux.ibm.com
If the RTAS call to query the PE address for a device fails we jump the
err: label where an error message is printed along with the return code.
However, the printed return code is from the "ret" variable which isn't set
at that point since we assigned the result to "addr" instead. Fix this by
consistently using the "ret" variable for the result of the RTAS call
helpers an dropping the "addr" local variable"
Fixes: 98ba956f6a ("powerpc/pseries/eeh: Rework device EEH PE determination")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201007040903.819081-2-oohall@gmail.com
The device can be connected on SPI or on SDIO. The original file
described the two options separately. So, most of the file had to be
rewritten in order to match with the Yaml requirements.
Some device requirements are still written in the comments since they
cannot been expressed with the current scheme (e.g. reg must be set to 1
with SDIO, interrupt is mandatory with SPI, reset-gpio in SPI is
replaced by mmc-pwrseq in SDIO, etc...).
The examples provided have also been reworked in order to make
dt_binding_check happy.
Finally, also fix typo in the name of the file (siliabs instead of
silabs)
Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20201007101943.749898-7-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The device is in charge of respecting the QoS constraints. The driver
have to ensure that all the queues contain data and the device choose
the right queue to send.
The things starts to be more difficult when the bandwidth of the bus is
lower than the bandwidth of the WiFi. The device quickly sends the
frames of the highest priority queue. Then, it starts to send frames
from a lower priority queue. Though, there are still some high priority
frames waiting in the driver.
To work around this problem, this patch add some priorities to each
queue. The weigh of the queue was (roughly) calculated experimentally by
checking the speed ratio of each queue when the bus does not limit the
traffic:
- Be/Bk -> 20Mbps/10Mbps
- Vi/Be -> 36Mbps/180Kbps
- Vo/Be -> 35Mbps/600Kbps
- Vi/Vo -> 24Mbps/12Mbps
So, if we fix the weigh of the Background to 1, the weight of Best
Effort should be 2. The weight of Video should be 116. However, since
there is only 32 queues, it make no sense to use a value greater than
64[1]. And finally, the weight of the Voice is set to 128.
[1] Because of this approximation, with very slow bus, we can still
observe frame starvation when we measure the speed ratio of Vi/Be. It is
around 35Mbps/1Mbps (instead of 36Mbps/180Kbps). However, it is still in
accepted error range.
Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20201007101943.749898-5-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Firmwares with API < 3.6 do not forward DELBA requests. Thus, when a
Block Ack session is restarted, the reordering buffer is not flushed and
the received sequence number is not contiguous. Therefore, mac80211
starts to wait some missing frames that it will never receive.
This patch disables the reordering buffer for old firmware. It is
harmless when the network is unencrypted. When the network is encrypted,
the non-contiguous frames will be thrown away by the TKIP/CCMP replay
protection. So, the user will observe some packet loss with UDP and
performance drop with TCP.
Fixes: e5da5fbd77 ("staging: wfx: fix CCMP/TKIP replay protection")
Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20201007101943.749898-4-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As expected, when the device detect a MMIC error, it returns a specific
status. However, it also strip IV from the frame (don't ask me why).
So, with the current code, mac80211 detects a corrupted frame and it
drops it before it handle the MMIC error. The expected behavior would be
to detect MMIC error then to renegotiate the EAP session.
So, this patch correctly informs mac80211 that IV is not available. So,
mac80211 correctly takes into account the MMIC error.
Signed-off-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Link: https://lore.kernel.org/r/20201007101943.749898-2-Jerome.Pouiller@silabs.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 8d875f95da ("btrfs: disable strict file flushes for
renames and truncates") eliminated the notion of ordered operations and
instead BTRFS_INODE_ORDERED_DATA_CLOSE only remained as a flag
indicating that a file's content should be synced to disk in case a
file is truncated and any writes happen to it concurrently. In fact
this intendend behavior was broken until it was fixed in
f6dc45c7a9 ("Btrfs: fix filemap_flush call in btrfs_file_release").
All things considered let's give the flag a more descriptive name. Also
slightly reword comments.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch fixes the following sparse errors in
fs/btrfs/super.c in function btrfs_show_devname()
fs/btrfs/super.c: error: incompatible types in comparison expression (different address spaces):
fs/btrfs/super.c: struct rcu_string [noderef] <asn:4> *
fs/btrfs/super.c: struct rcu_string *
The error was because of the following line in function btrfs_show_devname():
if (first_dev)
seq_escape(m, rcu_str_deref(first_dev->name), " \t\n\\");
Annotating the btrfs_device::name member with __rcu fixes the sparse
error.
Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik04@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Many things can happen after the device is scanned and before the device
is mounted. One such thing is losing the BTRFS_MAGIC on the device.
If it happens we still won't free that device from the memory and cause
the userland confusion.
For example: As the BTRFS_IOC_DEV_INFO still carries the device path
which does not have the BTRFS_MAGIC, 'btrfs fi show' still lists
device which does not belong to the filesystem anymore:
$ mkfs.btrfs -fq -draid1 -mraid1 /dev/sda /dev/sdb
$ wipefs -a /dev/sdb
# /dev/sdb does not contain magic signature
$ mount -o degraded /dev/sda /btrfs
$ btrfs fi show -m
Label: none uuid: 470ec6fb-646b-4464-b3cb-df1b26c527bd
Total devices 2 FS bytes used 128.00KiB
devid 1 size 3.00GiB used 571.19MiB path /dev/sda
devid 2 size 3.00GiB used 571.19MiB path /dev/sdb
We need to distinguish the missing signature and invalid superblock, so
add a specific error code ENODATA for that. This also fixes failure of
fstest btrfs/198.
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>