Commit c1f6925e10 ("mm: put readahead pages in cache earlier") causes
the read performance of squashfs to deteriorate.Through testing, we find
that the performance will be back by closing the readahead of squashfs.
So we want to learn the way of ubifs, provides backing_dev_info and
disable read-ahead
We tested the following data by fio.
squashfs image blocksize=128K
test command:
fio --name basic --bs=? --filename="/mnt/test_file" --rw=? --iodepth=1 --ioengine=psync --runtime=200 --time_based
turn on squashfs readahead in 5.10 kernel
bs(k) read/randread MB/s
4 randread 271
128 randread 231
1024 randread 246
4 read 310
128 read 245
1024 read 247
turn off squashfs readahead in 5.10 kernel
bs(k) read/randread MB/s
4 randread 293
128 randread 330
1024 randread 363
4 read 338
128 read 360
1024 read 365
turn on squashfs readahead and revert the
commit c1f6925e1091("mm: put readahead
pages in cache earlier") in 5.10 kernel
bs(k) read/randread MB/s
4 randread 289
128 randread 306
1024 randread 335
4 read 337
128 read 336
1024 read 338
Link: https://lkml.kernel.org/r/20211116113141.1391026-1-zhengliang6@huawei.com
Signed-off-by: Zheng Liang <zhengliang6@huawei.com>
Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk>
Cc: Zhang Yi <yi.zhang@huawei.com>
Cc: Hou Tao <houtao1@huawei.com>
Cc: Miao Xie <miaoxie@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The comments for the file should not be in kernel-doc format:
/**
* attrib.c - NTFS attribute operations. Part of the Linux-NTFS
as it causes it to be incorrectly identified for function
ntfs_map_runlist_nolock(), causing some warnings found by running
scripts/kernel-doc.:
fs/ntfs/attrib.c:25: warning: Incorrect use of kernel-doc format: * ntfs_map_runlist_nolock - map (a part of) a runlist of an ntfs inode
fs/ntfs/attrib.c:71: warning: Function parameter or member 'ni' not described in 'ntfs_map_runlist_nolock'
fs/ntfs/attrib.c:71: warning: Function parameter or member 'vcn' not described in 'ntfs_map_runlist_nolock'
fs/ntfs/attrib.c:71: warning: Function parameter or member 'ctx' not described in 'ntfs_map_runlist_nolock'
fs/ntfs/attrib.c:71: warning: expecting prototype for attrib.c - NTFS attribute operations. Part of the Linux(). Prototype was for ntfs_map_runlist_nolock() instead
Link: https://lkml.kernel.org/r/20220106015145.67067-1-yang.lee@linux.alibaba.com
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Anton Altaparmakov <anton@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull input fix from Dmitry Torokhov:
"A small fixup to the Zinitix touchscreen driver to avoid enabling the
IRQ line before we successfully requested it"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: zinitix - make sure the IRQ is allocated before it gets enabled
Pull ARM SoC fix from Olof Johansson:
"One more fix for 5.16
I had missed one patch when I sent up what I thought was the last
batch of fixes for this release. This one fixes issues on the
Raspberry Pi platforms due to gpio init changes this release, so
hopefully we can get it merged before final release is cut"
* tag 'soc-fixes-5.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: dts: gpio-ranges property is now required
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Revert "libtraceevent: Increase libtraceevent logging when verbose",
breaks the build with libtraceevent-1.3.0, i.e. when building with
'LIBTRACEEVENT_DYNAMIC=1'.
- Avoid early exit in 'perf trace' due to running SIGCHLD handler
before it makes sense to. It can happen when using a BPF source code
event that have to be first built into an object file.
* tag 'perf-tools-fixes-for-v5.16-2022-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
Revert "libtraceevent: Increase libtraceevent logging when verbose"
perf trace: Avoid early exit due to running SIGCHLD handler before it makes sense to
Since irq request is the last thing in the driver probe, it happens
later than the input device registration. This means that there is a
small time window where if the open method is called the driver will
attempt to enable not yet available irq.
Fix that by moving the irq request before the input device registration.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Fixes: 26822652c8 ("Input: add zinitix touchscreen driver")
Signed-off-by: Nikita Travkin <nikita@trvn.ru>
Link: https://lore.kernel.org/r/20220106072840.36851-2-nikita@trvn.ru
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Pull ARM SoC fixes from Olof Johansson:
"A few more fixes have come in, nothing overly severe but would be good
to get in by final release:
- More specific compatible fields on the qspi controller for socfpga,
to enable quirks in the driver
- A runtime PM fix for Renesas to fix mismatched reference counts on
errors"
* tag 'soc-fixes-5.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: dts: socfpga: change qspi to "intel,socfpga-qspi"
dt-bindings: spi: cadence-quadspi: document "intel,socfpga-qspi"
reset: renesas: Fix Runtime PM usage
Pull i2c fixes from Wolfram Sang:
"Fix the regression with AMD GPU suspend by reverting the
handling of bus regulators in the I2C core.
Also, there is a fix for the MPC driver to prevent an
out-of-bound-access"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
Revert "i2c: core: support bus regulator controlling in adapter"
i2c: mpc: Avoid out of bounds memory access
Pull power supply fixes from Sebastian Reichel:
"Three fixes for the 5.16 cycle:
- Avoid going beyond last capacity in the power-supply core
- Replace 1E6L with NSEC_PER_MSEC to avoid floating point calculation
in LLVM resulting in a build failure
- Fix ADC measurements in bq25890 charger driver"
* tag 'for-v5.16-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
power: reset: ltc2952: Fix use of floating point literals
power: bq25890: Enable continuous conversion for ADC at charging
power: supply: core: Break capacity loop
Pull xfs fix from Darrick Wong:
- Make the old ALLOCSP ioctl behave in a consistent manner with newer
syscalls like fallocate.
* tag 'xfs-5.16-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: map unwritten blocks in XFS_IOC_{ALLOC,FREE}SP just like fallocate
Pull cgroup fixes from Tejun Heo:
"This contains the cgroup.procs permission check fixes so that they use
the credentials at the time of open rather than write, which also
fixes the cgroup namespace lifetime bug"
* 'for-5.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
selftests: cgroup: Test open-time cgroup namespace usage for migration checks
selftests: cgroup: Test open-time credential usage for migration checks
selftests: cgroup: Make cg_create() use 0755 for permission instead of 0644
cgroup: Use open-time cgroup namespace for process migration perm checks
cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv
cgroup: Use open-time credentials for process migraton perm checks
Pull EDAC fix from Tony Luck:
"Fix 10nm EDAC driver to release and unmap resources on systems without
HBM"
* tag 'edac_urgent_for_v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/i10nm: Release mdev/mbase when failing to detect HBM
This reverts commit 08efcb4a63.
This breaks the build as it will prefer using libbpf-devel header files,
even when not using LIBBPF_DYNAMIC=1, breaking the build.
This was detected on OpenSuSE Tumbleweed with libtraceevent-devel 1.3.0,
as described by Jiri Slaby:
=======================================================================
It breaks build with LIBTRACEEVENT_DYNAMIC and version 1.3.0:
> util/debug.c: In function ‘perf_debug_option’:
> util/debug.c:243:17: error: implicit declaration of function
‘tep_set_loglevel’ [-Werror=implicit-function-declaration]
> 243 | tep_set_loglevel(TEP_LOG_INFO);
> | ^~~~~~~~~~~~~~~~
> util/debug.c:243:34: error: ‘TEP_LOG_INFO’ undeclared (first use in this
function); did you mean ‘TEP_PRINT_INFO’?
> 243 | tep_set_loglevel(TEP_LOG_INFO);
> | ^~~~~~~~~~~~
> | TEP_PRINT_INFO
> util/debug.c:243:34: note: each undeclared identifier is reported only once
for each function it appears in
> util/debug.c:245:34: error: ‘TEP_LOG_DEBUG’ undeclared (first use in this
function)
> 245 | tep_set_loglevel(TEP_LOG_DEBUG);
> | ^~~~~~~~~~~~~
> util/debug.c:247:34: error: ‘TEP_LOG_ALL’ undeclared (first use in this
function)
> 247 | tep_set_loglevel(TEP_LOG_ALL);
> | ^~~~~~~~~~~
It is because the gcc's command line looks like:
gcc
...
-I/home/abuild/rpmbuild/BUILD/tools/lib/
...
-DLIBTRACEEVENT_VERSION=65790
...
=======================================================================
The proper way to fix this is more involved and so not suitable for this
late in the 5.16-rc stage.
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/lkml/bc2b0786-8965-1bcd-2316-9d9bb37b9c31@kernel.org
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/lkml/YddGjjmlMZzxUZbN@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull kvm fixes from Paolo Bonzini:
"Two small fixes for x86:
- lockdep WARN due to missing lock nesting annotation
- NULL pointer dereference when accessing debugfs"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Check for rmaps allocation
KVM: SEV: Mark nested locking of kvm->lock
Pull drm fixes from Dave Airlie:
"There is only the amdgpu runtime pm regression fix in here:
amdgpu:
- suspend/resume fix
- fix runtime PM regression"
* tag 'drm-fixes-2022-01-07' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: disable runpm if we are the primary adapter
fbdev: fbmem: add a helper to determine if an aperture is used by a fw fb
drm/amd/pm: keep the BACO feature enabled for suspend
Pull rdma fixes from Jason Gunthorpe:
"Last pull for 5.16, the reversion has been known for a while now but
didn't get a proper fix in time. Looks like we will have several
info-leak bugs to take care of going foward.
- Revert the patch fixing the DM related crash causing a widespread
regression for kernel ULPs. A proper fix just didn't appear this
cycle due to the holidays
- Missing NULL check on alloc in uverbs
- Double free in rxe error paths
- Fix a new kernel-infoleak report when forming ah_attr's without
GRH's in ucma"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/core: Don't infoleak GRH fields
RDMA/uverbs: Check for null return of kmalloc_array
Revert "RDMA/mlx5: Fix releasing unallocated memory in dereg MR flow"
RDMA/rxe: Prevent double freeing rxe_map_set()
Pull tracing fixes from Steven Rostedt:
"Three minor tracing fixes:
- Fix missing prototypes in sample module for direct functions
- Fix check of valid buffer in get_trace_buf()
- Fix annotations of percpu pointers"
* tag 'trace-v5.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Tag trace_percpu_buffer as a percpu pointer
tracing: Fix check for trace_percpu_buffer validity in get_trace_buf()
ftrace/samples: Add missing prototypes direct functions
When a task is writing to an fd opened by a different task, the perm check
should use the cgroup namespace of the latter task. Add a test for it.
Tested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
When a task is writing to an fd opened by a different task, the perm check
should use the credentials of the latter task. Add a test for it.
Tested-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
0644 is an odd perm to create a cgroup which is a directory. Use the regular
0755 instead. This is necessary for euid switching test case.
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's cgroup namespace which is
a potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.
This patch makes cgroup remember the cgroup namespace at the time of open
and uses it for migration permission checks instad of current's. Note that
this only applies to cgroup2 as cgroup1 doesn't have namespace support.
This also fixes a use-after-free bug on cgroupns reported in
https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com
Note that backporting this fix also requires the preceding patch.
Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Reported-by: syzbot+50f5cf33a284ce738b62@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/00000000000048c15c05d0083397@google.com
Fixes: 5136f6365c ("cgroup: implement "nsdelegate" mount option")
Signed-off-by: Tejun Heo <tj@kernel.org>
of->priv is currently used by each interface file implementation to store
private information. This patch collects the current two private data usages
into struct cgroup_file_ctx which is allocated and freed by the common path.
This allows generic private data which applies to multiple files, which will
be used to in the following patch.
Note that cgroup_procs iterator is now embedded as procs.iter in the new
cgroup_file_ctx so that it doesn't need to be allocated and freed
separately.
v2: union dropped from cgroup_file_ctx and the procs iterator is embedded in
cgroup_file_ctx as suggested by Linus.
v3: Michal pointed out that cgroup1's procs pidlist uses of->priv too.
Converted. Didn't change to embedded allocation as cgroup1 pidlists get
stored for caching.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's credentials which is a
potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.
This patch makes both cgroup2 and cgroup1 process migration interfaces to
use the credentials saved at the time of open (file->f_cred) instead of
current's.
Reported-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Fixes: 187fe84067 ("cgroup: require write perm on common ancestor when moving processes on the default hierarchy")
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>