The slub_debug is able to fix the corrupted slab freelist/page.
However, alloc_debug_processing() only checks the validity of current
and next freepointer during allocation path. As a result, once some
objects have their freepointers corrupted, deactivate_slab() may lead to
page fault.
Below is from a test kernel module when 'slub_debug=PUF,kmalloc-128
slub_nomerge'. The test kernel corrupts the freepointer of one free
object on purpose. Unfortunately, deactivate_slab() does not detect it
when iterating the freechain.
BUG: unable to handle page fault for address: 00000000123456f8
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
... ...
RIP: 0010:deactivate_slab.isra.92+0xed/0x490
... ...
Call Trace:
___slab_alloc+0x536/0x570
__slab_alloc+0x17/0x30
__kmalloc+0x1d9/0x200
ext4_htree_store_dirent+0x30/0xf0
htree_dirblock_to_tree+0xcb/0x1c0
ext4_htree_fill_tree+0x1bc/0x2d0
ext4_readdir+0x54f/0x920
iterate_dir+0x88/0x190
__x64_sys_getdents+0xa6/0x140
do_syscall_64+0x49/0x170
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Therefore, this patch adds extra consistency check in deactivate_slab().
Once an object's freepointer is corrupted, all following objects
starting at this object are isolated.
[akpm@linux-foundation.org: fix build with CONFIG_SLAB_DEBUG=n]
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Jin <joe.jin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Link: http://lkml.kernel.org/r/20200331031450.12182-1-dongli.zhang@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When syncing out a block device (a'la __sync_blockdev), any error
encountered will only be recorded in the bd_inode's mapping. When the
blockdev contains a filesystem however, we'd like to also record the
error in the super_block that's stored there.
Make mark_buffer_write_io_error also record the error in the
corresponding super_block when a writeback error occurs and the block
device contains a mounted superblock.
Since superblocks are RCU freed, hold the rcu_read_lock to ensure that
the superblock doesn't go away while we're marking it.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andres Freund <andres@anarazel.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Link: http://lkml.kernel.org/r/20200428135155.19223-3-jlayton@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "vfs: have syncfs() return error when there are writeback
errors", v6.
Currently, syncfs does not return errors when one of the inodes fails to
be written back. It will return errors based on the legacy AS_EIO and
AS_ENOSPC flags when syncing out the block device fails, but that's not
particularly helpful for filesystems that aren't backed by a blockdev.
It's also possible for a stray sync to lose those errors.
The basic idea in this set is to track writeback errors at the
superblock level, so that we can quickly and easily check whether
something bad happened without having to fsync each file individually.
syncfs is then changed to reliably report writeback errors after they
occur, much in the same fashion as fsync does now.
This patch (of 2):
Usually we suggest that applications call fsync when they want to ensure
that all data written to the file has made it to the backing store, but
that can be inefficient when there are a lot of open files.
Calling syncfs on the filesystem can be more efficient in some
situations, but the error reporting doesn't currently work the way most
people expect. If a single inode on a filesystem reports a writeback
error, syncfs won't necessarily return an error. syncfs only returns an
error if __sync_blockdev fails, and on some filesystems that's a no-op.
It would be better if syncfs reported an error if there were any
writeback failures. Then applications could call syncfs to see if there
are any errors on any open files, and could then call fsync on all of
the other descriptors to figure out which one failed.
This patch adds a new errseq_t to struct super_block, and has
mapping_set_error also record writeback errors there.
To report those errors, we also need to keep an errseq_t in struct file
to act as a cursor. This patch adds a dedicated field for that purpose,
which slots nicely into 4 bytes of padding at the end of struct file on
x86_64.
An earlier version of this patch used an O_PATH file descriptor to cue
the kernel that the open file should track the superblock error and not
the inode's writeback error.
I think that API is just too weird though. This is simpler and should
make syncfs error reporting "just work" even if someone is multiplexing
fsync and syncfs on the same fds.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Andres Freund <andres@anarazel.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: David Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/r/20200428135155.19223-1-jlayton@kernel.org
Link: http://lkml.kernel.org/r/20200428135155.19223-2-jlayton@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parisc's set_pte_at() macro has set-but-not-used variable:
include/linux/pgtable.h: In function 'pte_clear_not_present_full':
arch/parisc/include/asm/pgtable.h:96:9: warning: variable 'old_pte' set but not used [-Wunused-but-set-variable]
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Usually we create and use a ocfs2 shared volume on the top of ha stack.
For pcmk based ha stack, which includes DLM, corosync and pacemaker
services.
The customers complained they could not mount existent ocfs2 volume in
the single node without ha stack, e.g. single node backup/restore
scenario.
Like this case, the customers just want to access the data from the
existent ocfs2 volume quickly, but do not want to restart or setup ha
stack.
Then, I'd like to add a mount option "nocluster", if the users use this
option to mount a ocfs2 shared volume, the whole mount will not depend
on the ha related services. the command will mount the existent ocfs2
volume directly (like local mount), for avoiding setup the ha stack.
Signed-off-by: Gang He <ghe@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Link: http://lkml.kernel.org/r/20200423053300.22661-1-ghe@suse.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Before this patch, a simple typo accidentally added \n to the jid=
string for lock_nolock mounts. This made it impossible to mount a
gfs2 file system with a journal other than journal0. Thus:
mount -tgfs2 -o hostdata="jid=1" <device> <mount pt>
Resulted in:
mount: wrong fs type, bad option, bad superblock on <device>
In most cases this is not a problem. However, for debugging and
testing purposes we sometimes want to test the integrity of other
journals. This patch removes the unnecessary \n and thus allows
lock_nolock users to specify an alternate journal.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Before for this patch, function inode_go_sync ignored io errors
during inode_go_sync, overwriting them with metadata write errors:
error = filemap_fdatawait(mapping);
mapping_set_error(mapping, error);
}
error = filemap_fdatawait(metamapping);
...
return error;
So any errors returned by the inode write would be forgotten if the
metadata write succeeded. This patch still does both writes, but
only sets error if it's still zero. That way, any errors will be
reported by to the caller, do_xmote, which will take appropriate
action and report the error.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
- Add a SPDX header;
- Adjust document and section titles;
- Some whitespace fixes and new line breaks;
- Add table markups;
- Use notes markups;
- Add it to filesystems/index.rst.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Doing a "get_user_pages()" on a copy-on-write page for reading can be
ambiguous: the page can be COW'ed at any time afterwards, and the
direction of a COW event isn't defined.
Yes, whoever writes to it will generally do the COW, but if the thread
that did the get_user_pages() unmapped the page before the write (and
that could happen due to memory pressure in addition to any outright
action), the writer could also just take over the old page instead.
End result: the get_user_pages() call might result in a page pointer
that is no longer associated with the original VM, and is associated
with - and controlled by - another VM having taken it over instead.
So when doing a get_user_pages() on a COW mapping, the only really safe
thing to do would be to break the COW when getting the page, even when
only getting it for reading.
At the same time, some users simply don't even care.
For example, the perf code wants to look up the page not because it
cares about the page, but because the code simply wants to look up the
physical address of the access for informational purposes, and doesn't
really care about races when a page might be unmapped and remapped
elsewhere.
This adds logic to force a COW event by setting FOLL_WRITE on any
copy-on-write mapping when FOLL_GET (or FOLL_PIN) is used to get a page
pointer as a result.
The current semantics end up being:
- __get_user_pages_fast(): no change. If you don't ask for a write,
you won't break COW. You'd better know what you're doing.
- get_user_pages_fast(): the fast-case "look it up in the page tables
without anything getting mmap_sem" now refuses to follow a read-only
page, since it might need COW breaking. Which happens in the slow
path - the fast path doesn't know if the memory might be COW or not.
- get_user_pages() (including the slow-path fallback for gup_fast()):
for a COW mapping, turn on FOLL_WRITE for FOLL_GET/FOLL_PIN, with
very similar semantics to FOLL_FORCE.
If it turns out that we want finer granularity (ie "only break COW when
it might actually matter" - things like the zero page are special and
don't need to be broken) we might need to push these semantics deeper
into the lookup fault path. So if people care enough, it's possible
that we might end up adding a new internal FOLL_BREAK_COW flag to go
with the internal FOLL_COW flag we already have for tracking "I had a
COW".
Alternatively, if it turns out that different callers might want to
explicitly control the forced COW break behavior, we might even want to
make such a flag visible to the users of get_user_pages() instead of
using the above default semantics.
But for now, this is mostly commentary on the issue (this commit message
being a lot bigger than the patch, and that patch in turn is almost all
comments), with that minimal "enable COW breaking early" logic using the
existing FOLL_WRITE behavior.
[ It might be worth noting that we've always had this ambiguity, and it
could arguably be seen as a user-space issue.
You only get private COW mappings that could break either way in
situations where user space is doing cooperative things (ie fork()
before an execve() etc), but it _is_ surprising and very subtle, and
fork() is supposed to give you independent address spaces.
So let's treat this as a kernel issue and make the semantics of
get_user_pages() easier to understand. Note that obviously a true
shared mapping will still get a page that can change under us, so this
does _not_ mean that get_user_pages() somehow returns any "stable"
page ]
Reported-by: Jann Horn <jannh@google.com>
Tested-by: Christoph Hellwig <hch@lst.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kirill Shutemov <kirill@shutemov.name>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All Intel platforms guarantee that all root complex implementations must
send transactions up to IOMMU for address translations. Hence for Intel
RCiEP devices, we can assume some ACS-type isolation even without an ACS
capability.
From the Intel VT-d spec, r3.1, sec 3.16 ("Root-Complex Peer to Peer
Considerations"):
When DMA remapping is enabled, peer-to-peer requests through the
Root-Complex must be handled as follows:
- The input address in the request is translated (through first-level,
second-level or nested translation) to a host physical address (HPA).
The address decoding for peer addresses must be done only on the
translated HPA. Hardware implementations are free to further limit
peer-to-peer accesses to specific host physical address regions (or
to completely disallow peer-forwarding of translated requests).
- Since address translation changes the contents (address field) of
the PCI Express Transaction Layer Packet (TLP), for PCI Express
peer-to-peer requests with ECRC, the Root-Complex hardware must use
the new ECRC (re-computed with the translated address) if it
decides to forward the TLP as a peer request.
- Root-ports, and multi-function root-complex integrated endpoints, may
support additional peer-to-peer control features by supporting PCI
Express Access Control Services (ACS) capability. Refer to ACS
capability in PCI Express specifications for details.
Since Linux didn't give special treatment to allow this exception, certain
RCiEP MFD devices were grouped in a single IOMMU group. This doesn't permit
a single device to be assigned to a guest for instance.
In one vendor system: Device 14.x were grouped in a single IOMMU group.
/sys/kernel/iommu_groups/5/devices/0000:00:14.0
/sys/kernel/iommu_groups/5/devices/0000:00:14.2
/sys/kernel/iommu_groups/5/devices/0000:00:14.3
After this patch:
/sys/kernel/iommu_groups/5/devices/0000:00:14.0
/sys/kernel/iommu_groups/5/devices/0000:00:14.2
/sys/kernel/iommu_groups/6/devices/0000:00:14.3 <<< new group
14.0 and 14.2 are integrated devices, but legacy end points, whereas 14.3
was a PCIe-compliant RCiEP.
00:14.3 Network controller: Intel Corporation Device 9df0 (rev 30)
Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00
This permits assigning this device to a guest VM.
[bhelgaas: drop "Fixes" tag since this doesn't fix a bug in that commit]
Link: https://lore.kernel.org/r/1590699462-7131-1-git-send-email-ashok.raj@intel.com
Tested-by: Darrel Goeddel <dgoeddel@forcepoint.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Mark Scott <mscott@forcepoint.com>,
Cc: Romil Sharma <rsharma@forcepoint.com>
There is one more regression introduced by the last build fix:
arch/arm/mach-omap2/timer.c:170:6: error: attribute declaration must precede definition [-Werror,-Wignored-attributes]
void __init omap5_realtime_timer_init(void)
^
arch/arm/mach-omap2/common.h:118:20: note: previous definition is here
static inline void omap5_realtime_timer_init(void)
^
arch/arm/mach-omap2/timer.c:170:13: error: redefinition of 'omap5_realtime_timer_init'
void __init omap5_realtime_timer_init(void)
^
arch/arm/mach-omap2/common.h:118:20: note: previous definition is here
static inline void omap5_realtime_timer_init(void)
Address this by removing the now obsolete #ifdefs in that file and
just building the entire file based on the flag that controls the
omap5_realtime_timer_init function declaration.
Link: https://lore.kernel.org/r/20200529201701.521933-1-arnd@arndb.de
Fixes: d86ad463d6 ("ARM: OMAP2+: Fix regression for using local timer on non-SMP SoCs")
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
ARM: Keystone DTS updates for 5.7
Add display support for K2G EVM Board
* tag 'keystone_dts_for_5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone:
ARM: dts: keystone-k2g-evm: add HDMI video support
ARM: dts: keystone-k2g: Add DSS node
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fix config file to require CONFIG_TEST_SYSCTL=m instead of y
because this driver introduces a test sysctl interfaces which
are normally not used, and only used for the selftest.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Fix to load test_sysctl.ko module correctly.
sysctl.sh checks whether the test module is embedded (or loaded
already) or not at first, and if not, it returns skip error
instead of trying modprobe. Thus, there is no chance to load the
test_sysctl test module.
Instead, this removes that module embedded check and returns
skip error only if it ensures that there is no embedded test
module *and* no loadable test module.
This also avoid referring config file since that is not
installed.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
We can currently sometimes get "RXS timed out" errors and "EOT timed out"
errors with spi transfers.
These errors can be made easy to reproduce by reading the cpcap iio
values in a loop while keeping the CPUs busy by also reading /dev/urandom.
The "RXS timed out" errors we can fix by adding spi-cpol and spi-cpha
in addition to the spi-cs-high property we already have.
The "EOT timed out" errors we can fix by increasing the spi clock rate
to 9.6 MHz. Looks similar MC13783 PMIC says it works at spi clock rates
up to 20 MHz, so let's assume we can pick any rate up to 20 MHz also
for cpcap.
Cc: maemo-leste@lists.dyne.org
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
test_sysctl.c is expected to be used as a module, but since
it does not use module_init(), it never be registered as
a module and not appeared under /sys/module/.
In the result, the selftests/sysctl/sysctl.sh always fails
to find the test module and is skipped.
This makes test_sysctl.c initialized as a module by module_init()
and allow sysctl.sh to find the test module is loaded.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Make prime number generator independently selectable from
kconfig. This allows us to enable CONFIG_PRIME_NUMBERS=m
and run the tools/testing/selftests/lib/prime_numbers.sh
without other DRM selftest modules.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
A while back, I noticed that the crypto and crypto API usage in big_keys
were entirely broken in multiple ways, so I rewrote it. Now, I'm
rewriting it again, but this time using the simpler ChaCha20Poly1305
library function. This makes the file considerably more simple; the
diffstat alone should justify this commit. It also should be faster,
since it no longer requires a mutex around the "aead api object" (nor
allocations), allowing us to encrypt multiple items in parallel. We also
benefit from being able to pass any type of pointer, so we can get rid
of the ridiculously complex custom page allocator that big_key really
doesn't need.
[DH: Change the select CRYPTO_LIB_CHACHA20POLY1305 to a depends on as
select doesn't propagate and the build can end up with an =y dependending
on some =m pieces.
The depends on CRYPTO also had to be removed otherwise the configurator
complains about a recursive dependency.]
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: kernel-hardening@lists.openwall.com
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David Howells <dhowells@redhat.com>
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
We forget to call locks_move_blocks in posix_lock_inode when try to
process same owner and different types.
Signed-off-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Using "%zu" to fix following warning,
kernel/rcu/rcuperf.c: In function ‘kfree_perf_init’:
include/linux/kern_levels.h:5:18: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘unsigned int’ [-Wformat=]
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The commit is a nice cleanup, but breaks booting on exynos5 based
chromebooks. It's seems to come down to exynos5's i2c driver not
implementing I2C_FUNC_SMBUS_READ_BLOCK_DATA. It's not yet clear
why that breaks boot / massively slows it down when userspace
starts, so revert the problematic patch.
This reverts commit c4b12a2f3f.
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
This depends on the simplification of sbs_read_string_data, which
breaks booting exynos5 based chromebooks. More investigation is
required, so this patch and the simplification patch are reverted
for this merge window.
Note, that this is only a partial revert, since sbs_update_presence()
has not been removed. It is also required for the charger broadcast
disabling.
This reverts commit 79bcd5a4a6.
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
This commit moves channel picking code in separate function.
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
MS-SMB2 specification was updated in March. Make minor additions
and corrections to compression related definitions in smb2pdu.h
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Currently, 'KDBFLAGS' is an internal variable of kdb, it is combined
by 'KDBDEBUG' and state flags. It will be shown only when 'KDBDEBUG'
is set, and the user can define an environment variable named 'KDBFLAGS'
too. These are puzzling indeed.
After communication with Daniel, it seems that 'KDBFLAGS' is a misfeature.
So let's replace 'KDBFLAGS' with 'KDBDEBUG' to just show the value we
wrote into. After this modification, we can use `md4c1 kdb_flags` instead,
to observe the state flags.
Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Wei Li <liwei391@huawei.com>
Link: https://lore.kernel.org/r/20200521072125.21103-1-liwei391@huawei.com
[daniel.thompson@linaro.org: Make kdb_flags unsigned to avoid arithmetic
right shift]
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Currently there is no guarantee that an earlycon will be initialized
before kgdboc tries to adopt it. Almost the opposite: on systems
with ACPI then if earlycon has no arguments then it is guaranteed that
earlycon will not be initialized.
This patch mitigates the problem by giving kgdboc_earlycon a second
chance during console_init(). This isn't quite as good as stopping during
early parameter parsing but it is still early in the kernel boot.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200430161741.1832050-1-daniel.thompson@linaro.org
Reviewed-by: Douglas Anderson <dianders@chromium.org>
The recent patch ("kgdboc: Add kgdboc_earlycon to support early kgdb
using boot consoles") adds a new kernel command line parameter.
Document it.
Note that the patch adding the feature does some comparing/contrasting
of "kgdboc_earlycon" vs. the existing "ekgdboc". See that patch for
more details, but briefly "ekgdboc" can be used _instead_ of "kgdboc"
and just makes "kgdboc" do its normal initialization early (only works
if your tty driver is already ready). The new "kgdboc_earlycon" works
in combination with "kgdboc" and is backed by boot consoles.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Link: https://lore.kernel.org/r/20200507130644.v4.9.I7d5eb42c6180c831d47aef1af44d0b8be3fac559@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
When I combined kgdboc_earlycon with an inflight patch titled ("soc:
qcom-geni-se: Add interconnect support to fix earlycon crash") [1]
things went boom. Specifically I got a crash during the transition
between kgdboc_earlycon and the main kgdboc that looked like this:
Call trace:
__schedule_bug+0x68/0x6c
__schedule+0x75c/0x924
schedule+0x8c/0xbc
schedule_timeout+0x9c/0xfc
do_wait_for_common+0xd0/0x160
wait_for_completion_timeout+0x54/0x74
rpmh_write_batch+0x1fc/0x23c
qcom_icc_bcm_voter_commit+0x1b4/0x388
qcom_icc_set+0x2c/0x3c
apply_constraints+0x5c/0x98
icc_set_bw+0x204/0x3bc
icc_put+0x30/0xf8
geni_remove_earlycon_icc_vote+0x6c/0x9c
qcom_geni_serial_earlycon_exit+0x10/0x1c
kgdboc_earlycon_deinit+0x38/0x58
kgdb_register_io_module+0x11c/0x194
configure_kgdboc+0x108/0x174
kgdboc_probe+0x38/0x60
platform_drv_probe+0x90/0xb0
really_probe+0x130/0x2fc
...
The problem was that we were holding the "kgdb_registration_lock"
while calling into code that didn't expect to be called in spinlock
context.
Let's slightly defer when we call the deinit code so that it's not
done under spinlock.
NOTE: this does mean that the "deinit" call of the old kgdb IO module
is now made _after_ the init of the new IO module, but presumably
that's OK.
[1] https://lkml.kernel.org/r/1588919619-21355-3-git-send-email-akashast@codeaurora.org
Fixes: 220995622d ("kgdboc: Add kgdboc_earlycon to support early kgdb using boot consoles")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200526142001.1.I523dc33f96589cb9956f5679976d402c8cda36fa@changeid
[daniel.thompson@linaro.org: Resolved merge issues by hand]
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
When kgdboc is compiled as a module all of the "ekgdboc" and
"kgdb_earlycon" code isn't useful and, in fact, breaks compilation.
This is because early_param() isn't defined for modules and that's how
this code gets configured.
It turns out that this was broken by commit eae3e19ca9 ("kgdboc:
Remove useless #ifdef CONFIG_KGDB_SERIAL_CONSOLE in kgdboc") and then
made worse by commit 220995622d ("kgdboc: Add kgdboc_earlycon to
support early kgdb using boot consoles"). I guess the #ifdef wasn't
so useless, even if it wasn't obvious why it was useful. When kgdboc
was compiled as a module only "CONFIG_KGDB_SERIAL_CONSOLE_MODULE" was
defined, not "CONFIG_KGDB_SERIAL_CONSOLE". That meant that the old
module.
Let's basically do the same thing that the old code (pre-removal of
the #ifdef) did but use "IS_BUILTIN(CONFIG_KGDB_SERIAL_CONSOLE)" to
make it more obvious what the point of the check is. We'll fix
kgdboc_earlycon in a similar way.
Fixes: 220995622d ("kgdboc: Add kgdboc_earlycon to support early kgdb using boot consoles")
Fixes: eae3e19ca9 ("kgdboc: Remove useless #ifdef CONFIG_KGDB_SERIAL_CONSOLE in kgdboc")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200519084345.1.I91670accc8a5ddabab227eb63bb4ad3e2e9d2b58@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Lee has kindly offered his help in sharing the patch review workload for
the PWM subsystem. If this works out well between Lee, Uwe and myself it
may be a good idea to maintain the subsystem as a group.
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
To not trigger the warnings provided by CONFIG_PWM_DEBUG
- use up-rounding in .get_state()
- don't divide by the result of a division
- don't use the rounded counter value for the period length to calculate
the counter value for the duty cycle
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
The way state->enabled is computed is rather convoluted and hard to
read - both branches of the if() actually do the exact same thing. So
remove the if(), and further simplify "<boolean condition> ? true :
false" to "<boolean condition>".
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Even in failed case of pm_runtime_get_sync(), the usage_count is
incremented. In order to keep the usage_count with correct value call
appropriate pm_runtime_put().
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>