The ibm,unit-sizes property was originally specified as an array of two
u32s corresponding to the memory block size, and the number of blocks
available in that region. A fairly last-minute change to the SCM DT
specification was splitting that into two seperate u64 properties:
ibm,block-sizes and ibm,number-of-blocks that convey the same
information. No firmware / hypervisor that emitted the ibm,unit-size
property ever appeared in the wild.
Fixes: b5beae5e22 ("powerpc/pseries: Add driver for PAPR SCM regions")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[mpe: Use kernel types (u32/u64)]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Fix an off-by-one error in the memory resource range. This resource is
used to determine the address range of the memory to be hot-plugged as
ZONE_DEVICE memory. The current end address results in the kernel
attempting to map an additional memblock and the hypervisor may reject
the mapping resulting in the entire hot-plug failing.
Fixes: b5beae5e22 ("powerpc/pseries: Add driver for PAPR SCM regions")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Making PAPR_SCM select LIBNVDIMM results in circular dependencies in
Kconfig when another symbol depends on it. Fix this by replacing the
select with a depends.
Fixes: b5beae5e22 ("powerpc/pseries: Add driver for PAPR SCM regions")
Reported-by: Alastair D'Silva <alastair@d-silva.org>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Now that there are different variants of pt_regs for userspace and
kernel, the uapi for the BPF_PROG_TYPE_PERF_EVENT program type must be
changed by exporting the user_pt_regs structure instead of the pt_regs
structure that is in-kernel only.
Fixes: 002af9391b ("powerpc: Split user/kernel definitions of struct pt_regs")
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
In commit 5e9dcb6188 ("powerpc/boot: Expose Kconfig symbols to
wrapper") we added a dependency to serial.c on autoconf.h:
$(obj)/serial.c: $(obj)/autoconf.h
This works when building in-tree (ie. with KBUILD_OUTPUT unset)
because the obj tree is the src tree.
But when building with eg. O=build and -j 1 the build fails:
gcc ... -I../arch/powerpc/boot -c -o arch/powerpc/boot/serial.o arch/powerpc/boot/serial.c
gcc: error: arch/powerpc/boot/serial.c: No such file or directory
Why this is only happening with -j 1 is not clear, when building with
-j greater than 1 somehow we decide to look for serial.c in the src
tree (../), eg:
gcc -I../arch/powerpc/boot -c -o arch/powerpc/boot/serial.o ../arch/powerpc/boot/serial.c
Regardless we shouldn't be specifying a dependency on serial.c in the
build tree, we want to add a dependency to the version in $(srctree)
so fix the rule to say that.
Fixes: 5e9dcb6188 ("powerpc/boot: Expose Kconfig symbols to wrapper")
Tested-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit 78e5dfea84 ("powerpc: dts: replace 'linux,stdout-path' with
'stdout-path'") broke the default console on a number of embedded
PowerPC systems, because it failed to also update the code in
arch/powerpc/kernel/legacy_serial.c to look for that property in
addition to the old one.
This fixes it.
Fixes: 78e5dfea84 ("powerpc: dts: replace 'linux,stdout-path' with 'stdout-path'")
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The arch_teardown_msi_irqs() function assumes that controller ops
pointers were already checked in arch_setup_msi_irqs(), but this
assumption is wrong: arch_teardown_msi_irqs() can be called even when
arch_setup_msi_irqs() returns an error (-ENOSYS).
This can happen in the following scenario:
- msi_capability_init() calls pci_msi_setup_msi_irqs()
- pci_msi_setup_msi_irqs() returns -ENOSYS
- msi_capability_init() notices the error and calls free_msi_irqs()
- free_msi_irqs() calls pci_msi_teardown_msi_irqs()
This is easier to see when CONFIG_PCI_MSI_IRQ_DOMAIN is not set and
pci_msi_setup_msi_irqs() and pci_msi_teardown_msi_irqs() are just
aliases to arch_setup_msi_irqs() and arch_teardown_msi_irqs().
The call to free_msi_irqs() upon pci_msi_setup_msi_irqs() failure
seems legit, as it does additional cleanup; e.g.
list_del(&entry->list) and kfree(entry) inside free_msi_irqs() do
happen (MSI descriptors are allocated before pci_msi_setup_msi_irqs()
is called and need to be cleaned up if that fails).
Fixes: 6b2fd7efeb ("PCI/MSI/PPC: Remove arch_msi_check_device()")
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Radu Rendec <radu.rendec@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
For some configs the build fails with:
arch/powerpc/mm/dump_linuxpagetables.c: In function 'populate_markers':
arch/powerpc/mm/dump_linuxpagetables.c:306:39: error: 'PKMAP_BASE' undeclared (first use in this function)
arch/powerpc/mm/dump_linuxpagetables.c:314:50: error: 'LAST_PKMAP' undeclared (first use in this function)
These come from highmem.h, including that fixes the build.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit 6975a783d7 ("powerpc/boot: Allow building the zImage wrapper
as a relocatable ET_DYN", 2011-04-12) changed the procedure descriptor
at the start of crt0.S to have a hard-coded start address of 0x500000
rather than a reference to _zimage_start, presumably because having
a reference to a symbol introduced a relocation which is awkward to
handle in a position-independent executable. Unfortunately, what is
at 0x500000 in the COFF image is not the first instruction, but the
procedure descriptor itself, that is, a word containing 0x500000,
which is not a valid instruction. Hence, booting a COFF zImage
results in a "DEFAULT CATCH!, code=FFF00700" message from Open
Firmware.
This fixes the problem by (a) putting the procedure descriptor in the
data section and (b) adding a branch to _zimage_start as the first
instruction in the program.
Fixes: 6975a783d7 ("powerpc/boot: Allow building the zImage wrapper as a relocatable ET_DYN")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Currently the selftest wild_bctr can fail to build when an old gcc is
used, notably on gcc using a binutils version <= 2.27, because the
assembler does not support the integer suffix UL.
This patch adjusts the wild_bctr test so the REG_POISON value is still
treated as an unsigned long for the shifts on compilation but the UL
suffix is absent on the stringification, so the inline asm code
generated has no UL suffixes.
Signed-off-by: Gustavo Romero <gromero@linux.vnet.ibm.com>
[mpe: Wrap long line]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit 4c2de74cc8 ("powerpc/64: Interrupts save PPR on stack rather
than thread_struct") changed sizeof(struct pt_regs) % 16 from 0 to 8,
which causes the interrupt frame allocation on kernel entry to put the
kernel stack out of alignment.
Quadword (16-byte) alignment for the stack is required by both the
64-bit v1 ABI (v1.9 § 3.2.2) and the 64-bit v2 ABI (v1.1 § 2.2.2.1).
Add a pad field to fix alignment, and add a BUILD_BUG_ON to catch this
in future.
Fixes: 4c2de74cc8 ("powerpc/64: Interrupts save PPR on stack rather than thread_struct")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When VPHN function is not supported and during cpu hotplug event,
kernel prints message 'VPHN function not supported. Disabling
polling...'. Currently it prints on every hotplug event, it floods
dmesg when a KVM guest tries to hotplug huge number of vcpus, let's
just print once and suppress further kernel prints.
Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The selftest I recently added to test branching to an out-of-bounds
NIP doesn't work on 64-bit big endian. It does fail but not in the
right way. That is it SEGVs trying to load from the opd at BAD_NIP,
but it never gets as far as branching to BAD_NIP.
To fix it we need to create an opd which is reachable but which holds
the bad address.
Fixes: b7683fc66e ("selftests/powerpc: Add a test of wild bctr")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Back in 2006 Ben added some workarounds for a misbehaviour in the
Spider IO bridge used on early Cell machines, see commit
014da7ff47 ("[POWERPC] Cell "Spider" MMIO workarounds"). Later these
were made to be generic, ie. not tied specifically to Spider.
The code stashes a token in the high bits (59-48) of virtual addresses
used for IO (eg. returned from ioremap()). This works fine when using
the Hash MMU, but when we're using the Radix MMU the bits used for the
token overlap with some of the bits of the virtual address.
This is because the maximum virtual address is larger with Radix, up
to c00fffffffffffff, and in fact we use that high part of the address
range for ioremap(), see RADIX_KERN_IO_START.
As it happens the bits that are used overlap with the bits that
differentiate an IO address vs a linear map address. If the resulting
address lies outside the linear mapping we will crash (see below), if
not we just corrupt memory.
virtio-pci 0000:00:00.0: Using 64-bit direct DMA at offset 800000000000000
Unable to handle kernel paging request for data at address 0xc000000080000014
...
CFAR: c000000000626b98 DAR: c000000080000014 DSISR: 42000000 IRQMASK: 0
GPR00: c0000000006c54fc c00000003e523378 c0000000016de600 0000000000000000
GPR04: c00c000080000014 0000000000000007 0fffffff000affff 0000000000000030
^^^^
...
NIP [c000000000626c5c] .iowrite8+0xec/0x100
LR [c0000000006c992c] .vp_reset+0x2c/0x90
Call Trace:
.pci_bus_read_config_dword+0xc4/0x120 (unreliable)
.register_virtio_device+0x13c/0x1c0
.virtio_pci_probe+0x148/0x1f0
.local_pci_probe+0x68/0x140
.pci_device_probe+0x164/0x220
.really_probe+0x274/0x3b0
.driver_probe_device+0x80/0x170
.__driver_attach+0x14c/0x150
.bus_for_each_dev+0xb8/0x130
.driver_attach+0x34/0x50
.bus_add_driver+0x178/0x2f0
.driver_register+0x90/0x1a0
.__pci_register_driver+0x6c/0x90
.virtio_pci_driver_init+0x2c/0x40
.do_one_initcall+0x64/0x280
.kernel_init_freeable+0x36c/0x474
.kernel_init+0x24/0x160
.ret_from_kernel_thread+0x58/0x7c
This hasn't been a problem because CONFIG_PPC_IO_WORKAROUNDS which
enables this code is usually not enabled. It is only enabled when it's
selected by PPC_CELL_NATIVE which is only selected by
PPC_IBM_CELL_BLADE and that in turn depends on BIG_ENDIAN. So in order
to hit the bug you need to build a big endian kernel, with IBM Cell
Blade support enabled, as well as Radix MMU support, and then boot
that on Power9 using Radix MMU.
Still we can fix the bug, so let's do that. We simply use fewer bits
for the token, taking the union of the restrictions on the address
from both Hash and Radix, we end up with 8 bits we can use for the
token. The only user of the token is iowa_mem_find_bus() which only
supports 8 token values, so 8 bits is plenty for that.
Fixes: 566ca99af0 ("powerpc/mm/radix: Add dummy radix_enabled()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
With preempt enabled we see warnings in do_slb_fault():
BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u33:0/98
futex hash table entries: 4096 (order: 3, 524288 bytes)
caller is do_slb_fault+0x204/0x230
CPU: 5 PID: 98 Comm: kworker/u33:0 Not tainted 4.19.0-rc3-gcc-7.3.1-00022-g1936f094e164 #138
Call Trace:
dump_stack+0xb4/0x104 (unreliable)
check_preemption_disabled+0x148/0x150
do_slb_fault+0x204/0x230
data_access_slb_common+0x138/0x180
This is caused by the get_paca() in slb_allocate_kernel(), which
includes a call to debug_smp_processor_id().
slb_allocate_kernel() can only be called from do_slb_fault(), and in
that path interrupts are hard disabled and so we can't be preempted,
but we can't update the preempt flags (in thread_info) because that
could cause an SLB fault.
So just use local_paca which is safe and doesn't cause the warning.
Fixes: 48e7b76957 ("powerpc/64s/hash: Convert SLB miss handlers to C")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
TRACE_INCLUDE_PATH and TRACE_INCLUDE_FILE are used by
<trace/define_trace.h>, so like that #include, they should
be outside #ifdef protection.
They also need to be #undefed before defining, in case multiple trace
headers are included by the same C file. This became the case on
book3e after commit cf4a608515 ("powerpc/mm: Add missing tracepoint for
tlbie"), leading to the following build error:
CC arch/powerpc/kvm/powerpc.o
In file included from arch/powerpc/kvm/powerpc.c:51:0:
arch/powerpc/kvm/trace.h:9:0: error: "TRACE_INCLUDE_PATH" redefined
[-Werror]
#define TRACE_INCLUDE_PATH .
^
In file included from arch/powerpc/kvm/../mm/mmu_decl.h:25:0,
from arch/powerpc/kvm/powerpc.c:48:
./arch/powerpc/include/asm/trace.h:224:0: note: this is the location of
the previous definition
#define TRACE_INCLUDE_PATH asm
^
cc1: all warnings being treated as errors
Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Signed-off-by: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The slbfee instruction was only added in ISA 2.05 (Power6), it's not
supported on older CPUs. We don't have a CPU feature for that ISA
version though, so just use the ISA 2.06 feature flag.
Fixes: e15a4fea4d ("powerpc/64s/hash: Add some SLB debugging tests")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Old toolchains don't know about slbfee and break the build, eg:
{standard input}:37: Error: Unrecognized opcode: `slbfee.'
Fix it by using the macro version. We need to add an underscore
version that takes raw register numbers from the inline asm, rather
than our Rx macros.
Fixes: e15a4fea4d ("powerpc/64s/hash: Add some SLB debugging tests")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The code for assert_slb_exists() and assert_slb_notexists() is almost
identical, except for the polarity of the WARN_ON(). In a future patch
we'll need to modify this code, so consolidate it now into a single
function.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The NPU IOMMU is setup to mirror the parent PCIe device IOMMU
setup. Therefore it does not make sense to call dma operations such as
dma_map_page(), etc. directly on these devices. The existing dma_ops
simply print a warning if they are ever called, however this is
unnecessary and the warnings are likely to go unnoticed.
It is instead simpler to remove these operations and let the generic
DMA code print warnings (eg. via a NULL pointer deref) in cases of
buggy drivers attempting dma operations on NVLink devices.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
- Full filesystem authentication feature,
UBIFS is now able to have the whole filesystem structure
authenticated plus user data encrypted and authenticated.
- Minor cleanups
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAlvaF2IWHHJpY2hhcmRA
c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wUb/D/0Z/jN80LtxoIlQzmfoBnVSnaXv
BDvdDFHTwV+zu4XCvUyJzBnwzNjDxNK2XD5hAgiqCoTk5sr4KUi5+zfft5XMW40w
T1m5mQNhjwmcI/J/5m2gSHbOSB8Hkc0HIybknS+5ZJDa1OZUkxejLcmpK5Wk+bxp
Ak1cOn5GIJKRQMrUudhySkQaBe0DnNmHSACePSb5AYGlnRy6eJ26ANR2mU7PFg1V
NBVbOQjMrYIV9qq9m+vtTNsLXidcaRf474fg7lshodmDBISy9g83Oq8FaPzYTJVJ
rkvdsRzCrXeApSH2LJ8Gb1AvIAlvJa2Va+anXh8NrSBySfzTKrIPtIONkpF7zxOC
8naZcRNvTqWcMfaTKGK+SGWUqGlHxdGOo5NkkKrn0jsO6HJ8kYAXKFGx65MsiCLv
xPlKc543ZLSscw3JJqLXVoXr2hmwhUHMJwwaPngFmdgm88bog62feUgFpYOU/1dj
1s2+q3jSUqfuS4oInjAmeX/Yq9dss/6dMo73ikbekIGRtijUfCMBWFyINdE0oWPu
ZUdOOifYrozIG7wWEo6ZzCI1PIyPvYfKcXVMWimPmu9Xi5AnbCDMQmPYVF5YMM0R
jexN9gVyFQQjz940reFJi0EkIJjwCycWLWft6P6cLDc/rRUUP4ibNYv3JL8WvhHn
Eb9V6InXhcyuX4eopA==
=lq2m
-----END PGP SIGNATURE-----
Merge tag 'tags/upstream-4.20-rc1' of git://git.infradead.org/linux-ubifs
Pull UBIFS updates from Richard Weinberger:
- Full filesystem authentication feature, UBIFS is now able to have the
whole filesystem structure authenticated plus user data encrypted and
authenticated.
- Minor cleanups
* tag 'tags/upstream-4.20-rc1' of git://git.infradead.org/linux-ubifs: (26 commits)
ubifs: Remove unneeded semicolon
Documentation: ubifs: Add authentication whitepaper
ubifs: Enable authentication support
ubifs: Do not update inode size in-place in authenticated mode
ubifs: Add hashes and HMACs to default filesystem
ubifs: authentication: Authenticate super block node
ubifs: Create hash for default LPT
ubfis: authentication: Authenticate master node
ubifs: authentication: Authenticate LPT
ubifs: Authenticate replayed journal
ubifs: Add auth nodes to garbage collector journal head
ubifs: Add authentication nodes to journal
ubifs: authentication: Add hashes to index nodes
ubifs: Add hashes to the tree node cache
ubifs: Create functions to embed a HMAC in a node
ubifs: Add helper functions for authentication support
ubifs: Add separate functions to init/crc a node
ubifs: Format changes for authentication support
ubifs: Store read superblock node
ubifs: Drop write_node
...
Pull more timer updates from Thomas Gleixner:
"A set of commits for the new C-SKY architecture timers"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
dt-bindings: timer: gx6605s SOC timer
clocksource/drivers/c-sky: Add gx6605s SOC system timer
dt-bindings: timer: C-SKY Multi-processor timer
clocksource/drivers/c-sky: Add C-SKY SMP timer
private struct, and a few bug fixes.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJb2zRKAAoJEG5mS6x6i9IjMMQP/1Xf2x27O0nb2NoKCyDbBAL9
LXUR7NYiATg3BKOWZS1MVcxmhLlUYNUNsiHenzER++tIt0Jbx70k1oKJrskKPbNR
G3eN1uUhceioiSYDcE/02VERtwL3P+6RIBbiRcvzihv/nhPNtcD4mTI5nkM/5wPY
8I5PMzMJkQAezvYyXTPkk1TvycHOkFge+LNCtXRRRdemik5FIU3PlEM+8jLAFEmH
DpiYr+g4Nvl8U+aqwXc/ffvDP9Ky5iRq48ifbWORXUVGNTckuTWL6DpYByuOimXL
pzmIjzSc9tWJ/wXU71pHdTjQSRSixxdD7m2b91XLNBNvdz3G842E78vd1yH4Y8zp
9rTijHwfybhT3cUQnGJ+i6fw0TYUGhoQ/6/TXHDXFy6u01jIHVD30GFmRUpNH96O
egItTbnANnhh+XxZMZpwum8K0a9/NMP5DKInBfyXmiEwFzOR8QG23ufjL52RLWwj
KNgovXC+Y5NWpZ6PR2EP7UfHHe8ppZKO7RpWwe+1ZKkz1gFHEM6I3Es4AjNsPRO+
epBCetPc8Ib1NK3W9WCBa3LuuoZhSmym4jDuDejWsECCvrY/lNe5UIRtiH/1sI76
O6O0e6ncmUdJslJsY2KK2RWn1+tlORhcm8/1ARCVxr9nTgD0vRmPn82iX+L57Kih
nOnsHDBroOszI5xCvQ76
=EhpD
-----END PGP SIGNATURE-----
Merge tag 'ntb-4.20' of git://github.com/jonmason/ntb
Pull NTB updates from Jon Mason:
"Fairly minor changes and bug fixes:
NTB IDT thermal changes and hook into hwmon, ntb_netdev clean-up of
private struct, and a few bug fixes"
* tag 'ntb-4.20' of git://github.com/jonmason/ntb:
ntb: idt: Alter the driver info comments
ntb: idt: Discard temperature sensor IRQ handler
ntb: idt: Add basic hwmon sysfs interface
ntb: idt: Alter temperature read method
ntb_netdev: Simplify remove with client device drvdata
NTB: transport: Try harder to alloc an aligned MW buffer
ntb: ntb_transport: Mark expected switch fall-throughs
ntb: idt: Set PCIe bus address to BARLIMITx
NTB: ntb_hw_idt: replace IS_ERR_OR_NULL with regular NULL checks
ntb: intel: fix return value for ndev_vec_mask()
ntb_netdev: fix sleep time mismatch
Pull scheduler fixes from Ingo Molnar:
"A memory (under-)allocation fix and a comment fix"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/topology: Fix off by one bug
sched/rt: Update comment in pick_next_task_rt()
Pull x86 fixes from Ingo Molnar:
"A number of fixes and some late updates:
- make in_compat_syscall() behavior on x86-32 similar to other
platforms, this touches a number of generic files but is not
intended to impact non-x86 platforms.
- objtool fixes
- PAT preemption fix
- paravirt fixes/cleanups
- cpufeatures updates for new instructions
- earlyprintk quirk
- make microcode version in sysfs world-readable (it is already
world-readable in procfs)
- minor cleanups and fixes"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
compat: Cleanup in_compat_syscall() callers
x86/compat: Adjust in_compat_syscall() to generic code under !COMPAT
objtool: Support GCC 9 cold subfunction naming scheme
x86/numa_emulation: Fix uniform-split numa emulation
x86/paravirt: Remove unused _paravirt_ident_32
x86/mm/pat: Disable preemption around __flush_tlb_all()
x86/paravirt: Remove GPL from pv_ops export
x86/traps: Use format string with panic() call
x86: Clean up 'sizeof x' => 'sizeof(x)'
x86/cpufeatures: Enumerate MOVDIR64B instruction
x86/cpufeatures: Enumerate MOVDIRI instruction
x86/earlyprintk: Add a force option for pciserial device
objtool: Support per-function rodata sections
x86/microcode: Make revision and processor flags world-readable
Pull perf updates and fixes from Ingo Molnar:
"These are almost all tooling updates: 'perf top', 'perf trace' and
'perf script' fixes and updates, an UAPI header sync with the merge
window versions, license marker updates, much improved Sparc support
from David Miller, and a number of fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (66 commits)
perf intel-pt/bts: Calculate cpumode for synthesized samples
perf intel-pt: Insert callchain context into synthesized callchains
perf tools: Don't clone maps from parent when synthesizing forks
perf top: Start display thread earlier
tools headers uapi: Update linux/if_link.h header copy
tools headers uapi: Update linux/netlink.h header copy
tools headers: Sync the various kvm.h header copies
tools include uapi: Update linux/mmap.h copy
perf trace beauty: Use the mmap flags table generated from headers
perf beauty: Wire up the mmap flags table generator to the Makefile
perf beauty: Add a generator for MAP_ mmap's flag constants
tools include uapi: Update asound.h copy
tools arch uapi: Update asm-generic/unistd.h and arm64 unistd.h copies
tools include uapi: Update linux/fs.h copy
perf callchain: Honour the ordering of PERF_CONTEXT_{USER,KERNEL,etc}
perf cs-etm: Correct CPU mode for samples
perf unwind: Take pgoff into account when reporting elf to libdwfl
perf top: Do not use overwrite mode by default
perf top: Allow disabling the overwrite mode
perf trace: Beautify mount's first pathname arg
...
Pull irq fixes from Ingo Molnar:
"An irqchip driver fix and a memory (over-)allocation fix"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/irq-mvebu-sei: Fix a NULL vs IS_ERR() bug in probe function
irq/matrix: Fix memory overallocation
With the addition of the NUMA identity level, we increased @level by
one and will run off the end of the array in the distance sort loop.
Fixed: 051f3ca02e ("sched/topology: Introduce NUMA identity node sched domain")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
A few fixes who have come in near or during the merge window:
- Removal of a VLA usage in Marvell mpp platform code
- Enable some IPMI options for ARM64 servers by default, helps testing
- Enable PREEMPT on 32-bit ARMv7 defconfig
- Minor fix for stm32 DT (removal of an unused DMA property)
- Bugfix for TI OMAP1-based ams-delta (-EINVAL -> IRQ_NOTCONNECTED)
-----BEGIN PGP SIGNATURE-----
iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAlvd6+YPHG9sb2ZAbGl4
b20ubmV0AAoJEIwa5zzehBx3xwUP/j3AFLgK7HPGzDHy6hqVBuezXaOtkMFDbfmS
quIpp60zoellKREIGQag06VI44IMtetfJpcLe6GFNAtBNg7ofQ4/H3jq4eFqP+Gd
abJC/iSzP86UKnAGoexkEQ4CxzYb+q6Okv3yBROWptFfvfDE9wyeKdl6SoSPYuvy
yZPmZdq2HguvjWfKz7BLGvr6nzMdwwj1MpvqsSS3xLCiQTBSq1VVgKyCSWin67r2
FiD1I+c37WNaJQP0eQ+Eu6mCc++jK3cdHPAp6LzmJ7YNVOqPCwU2KroY/DKLhz3v
oanKJENFlyNOunzdd2SuoM7TyFgo7rHDR1sQgKCbcLu3D32dPYg4T8+jr1xa3hRs
CFLFj9JKl6grZK1sIB35blmZ+6DJ7WfN1SRwD7tfvRM8UNxCUEs/5/Pf2ySU2cGx
k2ZiP2D+VS+eX41bZTiuJZOyX2Wsyis8TWTjCy8eT6uM1V4Zt4qiXigQW8qFg4RC
ta3Y2rUxjk8FZCV1I4mrIwkqsSL47fWuxtVxDnJP4tkA5HuVS+NJj8aneAYyJKiY
N6Gk1U8IJN40psk8OU7fwYUjJ6vhOgKxNo1nl7vf26Ddwl1fFsDHvFKgh9UmVDN5
boRSs+62AZVLTDdIpkz6hWSVPUO4lysi7n2Gi+qFjVifyE5TLH7ncSMZk6fbnLLI
v90sz43w
=mr8i
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"A few fixes who have come in near or during the merge window:
- Removal of a VLA usage in Marvell mpp platform code
- Enable some IPMI options for ARM64 servers by default, helps
testing
- Enable PREEMPT on 32-bit ARMv7 defconfig
- Minor fix for stm32 DT (removal of an unused DMA property)
- Bugfix for TI OMAP1-based ams-delta (-EINVAL -> IRQ_NOTCONNECTED)"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: stm32: update HASH1 dmas property on stm32mp157c
ARM: orion: avoid VLA in orion_mpp_conf
ARM: defconfig: Update multi_v7 to use PREEMPT
arm64: defconfig: Enable some IPMI configs
soc: ti: QMSS: Fix usage of irq_set_affinity_hint
ARM: OMAP1: ams-delta: Fix impossible .irq < 0
- Fix W+X page (mark RO) allocated by the arm64 kprobes code
- Makefile fix for .i files in out of tree modules
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlvdeMUACgkQa9axLQDI
XvGVAQ/8Dd6M1NwDsPqdemdWVOGJJi+c+Ou39c29dCvPkZV63ZgPBaQiOX5DJ18T
TUjC+Q2zW4Oag86q4N0REOQiafDfScNU/ZsDgnfHvagKNc6+V1lqzb8DsteeeCW9
YOPnbL/VV+dBKKXphjW23VQfsz5ryDU6HKoDHRgOOtHisnTOKJGj/HCXzn1LY6x4
us/Gl6U/kJyRs0/7F8lmfSatDK2o+bKo0/0X6OV7dNE3bo++rpWxCX8D/dcBiEwV
BZDkWu5noglnzYz/LwobYwIjshd6cNjSjJKgoudp3+6WtcFGiK347HDQyo6WqBSd
5hmo/R0My5SUWrwb3GVmxFQmDDxIwywneSkKdx00PNygoNBhu7VYOrf7C/8NOl2h
a0lMCl1Q9x+/2ZDWHhgcwZ6Nfkj/3hJ/3jQVtfqt7ldXgPmZQPrBcx0+CzjGAAiK
gmIpr7VH701KkQGMljV4W0AurWx4v/+YpewkSODBOcbEQTd6trl8I5+A0SA+o6eC
F479l8meU9H0vf9fMB1bkRxBipyaFRKNaTuabO3wHN45C4fzQCXQi5pjfvyAfC+f
zZbnTKeWVzAafnYGcS6Fml+hUD3QQdARnd3WDOyzwBC7EvZM3gGmFWnlkMxbgSuV
9c9+t7fMLChQiKUZ+bjNQpaXZ0YmA9+1fRqb9xCOP1/Ll7933A0=
=hH50
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull more arm64 updates from Catalin Marinas:
- fix W+X page (mark RO) allocated by the arm64 kprobes code
- Makefile fix for .i files in out of tree modules
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: kprobe: make page to RO mode when allocate it
arm64: kdump: fix small typo
arm64: makefile fix build of .i file in external module case
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAlvdEHkACgkQiiy9cAdy
T1ERqgwAoaYkwZhee/73MxC5N2JnBpRjP8Wwu6jo7q89DY35zI8NvLfy48VXNmnk
8Sp+zPWXxFgFLxBIGIOEwvfrT2kl84lFHprAAkZRKQcz5ZdeqIgoKYWw46Ezglxn
pyBpGFKNVKVSb0ZbAJgswB0KTCSqW2Hx6mf/lGxx1c2ao1kyXrvN4J2v9bK2UCbV
KC9xAgkkteypiTsB/afWQMAybDtran1TdOS7rm4aNT697KJ4jOPSQ7Bhh4CxT+UC
xSFtobCc/v42Si3oq0DNmFNQOdfj40Tg3RjHRHBOmcqN2b+kmGa/4mCUbVsFL08G
8gpeAHpRDnVwBAQsa6y5LuojOQ0szJzkyM9SZHTyCG3sQRGX3b6L+CYafobzrxnk
81Z0PtnDNvaoqG/6th/cQX/vz9LoiO5KG2dF1c9z+mK4m4D7JObuxaqcsrpb0uZl
fWn8AU900QOP/MrciWhj3II+pyYOV3iqKp0cpUgrP1a2P/Qxu3FkXGc4he/lWfKB
vSXGXf88
=Ilu6
-----END PGP SIGNATURE-----
Merge tag '4.20-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes and updates from Steve French:
"Three small fixes (one Kerberos related, one for stable, and another
fixes an oops in xfstest 377), two helpful debugging improvements,
three patches for cifs directio and some minor cleanup"
* tag '4.20-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix signed/unsigned mismatch on aio_read patch
cifs: don't dereference smb_file_target before null check
CIFS: Add direct I/O functions to file_operations
CIFS: Add support for direct I/O write
CIFS: Add support for direct I/O read
smb3: missing defines and structs for reparse point handling
smb3: allow more detailed protocol info on open files for debugging
smb3: on kerberos mount if server doesn't specify auth type use krb5
smb3: add trace point for tree connection
cifs: fix spelling mistake, EACCESS -> EACCES
cifs: fix return value for cifs_listxattr
Pull 9p fix from Al Viro:
"Regression fix for net/9p handling of iov_iter; broken by braino when
switching to iov_iter_is_kvec() et.al., spotted and fixed by Marc"
* 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
iov_iter: Fix 9p virtio breakage
This is a set of minor small (and safe changes) that didn't make the
initial pull request plus some bug fixes.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCW9zD2CYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishaMPAQDwYDbt
wsCd4SlAPz1PzAV25GL3zeBpk5UeCjv2/nIzEgEA6P+poJC85aVznrjNQ2kUyWdS
h/NrvG9UJmvVngKpDWQ=
=FHI8
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"This is a set of minor small (and safe changes) that didn't make the
initial pull request plus some bug fixes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mvsas: Remove set but not used variable 'id'
scsi: qla2xxx: Remove two arguments from qlafx00_error_entry()
scsi: qla2xxx: Make sure that qlafx00_ioctl_iosb_entry() initializes 'res'
scsi: qla2xxx: Remove a set-but-not-used variable
scsi: qla2xxx: Make qla2x00_sysfs_write_nvram() easier to analyze
scsi: qla2xxx: Declare local functions 'static'
scsi: qla2xxx: Improve several kernel-doc headers
scsi: qla2xxx: Modify fall-through annotations
scsi: 3w-sas: 3w-9xxx: Use unsigned char for cdb
scsi: mvsas: Use dma_pool_zalloc
scsi: target: Don't request modules that aren't even built
scsi: target: Set response length for REPORT TARGET PORT GROUPS
Merge more updates from Andrew Morton:
- more ocfs2 work
- various leftovers
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
memory_hotplug: cond_resched in __remove_pages
bfs: add sanity check at bfs_fill_super()
kernel/sysctl.c: remove duplicated include
kernel/kexec_file.c: remove some duplicated includes
mm, thp: consolidate THP gfp handling into alloc_hugepage_direct_gfpmask
ocfs2: fix clusters leak in ocfs2_defrag_extent()
ocfs2: dlmglue: clean up timestamp handling
ocfs2: don't put and assigning null to bh allocated outside
ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entry
ocfs2: don't use iocb when EIOCBQUEUED returns
ocfs2: without quota support, avoid calling quota recovery
ocfs2: remove ocfs2_is_o2cb_active()
mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings
include/linux/notifier.h: SRCU: fix ctags
mm: handle no memcg case in memcg_kmem_charge() properly
We have received a bug report that unbinding a large pmem (>1TB) can
result in a soft lockup:
NMI watchdog: BUG: soft lockup - CPU#9 stuck for 23s! [ndctl:4365]
[...]
Supported: Yes
CPU: 9 PID: 4365 Comm: ndctl Not tainted 4.12.14-94.40-default #1 SLE12-SP4
Hardware name: Intel Corporation S2600WFD/S2600WFD, BIOS SE5C620.86B.01.00.0833.051120182255 05/11/2018
task: ffff9cce7d4410c0 task.stack: ffffbe9eb1bc4000
RIP: 0010:__put_page+0x62/0x80
Call Trace:
devm_memremap_pages_release+0x152/0x260
release_nodes+0x18d/0x1d0
device_release_driver_internal+0x160/0x210
unbind_store+0xb3/0xe0
kernfs_fop_write+0x102/0x180
__vfs_write+0x26/0x150
vfs_write+0xad/0x1a0
SyS_write+0x42/0x90
do_syscall_64+0x74/0x150
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7fd13166b3d0
It has been reported on an older (4.12) kernel but the current upstream
code doesn't cond_resched in the hot remove code at all and the given
range to remove might be really large. Fix the issue by calling
cond_resched once per memory section.
Link: http://lkml.kernel.org/r/20181031125840.23982-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Dan Williams <dan.j.williams@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
syzbot is reporting too large memory allocation at bfs_fill_super() [1].
Since file system image is corrupted such that bfs_sb->s_start == 0,
bfs_fill_super() is trying to allocate 8MB of continuous memory. Fix
this by adding a sanity check on bfs_sb->s_start, __GFP_NOWARN and
printf().
[1] https://syzkaller.appspot.com/bug?id=16a87c236b951351374a84c8a32f40edbc034e96
Link: http://lkml.kernel.org/r/1525862104-3407-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+71c6b5d68e91149fc8a4@syzkaller.appspotmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Tigran Aivazian <aivazian.tigran@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove one include of <linux/pipe_fs_i.h>.
No functional changes.
Link: http://lkml.kernel.org/r/20181004134223.17735-1-michael@schupikov.de
Signed-off-by: Michael Schupikov <michael@schupikov.de>
Reviewed-by: Richard Weinberger <richard@nod.at>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We include kexec.h and slab.h twice in kexec_file.c. It's unnecessary.
hence just remove them.
Link: http://lkml.kernel.org/r/1537498098-19171-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
THP allocation mode is quite complex and it depends on the defrag mode.
This complexity is hidden in alloc_hugepage_direct_gfpmask from a large
part currently. The NUMA special casing (namely __GFP_THISNODE) is
however independent and placed in alloc_pages_vma currently. This both
adds an unnecessary branch to all vma based page allocation requests and
it makes the code more complex unnecessarily as well. Not to mention
that e.g. shmem THP used to do the node reclaiming unconditionally
regardless of the defrag mode until recently. This was not only
unexpected behavior but it was also hardly a good default behavior and I
strongly suspect it was just a side effect of the code sharing more than
a deliberate decision which suggests that such a layering is wrong.
Get rid of the thp special casing from alloc_pages_vma and move the
logic to alloc_hugepage_direct_gfpmask. __GFP_THISNODE is applied to the
resulting gfp mask only when the direct reclaim is not requested and
when there is no explicit numa binding to preserve the current logic.
Please note that there's also a slight difference wrt MPOL_BIND now. The
previous code would avoid using __GFP_THISNODE if the local node was
outside of policy_nodemask(). After this patch __GFP_THISNODE is avoided
for all MPOL_BIND policies. So there's a difference that if local node
is actually allowed by the bind policy's nodemask, previously
__GFP_THISNODE would be added, but now it won't be. From the behavior
POV this is still correct because the policy nodemask is used.
Link: http://lkml.kernel.org/r/20180925120326.24392-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ocfs2_defrag_extent() might leak allocated clusters. When the file
system has insufficient space, the number of claimed clusters might be
less than the caller wants. If that happens, the original code might
directly commit the transaction without returning clusters.
This patch is based on code in ocfs2_add_clusters_in_btree().
[akpm@linux-foundation.org: include localalloc.h, reduce scope of data_ac]
Link: http://lkml.kernel.org/r/20180904041621.16874-3-lchen@suse.com
Signed-off-by: Larry Chen <lchen@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The handling of timestamps outside of the 1970..2038 range in the dlm
glue is rather inconsistent: on 32-bit architectures, this has always
wrapped around to negative timestamps in the 1902..1969 range, while on
64-bit kernels all timestamps are interpreted as positive 34 bit numbers
in the 1970..2514 year range.
Now that the VFS code handles 64-bit timestamps on all architectures, we
can make the behavior more consistent here, and return the same result
that we had on 64-bit already, making the file system y2038 safe in the
process. Outside of dlmglue, it already uses 64-bit on-disk timestamps
anway, so that part is fine.
For consistency, I'm changing ocfs2_pack_timespec() to clamp anything
outside of the supported range to the minimum and maximum values. This
avoids a possible ambiguity of values before 1970 in particular, which
used to be interpreted as times at the end of the 2514 range previously.
Link: http://lkml.kernel.org/r/20180619155826.4106487-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ocfs2_read_blocks() and ocfs2_read_blocks_sync() are both used to read
several blocks from disk. Currently, the input argument *bhs* can be
NULL or NOT. It depends on the caller's behavior. If the function
fails in reading blocks from disk, the corresponding bh will be assigned
to NULL and put.
Obviously, above process for non-NULL input bh is not appropriate.
Because the caller doesn't even know its bhs are put and re-assigned.
If buffer head is managed by caller, ocfs2_read_blocks and
ocfs2_read_blocks_sync() should not evaluate it to NULL. It will cause
caller accessing illegal memory, thus crash.
Link: http://lkml.kernel.org/r/HK2PR06MB045285E0F4FBB561F9F2F9B3D5680@HK2PR06MB0452.apcprd06.prod.outlook.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reviewed-by: Guozhonghua <guozhonghua@h3c.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Somehow, file system metadata was corrupted, which causes
ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el().
According to the original design intention, if above happens we should
skip the problematic block and continue to retrieve dir entry. But
there is obviouse misuse of brelse around related code.
After failure of ocfs2_check_dir_entry(), current code just moves to
next position and uses the problematic buffer head again and again
during which the problematic buffer head is released for multiple times.
I suppose, this a serious issue which is long-lived in ocfs2. This may
cause other file systems which is also used in a the same host insane.
So we should also consider about bakcporting this patch into linux
-stable.
Link: http://lkml.kernel.org/r/HK2PR06MB045211675B43EED794E597B6D56E0@HK2PR06MB0452.apcprd06.prod.outlook.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Suggested-by: Changkuo Shi <shi.changkuo@h3c.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When -EIOCBQUEUED returns, it means that aio_complete() will be called
from dio_complete(), which is an asynchronous progress against
write_iter. Generally, IO is a very slow progress than executing
instruction, but we still can't take the risk to access a freed iocb.
And we do face a BUG crash issue. Using the crash tool, iocb is
obviously freed already.
crash> struct -x kiocb ffff881a350f5900
struct kiocb {
ki_filp = 0xffff881a350f5a80,
ki_pos = 0x0,
ki_complete = 0x0,
private = 0x0,
ki_flags = 0x0
}
And the backtrace shows:
ocfs2_file_write_iter+0xcaa/0xd00 [ocfs2]
aio_run_iocb+0x229/0x2f0
do_io_submit+0x291/0x540
SyS_io_submit+0x10/0x20
system_call_fastpath+0x16/0x75
Link: http://lkml.kernel.org/r/1523361653-14439-1-git-send-email-ge.changwei@h3c.com
Signed-off-by: Changwei Ge <ge.changwei@h3c.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
During one dead node's recovery by other node, quota recovery work will
be queued. We should avoid calling quota when it is not supported, so
check the quota flags.
Link: http://lkml.kernel.org/r/71604351584F6A4EBAE558C676F37CA401071AC9FB@H3CMLB12-EX.srv.huawei-3com.com
Signed-off-by: guozhonghua <guozhonghua@h3c.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>