Fix following warnings:
srmmu.c:870:13: warning: symbol 'srmmu_paging_init' was not declared. Should it be static?
iommu.c:430:13: warning: symbol 'ld_mmu_iommu' was not declared. Should it be static?
leon_mm.c:21:5: warning: symbol 'srmmu_swprobe_trace' was not declared. Should it be static?
Add proper prototypes or define static to fix them.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix following warnings:
srmmu.c:78:5: warning: symbol 'flush_page_for_dma_global' was not declared. Should it be static?
srmmu.c:85:5: warning: symbol 'viking_mxcc_present' was not declared. Should it be static?
srmmu.c:103:6: warning: symbol 'srmmu_nocache_bitmap' was not declared. Should it be static?
srmmu.c:176:24: warning: Using plain integer as NULL pointer
srmmu.c:731:46: warning: Using plain integer as NULL pointer
srmmu.c:731:46: warning: Using plain integer as NULL pointer
srmmu.c:731:46: warning: Using plain integer as NULL pointer
srmmu.c:870:13: warning: symbol 'srmmu_paging_init' was not declared. Should it be static?
Add proper prototypes in mm_32.h and drop local prototype in init_32.c
Replace 0 with NULL
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix following warning:
init_32.c:112:22: warning: symbol 'bootmem_init' was not declared. Should it be static?
Fix by adding a proper prototype in pgtable_32.h and drop
the local prototype in srmmu.c
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix following warning:
fault_32.c:38:24: error: symbol 'unhandled_fault' redeclared with different type - different modifiers
When this warning was fixed several new warnings popped up - fix them too.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This file will be used for more than just srmmu stuff, so the old name was misleading.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
loading a module and enabling function tracing at the same time.
He uncovered a race where the module when loaded will convert the
calls to mcount into nops, and expects the module's text to be RW.
But when function tracing is enabled, it will convert all kernel
text (core and module) from RO to RW to convert the nops to calls
to ftrace to record the function. After the convertion, it will
convert all the text back from RW to RO.
The issue is, it will also convert the module's text that is loading.
If it converts it to RO before ftrace does its conversion, it will
cause ftrace to fail and require a reboot to fix it again.
This patch moves the ftrace module update that converts calls to mcount
into nops to be done when the module state is still MODULE_STATE_UNFORMED.
This will ignore the module when the text is being converted from
RW back to RO.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJTXuHsAAoJEKQekfcNnQGuT7cIAJQhwX2fpdFr5eHwx0CyFo5c
75V0xcRhJsGeXqfgekkRhCHYEfL7v4sl6D+Bj8qzLG/0QresF9jVSMUTTZqYFpFc
t7f3oDDtdCmfofD/uyS7YOQ3JhU5ijo+Drzq8qRYtWNJJ0WCqbddpevcUiW1Zbvr
LAT3lcb+2I5Y1Jnyfd920+0plAnoeOw1/BPuRVJINwh8zeyvWnmp3iq9fOPdhMQQ
VhCCg+C2ILBPrCPFdwC5pVrL4a/CjyNd+LqtFXjLS9sO8s5KyUGkqKkbHMlhZeot
uRWlZUSNZsh/jpP4X2b+dtYGQ4Rrnp253a594Kmrzm/MPdsAV62oDqOfN0tzm7w=
=K59a
-----END PGP SIGNATURE-----
Merge tag 'trace-fixes-v3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull ftrace bugfix from Steven Rostedt:
"Takao Indoh reported that he was able to cause a ftrace bug while
loading a module and enabling function tracing at the same time.
He uncovered a race where the module when loaded will convert the
calls to mcount into nops, and expects the module's text to be RW.
But when function tracing is enabled, it will convert all kernel text
(core and module) from RO to RW to convert the nops to calls to ftrace
to record the function. After the convertion, it will convert all the
text back from RW to RO.
The issue is, it will also convert the module's text that is loading.
If it converts it to RO before ftrace does its conversion, it will
cause ftrace to fail and require a reboot to fix it again.
This patch moves the ftrace module update that converts calls to
mcount into nops to be done when the module state is still
MODULE_STATE_UNFORMED. This will ignore the module when the text is
being converted from RW back to RO"
* tag 'trace-fixes-v3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace/module: Hardcode ftrace_module_init() call into load_module()
This branch contains a pair of important bug fixes for the DT code:
- Fix some incorrect binding property names before they enter common usage
- Fix bug where some platform devices will be unable to get their
interrupt number when they depend on an interrupt controller that is
not available at device creation time. This is a problem causing
mainline to fail on a number of ARM platforms.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJTXr8WAAoJEMWQL496c2LNhzgP/RhhIS9ySzJkdPEMksnBaQbN
JKG+wBwlzvtJSPw2xX5EpWweOd1PDqHxLjjWu7q5WIrQQyOGRAELkYg1H+5ksFo1
obCAFOjTC7VXPGc+BX0dLN6Eq4UHP8k3Ui8zlMSZQNt+gBvYo1gX2ulR8sQgS8rF
xw+8iNDA2GIFusQbci35yOAhp6yo/ble3KIeR5dFKeMpiVSh9rqdLS+H9HgnG5Sw
G2LUrAeRY2yQbow3c1001HiKrJieZZSTCq8oiZDgJZY5+Qk8zamyj2xxgMoJUA8Q
2Kcb3379vNvDgjYX61NiwyrhIbJEiao8IXq6WMDNbmH3+FUwogrnh/A/g5POtX0N
JV5tGkc/2lfm5oBCFSzcw661XjLysPsm3nPLhnxRL/dlBVkl0IZzTVYPTqUtoKlK
NfwKFUDgg+QVmhFQT0BDCyOGAcRsz4rNEPPB6M12OmEKgxGpetfUaFdBcJqSNaxf
Ac6AAZCXrkLlRNOVhK333ILPxv1JRC++JAeRKNwHr9e6j2aJjUgilSuzpHJ5Lkyo
sj6kp+n1n+giYlgc9GjM7b33tm8XEd4rbcgwCWwaypEsl4FIwPs9xMys7md8SOCE
AAnVPVUvLGC9O2GhldeQlOgtGWzILW3zJXb2mrMIIGyLBGBx1qr3VJIZpxoddTc0
UtIJn0X37mKY/R4/hWs4
=B2JT
-----END PGP SIGNATURE-----
Merge tag 'dt-for-linus' of git://git.secretlab.ca/git/linux
Pull devicetree bug fixes from Grant Likely:
"These are some important bug fixes that need to get into v3.15.
This branch contains a pair of important bug fixes for the DT code:
- Fix some incorrect binding property names before they enter common
usage
- Fix bug where some platform devices will be unable to get their
interrupt number when they depend on an interrupt controller that
is not available at device creation time. This is a problem
causing mainline to fail on a number of ARM platforms"
* tag 'dt-for-linus' of git://git.secretlab.ca/git/linux:
of/irq: do irq resolution in platform_get_irq
of: selftest: add deferred probe interrupt test
dt: Fix binding typos in clock-names and interrupt-names
Pull powerpc fixes from Ben Herrenschmidt:
"Here is a bunch of post-merge window fixes that have been accumulating
in patchwork while I was on vacation or buried under other stuff last
week.
We have the now usual batch of LE fixes from Anton (sadly some new
stuff that went into this merge window had endian issues, we'll try to
make sure we do better next time)
Some fixes and cleanups to the new 24x7 performance monitoring stuff
(mostly typos and cleaning up printk's)
A series of fixes for an issue with our runlatch bit, which wasn't set
properly for offlined threads/cores and under KVM, causing potentially
some counters to misbehave along with possible power management
issues.
A fix for kexec nasty race where the new kernel wouldn't "see" the
secondary processors having reached back into firmware in time.
And finally a few other misc (and pretty simple) bug fixes"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (33 commits)
powerpc/4xx: Fix section mismatch in ppc4xx_pci.c
ppc/kvm: Clear the runlatch bit of a vcpu before napping
ppc/kvm: Set the runlatch bit of a CPU just before starting guest
ppc/powernv: Set the runlatch bits correctly for offline cpus
powerpc/pseries: Protect remove_memory() with device hotplug lock
powerpc: Fix error return in rtas_flash module init
powerpc: Bump BOOT_COMMAND_LINE_SIZE to 2048
powerpc: Bump COMMAND_LINE_SIZE to 2048
powerpc: Rename duplicate COMMAND_LINE_SIZE define
powerpc/perf/hv-24x7: Catalog version number is be64, not be32
powerpc/perf/hv-24x7: Remove [static 4096], sparse chokes on it
powerpc/perf/hv-24x7: Use (unsigned long) not (u32) values when calling plpar_hcall_norets()
powerpc/perf/hv-gpci: Make device attr static
powerpc/perf/hv_gpci: Probe failures use pr_debug(), and padding reduced
powerpc/perf/hv_24x7: Probe errors changed to pr_debug(), padding fixed
powerpc/mm: Fix tlbie to add AVAL fields for 64K pages
powerpc/powernv: Fix little endian issues in OPAL dump code
powerpc/powernv: Create OPAL sglist helper functions and fix endian issues
powerpc/powernv: Fix little endian issues in OPAL error log code
powerpc/powernv: Fix little endian issues with opal_do_notifier calls
...
BUG_ON() is a big hammer, and should be used _only_ if there is some
major corruption that you cannot possibly recover from, making it
imperative that the current process (and possibly the whole machine) be
terminated with extreme prejudice.
The trivial sanity check in the vmacache code is *not* such a fatal
error. Recovering from it is absolutely trivial, and using BUG_ON()
just makes it harder to debug for no actual advantage.
To make matters worse, the placement of the BUG_ON() (only if the range
check matched) actually makes it harder to hit the sanity check to begin
with, so _if_ there is a bug (and we just got a report from Srivatsa
Bhat that this can indeed trigger), it is harder to debug not just
because the machine is possibly dead, but because we don't have better
coverage.
BUG_ON() must *die*. Maybe we should add a checkpatch warning for it,
because it is simply just about the worst thing you can ever do if you
hit some "this cannot happen" situation.
Reported-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A race exists between module loading and enabling of function tracer.
CPU 1 CPU 2
----- -----
load_module()
module->state = MODULE_STATE_COMING
register_ftrace_function()
mutex_lock(&ftrace_lock);
ftrace_startup()
update_ftrace_function();
ftrace_arch_code_modify_prepare()
set_all_module_text_rw();
<enables-ftrace>
ftrace_arch_code_modify_post_process()
set_all_module_text_ro();
[ here all module text is set to RO,
including the module that is
loading!! ]
blocking_notifier_call_chain(MODULE_STATE_COMING);
ftrace_init_module()
[ tries to modify code, but it's RO, and fails!
ftrace_bug() is called]
When this race happens, ftrace_bug() will produces a nasty warning and
all of the function tracing features will be disabled until reboot.
The simple solution is to treate module load the same way the core
kernel is treated at boot. To hardcode the ftrace function modification
of converting calls to mcount into nops. This is done in init/main.c
there's no reason it could not be done in load_module(). This gives
a better control of the changes and doesn't tie the state of the
module to its notifiers as much. Ftrace is special, it needs to be
treated as such.
The reason this would work, is that the ftrace_module_init() would be
called while the module is in MODULE_STATE_UNFORMED, which is ignored
by the set_all_module_text_ro() call.
Link: http://lkml.kernel.org/r/1395637826-3312-1-git-send-email-indou.takao@jp.fujitsu.com
Reported-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@vger.kernel.org # 2.6.38+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
This patch fixes this section mismatch:
WARNING: vmlinux.o(.text+0x1efc4): Section mismatch in reference from
the function apm821xx_pciex_init_port_hw() to the function
.init.text:ppc4xx_pciex_wait_on_sdr.isra.9()
The function apm821xx_pciex_init_port_hw() references the function
__init ppc4xx_pciex_wait_on_sdr.isra.9(). This is often because
apm821xx_pciex_init_port_hw lacks a __init annotation or the
annotation of ppc4xx_pciex_wait_on_sdr.isra.9 is wrong.
apm821xx_pciex_init_port_hw is only referenced by a struct in
__initdata, so it should be safe to add __init to
apm821xx_pciex_init_port_hw.
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When the guest cedes the vcpu or the vcpu has no guest to
run it naps. Clear the runlatch bit of the vcpu before
napping to indicate an idle cpu.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The secondary threads in the core are kept offline before launching guests
in kvm on powerpc: "371fefd6f2dc4666:KVM: PPC: Allow book3s_hv guests to use
SMT processor modes."
Hence their runlatch bits are cleared. When the secondary threads are called
in to start a guest, their runlatch bits need to be set to indicate that they
are busy. The primary thread has its runlatch bit set though, but there is no
harm in setting this bit once again. Hence set the runlatch bit for all
threads before they start guest.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Up until now we have been setting the runlatch bits for a busy CPU and
clearing it when a CPU enters idle state. The runlatch bit has thus
been consistent with the utilization of a CPU as long as the CPU is online.
However when a CPU is hotplugged out the runlatch bit is not cleared. It
needs to be cleared to indicate an unused CPU. Hence this patch has the
runlatch bit cleared for an offline CPU just before entering an idle state
and sets it immediately after it exits the idle state.
Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
While testing memory hot-remove, I found following dead lock:
Process #1141 is drmgr, trying to remove some memory, i.e. memory499.
It holds the memory_hotplug_mutex, and blocks when trying to remove file
"online" under dir memory499, in kernfs_drain(), at
wait_event(root->deactivate_waitq,
atomic_read(&kn->active) == KN_DEACTIVATED_BIAS);
Process #1120 is trying to online memory499 by
echo 1 > memory499/online
In .kernfs_fop_write, it uses kernfs_get_active() to increase
&kn->active, thus blocking process #1141. While itself is blocked later
when trying to acquire memory_hotplug_mutex, which is held by process
The backtrace of both processes are shown below:
[<c000000001b18600>] 0xc000000001b18600
[<c000000000015044>] .__switch_to+0x144/0x200
[<c000000000263ca4>] .online_pages+0x74/0x7b0
[<c00000000055b40c>] .memory_subsys_online+0x9c/0x150
[<c00000000053cbe8>] .device_online+0xb8/0x120
[<c00000000053cd04>] .online_store+0xb4/0xc0
[<c000000000538ce4>] .dev_attr_store+0x64/0xa0
[<c00000000030f4ec>] .sysfs_kf_write+0x7c/0xb0
[<c00000000030e574>] .kernfs_fop_write+0x154/0x1e0
[<c000000000268450>] .vfs_write+0xe0/0x260
[<c000000000269144>] .SyS_write+0x64/0x110
[<c000000000009ffc>] syscall_exit+0x0/0x7c
[<c000000001b18600>] 0xc000000001b18600
[<c000000000015044>] .__switch_to+0x144/0x200
[<c00000000030be14>] .__kernfs_remove+0x204/0x300
[<c00000000030d428>] .kernfs_remove_by_name_ns+0x68/0xf0
[<c00000000030fb38>] .sysfs_remove_file_ns+0x38/0x60
[<c000000000539354>] .device_remove_attrs+0x54/0xc0
[<c000000000539fd8>] .device_del+0x158/0x250
[<c00000000053a104>] .device_unregister+0x34/0xa0
[<c00000000055bc14>] .unregister_memory_section+0x164/0x170
[<c00000000024ee18>] .__remove_pages+0x108/0x4c0
[<c00000000004b590>] .arch_remove_memory+0x60/0xc0
[<c00000000026446c>] .remove_memory+0x8c/0xe0
[<c00000000007f9f4>] .pseries_remove_memblock+0xd4/0x160
[<c00000000007fcfc>] .pseries_memory_notifier+0x27c/0x290
[<c0000000008ae6cc>] .notifier_call_chain+0x8c/0x100
[<c0000000000d858c>] .__blocking_notifier_call_chain+0x6c/0xe0
[<c00000000071ddec>] .of_property_notify+0x7c/0xc0
[<c00000000071ed3c>] .of_update_property+0x3c/0x1b0
[<c0000000000756cc>] .ofdt_write+0x3dc/0x740
[<c0000000002f60fc>] .proc_reg_write+0xac/0x110
[<c000000000268450>] .vfs_write+0xe0/0x260
[<c000000000269144>] .SyS_write+0x64/0x110
[<c000000000009ffc>] syscall_exit+0x0/0x7c
This patch uses lock_device_hotplug() to protect remove_memory() called
in pseries_remove_memblock(), which is also stated before function
remove_memory():
* NOTE: The caller must call lock_device_hotplug() to serialize hotplug
* and online/offline operations before this call, as required by
* try_offline_node().
*/
void __ref remove_memory(int nid, u64 start, u64 size)
With this lock held, the other process(#1120 above) trying to online the
memory block will retry the system call when calling
lock_device_hotplug_sysfs(), and finally find No such device error.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
module_init should return 0 or a negative errno.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Bump the boot wrapper BOOT_COMMAND_LINE_SIZE to match the
kernel.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
I've had a report that the current limit is too small for
an automated network based installer. Bump it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have two definitions of COMMAND_LINE_SIZE, one for the kernel
and one for the boot wrapper. I assume this is so the boot
wrapper can be self sufficient and not rely on kernel headers.
Having two defines with the same name is confusing, I just
updated the wrong one when trying to bump it.
Make the boot wrapper define unique by calling it
BOOT_COMMAND_LINE_SIZE.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The catalog version number was changed from a be32 (with proceeding
32bits of padding) to a be64, update the code to treat it as a be64
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
fixup for "powerpc/perf: Add support for the hv gpci (get performance
counter info) interface".
Makes the "not enabled" message less awful (and hidden unless
debugging).
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
fixup for "powerpc/perf: Add support for the hv 24x7 interface"
Makes the "not enabled" message less awful (and hides it in most cases).
Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The if condition check was based on a draft ISA doc. Remove the same.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have two copies of code that creates an OPAL sg list. Consolidate
these into a common set of helpers and fix the endian issues.
The flash interface embedded a version number in the num_entries
field, whereas the dump interface did did not. Since versioning
wasn't added to the flash interface and it is impossible to add
this in a backwards compatible way, just remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fix little endian issues with the OPAL error log code.
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The bitmap in opal_poll_events and opal_handle_interrupt is
big endian, so we need to byteswap it on little endian builds.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We had some duplication of the internal OPAL functions.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Using size_t in our APIs is asking for trouble, especially
when some OPAL calls use size_t pointers.
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On PowerNV platform, we are holding an unnecessary refcount on a pci_dev, which
leads to the pci_dev is not destroyed when hotplugging a pci device.
This patch release the unnecessary refcount.
Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With this patch I was able to update firmware on an LE kernel.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We have a subtle race when sending CPUs back to OPAL on kexec.
We mark them as "in real mode" right before we send them down. Once
we've booted the new kernel, it might try to call opal_reinit_cpus()
to change endianness, and that requires all CPUs to be spinning inside
OPAL.
However there is no synchronization here and we've observed cases
where the returning CPUs hadn't established their new state inside
OPAL before opal_reinit_cpus() is called, causing it to fail.
The proper fix is to actually wait for them to go down all the way
from the kexec'ing kernel.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The size of the sysparam sysfs files is determined from the device tree
at boot. However the buffer is hard coded to 64 bytes. If we encounter a
parameter that is larger than 64, or miss-parse the device tree, the
buffer will overflow when reading or writing to the parameter.
Check it at discovery time, and if the parameter is too large, do not
create a sysfs entry for it.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The sysparam code currently uses the userspace supplied number of
bytes when memcpy()ing in to a local 64-byte buffer.
Limit the maximum number of bytes by the size of the buffer.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The OPAL calls are returning int64_t values, which the sysparam code
stores in an int, and the sysfs callback returns ssize_t. Make code a
easier to read by consistently using ssize_t.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When a sysparam query in OPAL returned a negative value (error code),
sysfs would spew out a decent chunk of memory; almost 64K more than
expected. This was traced to a sign/unsigned mix up in the OPAL sysparam
sysfs code at sys_param_show.
The return value of sys_param_show is a ssize_t, calculated using
return ret ? ret : attr->param_size;
Alan Modra explains:
"attr->param_size" is an unsigned int, "ret" an int, so the overall
expression has type unsigned int. Result is that ret is cast to
unsigned int before being cast to ssize_t.
Instead of using the ternary operator, set ret to the param_size if an
error is not detected. The same bug exists in the sysfs write callback;
this patch fixes it in the same way.
A note on debugging this next time: on my system gcc will warn about
this if compiled with -Wsign-compare, which is not enabled by -Wall,
only -Wextra.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
commit 41dd03a9 may cause Oops in rtas_stop_self().
The reason is that the rtas_args was moved into stack space. For a box
with more that 4GB RAM, the stack could easily be outside 32bit range,
but RTAS is 32bit.
So the patch moves rtas_args away from stack by adding static before
it.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit aac416fc38 (lkdtm: flush icache and report actions) calls
flush_icache_range from a module. It's exported on most architectures
that implement it, but not on powerpc. This patch exports it to fix
the module link failure.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The asm-generic, big-endian version of zero_bytemask creates a mask of
bytes preceding the first zero-byte by left shifting ~0ul based on the
position of the first zero byte.
Unfortunately, if the first (top) byte is zero, the output of
prep_zero_mask has only the top bit set, resulting in undefined C
behaviour as we shift left by an amount equal to the width of the type.
As it happens, GCC doesn't manage to spot this through the call to fls(),
but the issue remains if architectures choose to implement their shift
instructions differently.
An example would be arch/arm/ (AArch32), where LSL Rd, Rn, #32 results
in Rd == 0x0, whilst on arch/arm64 (AArch64) LSL Xd, Xn, #64 results in
Xd == Xn.
Rather than check explicitly for the problematic shift, this patch adds
an extra shift by 1, replacing fls with __fls. Since zero_bytemask is
never called with a zero argument (has_zero() is used to check the data
first), we don't need to worry about calling __fls(0), which is
undefined.
Cc: <stable@vger.kernel.org>
Cc: Victor Kamensky <victor.kamensky@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This merges the patch to fix possible loss of dirty bit on munmap() or
madvice(DONTNEED). If there are concurrent writers on other CPU's that
have the unmapped/unneeded page in their TLBs, their writes to the page
could possibly get lost if a third CPU raced with the TLB flush and did
a page_mkclean() before the page was fully written.
Admittedly, if you unmap() or madvice(DONTNEED) an area _while_ another
thread is still busy writing to it, you deserve all the lost writes you
could get. But we kernel people hold ourselves to higher quality
standards than "crazy people deserve to lose", because, well, we've seen
people do all kinds of crazy things.
So let's get it right, just because we can, and we don't have to worry
about it.
* safe-dirty-tlb-flush:
mm: split 'tlb_flush_mmu()' into tlb flushing and memory freeing parts
Pull btrfs fixes from Chris Mason.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: limit the path size in send to PATH_MAX
Btrfs: correctly set profile flags on seqlock retry
Btrfs: use correct key when repeating search for extent item
Btrfs: fix inode caching vs tree log
Btrfs: fix possible memory leaks in open_ctree()
Btrfs: avoid triggering bug_on() when we fail to start inode caching task
Btrfs: move btrfs_{set,clear}_and_info() to ctree.h
btrfs: replace error code from btrfs_drop_extents
btrfs: Change the hole range to a more accurate value.
btrfs: fix use-after-free in mount_subvol()
Pull arm fixes from Russell King:
"A number of fixes for the PJ4/iwmmxt changes which arm-soc forced me
to take during the merge window. This stuff should have been better
tested and sorted out *before* the merge window"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8042/1: iwmmxt: allow to build iWMMXt on Marvell PJ4B
ARM: 8041/1: pj4: fix cpu_is_pj4 check
ARM: 8040/1: pj4: properly detect existence of iWMMXt coprocessor
ARM: 8039/1: pj4: enable iWMMXt only if CONFIG_IWMMXT is set
ARM: 8038/1: iwmmxt: explicitly check for supported architectures
Pull irq fixes from Thomas Gleixner:
"A slighlty large fix for a subtle issue in the CPU hotplug code of
certain ARM SoCs, where the not yet online cpu needs to setup the cpu
local timer and needs to set the interrupt affinity to itself.
Setting interrupt affinity to a not online cpu is prohibited and
therefor the timer interrupt ends up on the wrong cpu, which leads to
nasty complications.
The SoC folks tried to hack around that in the SoC code in some more
than nasty ways. The proper solution is to have a way to enforce the
affinity setting to a not online cpu. The core patch to the genirq
code provides that facility and the follow up patches make use of it
in the GIC interrupt controller and the exynos timer driver.
The change to the core code has no implications to existing users,
except for the rename of the locked function and therefor the
necessary fixup in mips/cavium. Aside of that, no runtime impact is
possible, as none of the existing interrupt chips implements anything
which depends on the force argument of the irq_set_affinity()
callback"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: Exynos_mct: Register clock event after request_irq()
clocksource: Exynos_mct: Use irq_force_affinity() in cpu bringup
irqchip: Gic: Support forced affinity setting
genirq: Allow forcing cpu affinity of interrupts