Pull kbuild changes from Michal Marek:
- fix make -s detection with make-4.0
- fix for scripts/setlocalversion when the kernel repository is a
submodule
- do not hardcode ';' in macros that expand to assembler code, as some
architectures' assemblers use a different character for newline
- Fix passing --gdwarf-2 to the assembler
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
frv: Remove redundant debugging info flag
mn10300: Remove redundant debugging info flag
kbuild: Fix debugging info generation for .S files
arch: use ASM_NL instead of ';' for assembler new line character in the macro
kbuild: Fix silent builds with make-4
Fix detectition of kernel git repository in setlocalversion script [take #2]
For some assemblers, they use another character as newline in a macro
(e.g. arc uses '`'), so for generic assembly code, need use ASM_NL (a
macro) instead of ';' for it.
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Pull networking updates from David Miller:
1) BPF debugger and asm tool by Daniel Borkmann.
2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann.
3) Correct reciprocal_divide and update users, from Hannes Frederic
Sowa and Daniel Borkmann.
4) Currently we only have a "set" operation for the hw timestamp socket
ioctl, add a "get" operation to match. From Ben Hutchings.
5) Add better trace events for debugging driver datapath problems, also
from Ben Hutchings.
6) Implement auto corking in TCP, from Eric Dumazet. Basically, if we
have a small send and a previous packet is already in the qdisc or
device queue, defer until TX completion or we get more data.
7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko.
8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel
Borkmann.
9) Share IP header compression code between Bluetooth and IEEE802154
layers, from Jukka Rissanen.
10) Fix ipv6 router reachability probing, from Jiri Benc.
11) Allow packets to be captured on macvtap devices, from Vlad Yasevich.
12) Support tunneling in GRO layer, from Jerry Chu.
13) Allow bonding to be configured fully using netlink, from Scott
Feldman.
14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can
already get the TCI. From Atzm Watanabe.
15) New "Heavy Hitter" qdisc, from Terry Lam.
16) Significantly improve the IPSEC support in pktgen, from Fan Du.
17) Allow ipv4 tunnels to cache routes, just like sockets. From Tom
Herbert.
18) Add Proportional Integral Enhanced packet scheduler, from Vijay
Subramanian.
19) Allow openvswitch to mmap'd netlink, from Thomas Graf.
20) Key TCP metrics blobs also by source address, not just destination
address. From Christoph Paasch.
21) Support 10G in generic phylib. From Andy Fleming.
22) Try to short-circuit GRO flow compares using device provided RX
hash, if provided. From Tom Herbert.
The wireless and netfilter folks have been busy little bees too.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits)
net/cxgb4: Fix referencing freed adapter
ipv6: reallocate addrconf router for ipv6 address when lo device up
fib_frontend: fix possible NULL pointer dereference
rtnetlink: remove IFLA_BOND_SLAVE definition
rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info
qlcnic: update version to 5.3.55
qlcnic: Enhance logic to calculate msix vectors.
qlcnic: Refactor interrupt coalescing code for all adapters.
qlcnic: Update poll controller code path
qlcnic: Interrupt code cleanup
qlcnic: Enhance Tx timeout debugging.
qlcnic: Use bool for rx_mac_learn.
bonding: fix u64 division
rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC
sfc: Use the correct maximum TX DMA ring size for SFC9100
Add Shradha Shah as the sfc driver maintainer.
net/vxlan: Share RX skb de-marking and checksum checks with ovs
tulip: cleanup by using ARRAY_SIZE()
ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called
net/cxgb4: Don't retrieve stats during recovery
...
Remove an outdated reference to "most personal computers" having only one
CPU, and change the use of "singleprocessor" and "single processor" in
CONFIG_SMP's documentation to "uniprocessor" across all arches where that
documentation is present.
Signed-off-by: Robert Graffham <psquid@psquid.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Moved cmdline copy from asm to "C" - allows for more robust checking
of pointer validity etc.
* Remove the Kconfig option to do so, base it on a runtime value passed
by u-boot
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Checkin:
93ea02bb84 arch: Clean up asm/barrier.h implementations using asm-generic/barrier.h
... unfortunately left some Kbuild files out of order, which caused
unnecessary merge conflicts, in particular with checkin:
e3fec2f74f lib: Add missing arch generic-y entries for asm-generic/hash.h
Put them back in order to make the upcoming merges cleaner.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20140114164420.d296fbcc4be3a5f126c86069@canb.auug.org.au
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJS0miqAAoJEHm+PkMAQRiGbfgIAJSWEfo8ludknhPcHJabBtxu
75SQAKJlL3sBVnxEc58Rtt8gsKYQIrm4IY5Slunklsn04RxuDUIQMgFoAYR5gQwz
+Myqkw/HOqDe5VStGxtLYpWnfglxVwGDCd7ISfL9AOVy5adMWBxh4Tv+qqQc7aIZ
eF7dy+DD+C6Q3Z5OoV8s0FZDxse29vOf17Nki7+7t8WMqyegYwjoOqNeqocGKsPi
eHLrJgTl4T6jB4l9LKKC154DSKjKOTSwZMWgwK8mToyNLT/ufCiKgXloIjEvZZcY
VVKUtncdHiTf+iqVojgpGBzOEeB5DM83iiapFeDiJg8C9yBzvT8lBtA9aPb5Wgw=
=lEeV
-----END PGP SIGNATURE-----
Merge tag 'v3.13-rc8' into core/locking
Refresh the tree with the latest fixes, before applying new changes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We're going to be adding a few new barrier primitives, and in order to
avoid endless duplication make more agressive use of
asm-generic/barrier.h.
Change the asm-generic/barrier.h such that it allows partial barrier
definitions and fills out the rest with defaults.
There are a few architectures (m32r, m68k) that could probably
do away with their barrier.h file entirely but are kept for now due to
their unconventional nop() implementation.
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20131213150640.846368594@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Move the barriers functions that depend on the atomic implementation
into the atomic implementation.
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [for arch/arc bits]
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20131213150640.786183683@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Conflicts:
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
net/ipv6/ip6_tunnel.c
net/ipv6/ip6_vti.c
ipv6 tunnel statistic bug fixes conflicting with consolidation into
generic sw per-cpu net stats.
qlogic conflict between queue counting bug fix and the addition
of multiple MAC address support.
Signed-off-by: David S. Miller <davem@davemloft.net>
* Don't send an IPI if receiver already has a pending IPI.
Atomically piggyback the new msg with pending msg.
* IPI receiver looping on xchg() not required
References: https://lkml.org/lkml/2013/11/25/232
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The interface is confusing, it feels like we are getting "sender" info,
whereas it is the "receiver", which can very well be retrived by
smp_processor_id(), if need be.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The current IPI sending callstack needlessly involves cpumask.
arch_send_call_function_single_ipi(cpu) / smp_send_reschedule(cpu)
ipi_send_msg(cpumask_of(cpu)) --> [cpu to cpumask]
plat_smp_ops.ipi_send(callmap)
for_each_cpu(callmap) --> [cpuask to cpu]
do_plat_specific_ipi_PER_CPU
Given that current backends are not capable of 1:N IPIs, lets simplify
the interface for now, by keeping "a" cpu all along.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Commit 97bc386fc1 "ARC: Add guard macro to uapi/asm/unistd.h"
inhibited multiple inclusion of ARCH unistd.h. This however hosed the system
since Generic syscall table generator relies on it being included twice,
and in lack-of an empty table was emitted by C preprocessor.
Fix that by allowing one exception to rule for the special case (just
like Xtensa)
Suggested-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Conflicts:
drivers/net/ethernet/intel/i40e/i40e_main.c
drivers/net/macvtap.c
Both minor merge hassles, simple overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Chen originally proposed this as "right thing to do" however I
actually ran into this when building perf tools. Some of the utils
include unistd.h as well as linux/unistd.h. Since -I includes kernel
headers too, we end up including the ARC unistd.h twice, leading to
redefinition nwarnings.
------------------>8-------------------
CC bench/sched-pipe.o
In file included from ~/kernel/arch/arc/include/uapi/asm/unistd.h:21:0,
from ~/kernel/include/uapi/linux/unistd.h:7,
from bench/sched-pipe.c:24:
~/kernel/include/uapi/asm-generic/unistd.h:889:0: error: "__NR_fcntl64"
redefined [-Werror]
#define __NR_fcntl64 __NR3264_fcntl
^
In file included from
~/gnu/arc-linux-uclibc/sys-include/sys/syscall.h:24:0,
from bench/../perf.h:112,
from bench/sched-pipe.c:13:
~/gnu/arc-linux-uclibc/include/bits/sysnum.h:761:0: note: this is the
location of the previous definition
------------------>8-------------------
Verified that make headers_install works fine with this.
Suggested-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: David Howells <dhowells@redhat.com>
* Support for Perf from Mischa
* Enabling GPIO/Pinctrl drivers for Abilis TB10x platform
* New defconfig for buildroot
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJSihYhAAoJEGnX8d3iisJek9AP/3P/jPdDLOs/keuWHYUFkdI7
j4DOh7gb3hBibIGj4LXvXcZbJdaMakctiVuesyVCybdqMfY1SFCYw3QG5P1az6Yc
UP/h4dQrpxVGmD6pEBCBeqxZ+iL3jpJm38mwQ7/9tRgXMpU8kP5FbgAnRJ87PlX/
yujb5Hek69S/l/DZGxTgO9Ad7JQMWFjoegFgkG/rthhBF2RCLCOJZUWj3JMZArE9
Ab8Romrme/kbk2Ub22GQLhn9ks+SdDf1SkBbLh4tmRRasyCiD9cJDNtH4joGjin5
oOPWDU8fJCc+pf/DuAFDzvSap5T/JW6sUzOpUEUbwFXC7Ayl4qGDi9rfR90LV6NV
A2pqxpOPEL+X1HuEerrM14bwTMxWIZdECC3T7siSJFnvw75I78+gZ/WtqqerSdbl
YaKitCYzpvZP9iknVRfZ5/Bdpl0qNfAzQJDDg8GxcKPRUiJstKD4Rzp1F6XSXa1a
aHv+i8diS6kPhJSyYcQrNH5L2xWSe0OKR9Yblc6+J7TA9Hctg6LVNUAH1sMn2P8B
8EpC0LU6I0qLXBVr4A/uMqsAAE5QBANqSCh9Xl03MFS/4Ti/abQXOTrt0d7Jma2+
PAKQVImGV9O+7HePVru5fbarz8jg7td+8Bk7Zh362iBD99M1h2mS2X6FbytM4UO2
rjg5zl91IdT8u/hOLchz
=zVRP
-----END PGP SIGNATURE-----
Merge tag 'arc-v3.13-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull second set of ARC changes from Vineet Gupta:
- Support for Perf from Mischa
- Enabling GPIO/Pinctrl drivers for Abilis TB10x platform
- New defconfig for buildroot
* tag 'arc-v3.13-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: [plat-arcfpga] Add defconfig without initramfs location
ARC: perf: ARC 700 PMU doesn't support sampling events
ARC: Add documentation on DT binding for ARC700 PMU
ARC: Add perf support for ARC700 cores
ARC: [TB10x] Updates for GPIO and pinctrl
Pull irq cleanups from Ingo Molnar:
"This is a multi-arch cleanup series from Thomas Gleixner, which we
kept to near the end of the merge window, to not interfere with
architecture updates.
This series (motivated by the -rt kernel) unifies more aspects of IRQ
handling and generalizes PREEMPT_ACTIVE"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
preempt: Make PREEMPT_ACTIVE generic
sparc: Use preempt_schedule_irq
ia64: Use preempt_schedule_irq
m32r: Use preempt_schedule_irq
hardirq: Make hardirq bits generic
m68k: Simplify low level interrupt handling code
genirq: Prevent spurious detection for unconditionally polled interrupts
Pull trivial tree updates from Jiri Kosina:
"Usual earth-shaking, news-breaking, rocket science pile from
trivial.git"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
doc: usb: Fix typo in Documentation/usb/gadget_configs.txt
doc: add missing files to timers/00-INDEX
timekeeping: Fix some trivial typos in comments
mm: Fix some trivial typos in comments
irq: Fix some trivial typos in comments
NUMA: fix typos in Kconfig help text
mm: update 00-INDEX
doc: Documentation/DMA-attributes.txt fix typo
DRM: comment: `halve' -> `half'
Docs: Kconfig: `devlopers' -> `developers'
doc: typo on word accounting in kprobes.c in mutliple architectures
treewide: fix "usefull" typo
treewide: fix "distingush" typo
mm/Kconfig: Grammar s/an/a/
kexec: Typo s/the/then/
Documentation/kvm: Update cpuid documentation for steal time and pv eoi
treewide: Fix common typo in "identify"
__page_to_pfn: Fix typo in comment
Correct some typos for word frequency
clk: fixed-factor: Fix a trivial typo
...
This is useful if you want to build a kernel without a ramfs, or
if you want to test the kernel build without having an initramfs
available at the hardcoded location.
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The ARC 700 does not have an interrupt associated with it, and as
such it cannot trigger when a counter overflows. As the counters are
48 bit, it will usually take at least 100 days before a counter
overflows, so for mere counting of events, there is no problem.
Sampling is not supported though.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
We've switched over every architecture that supports SMP to it, so
remove the new useless config variable.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
usual for this cycle with lots of clean-up.
- Cross arch clean-up and consolidation of early DT scanning code.
- Clean-up and removal of arch prom.h headers. Makes arch specific
prom.h optional on all but Sparc.
- Addition of interrupts-extended property for devices connected to
multiple interrupt controllers.
- Refactoring of DT interrupt parsing code in preparation for deferred
probe of interrupts.
- ARM cpu and cpu topology bindings documentation.
- Various DT vendor binding documentation updates.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJSgPQ4AAoJEMhvYp4jgsXif28H/1WkrXq5+lCFQZF8nbYdE2h0
R8PsfiJJmAl6/wFgQTsRel+ScMk2hiP08uTyqf2RLnB1v87gCF7MKVaLOdONfUDi
huXbcQGWCmZv0tbBIklxJe3+X3FIJch4gnyUvPudD1m8a0R0LxWXH/NhdTSFyB20
PNjhN/IzoN40X1PSAhfB5ndWnoxXBoehV/IVHVDU42vkPVbVTyGAw5qJzHW8CLyN
2oGTOalOO4ffQ7dIkBEQfj0mrgGcODToPdDvUQyyGZjYK2FY2sGrjyquir6SDcNa
Q4gwatHTu0ygXpyphjtQf5tc3ZCejJ/F0s3olOAS1ahKGfe01fehtwPRROQnCK8=
=GCbY
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
"DeviceTree updates for 3.13. This is a bit larger pull request than
usual for this cycle with lots of clean-up.
- Cross arch clean-up and consolidation of early DT scanning code.
- Clean-up and removal of arch prom.h headers. Makes arch specific
prom.h optional on all but Sparc.
- Addition of interrupts-extended property for devices connected to
multiple interrupt controllers.
- Refactoring of DT interrupt parsing code in preparation for
deferred probe of interrupts.
- ARM cpu and cpu topology bindings documentation.
- Various DT vendor binding documentation updates"
* tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (82 commits)
powerpc: add missing explicit OF includes for ppc
dt/irq: add empty of_irq_count for !OF_IRQ
dt: disable self-tests for !OF_IRQ
of: irq: Fix interrupt-map entry matching
MIPS: Netlogic: replace early_init_devtree() call
of: Add Panasonic Corporation vendor prefix
of: Add Chunghwa Picture Tubes Ltd. vendor prefix
of: Add AU Optronics Corporation vendor prefix
of/irq: Fix potential buffer overflow
of/irq: Fix bug in interrupt parsing refactor.
of: set dma_mask to point to coherent_dma_mask
of: add vendor prefix for PHYTEC Messtechnik GmbH
DT: sort vendor-prefixes.txt
of: Add vendor prefix for Cadence
of: Add empty for_each_available_child_of_node() macro definition
arm/versatile: Fix versatile irq specifications.
of/irq: create interrupts-extended property
microblaze/pci: Drop PowerPC-ism from irq parsing
of/irq: Create of_irq_parse_and_map_pci() to consolidate arch code.
of/irq: Use irq_of_parse_and_map()
...
This adds basic perf support for ARC700 cores. Most PERF_COUNT_HW* events
are supported now.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Pull scheduler changes from Ingo Molnar:
"The main changes in this cycle are:
- (much) improved CONFIG_NUMA_BALANCING support from Mel Gorman, Rik
van Riel, Peter Zijlstra et al. Yay!
- optimize preemption counter handling: merge the NEED_RESCHED flag
into the preempt_count variable, by Peter Zijlstra.
- wait.h fixes and code reorganization from Peter Zijlstra
- cfs_bandwidth fixes from Ben Segall
- SMP load-balancer cleanups from Peter Zijstra
- idle balancer improvements from Jason Low
- other fixes and cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
ftrace, sched: Add TRACE_FLAG_PREEMPT_RESCHED
stop_machine: Fix race between stop_two_cpus() and stop_cpus()
sched: Remove unnecessary iteration over sched domains to update nr_busy_cpus
sched: Fix asymmetric scheduling for POWER7
sched: Move completion code from core.c to completion.c
sched: Move wait code from core.c to wait.c
sched: Move wait.c into kernel/sched/
sched/wait: Fix __wait_event_interruptible_lock_irq_timeout()
sched: Avoid throttle_cfs_rq() racing with period_timer stopping
sched: Guarantee new group-entities always have weight
sched: Fix hrtimer_cancel()/rq->lock deadlock
sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining
sched: Fix race on toggling cfs_bandwidth_used
sched: Remove extra put_online_cpus() inside sched_setaffinity()
sched/rt: Fix task_tick_rt() comment
sched/wait: Fix build breakage
sched/wait: Introduce prepare_to_wait_event()
sched/wait: Add ___wait_cond_timeout() to wait_event*_timeout() too
sched: Remove get_online_cpus() usage
sched: Fix race in migrate_swap_stop()
...
Device tree and Kconfig updates for GPIO and pinctrl drivers.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Commit 9a46ad6d6d "smp: make smp_call_function_many() use logic
similar to smp_call_function_single()" has unified the way to handle
single and multiple cross-CPU function calls. Now only one interrupt
is needed for architecture specific code to support generic SMP function
call interfaces, so kill the redundant single function call interrupt.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Jiang Liu <liuj97@gmail.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ST.as only takes S9 (255) for offset. This was going out of range when
accessing a task_struct field with 4k NR_CPUS (due to 128b of coumaks
itself in there).
Workaround by using an intermediate register to do the address scaling.
There is some duplication of fix for ctx_sw.c and ctx_sw_asm.S however
given that C version will go away soon I'm not bothering to factor out
the common code.
Reported-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
- Add mm_cpumask setting (aggregating only, unlike some other arches)
used to restrict the TLB flush cross-calling
- cross-calling versions of TLB flush routines (thanks to Noam)
Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Need export symbol for it, or can not pass compiling, the related error
with allmodconfig:
MODPOST 2994 modules
ERROR: "pm_power_off" [drivers/mfd/retu-mfd.ko] undefined!
ERROR: "pm_power_off" [drivers/char/ipmi/ipmi_poweroff.ko] undefined!
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Need export its symbol just like other architectures done, or can not
pass compiling with allmodconfig, the related error:
MODPOST 2994 modules
ERROR: "save_stack_trace" [kernel/backtracetest.ko] undefined!
ERROR: "save_stack_trace" [drivers/md/persistent-data/dm-persistent-data.ko] undefined!
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
get_hw_config_num_irq() may be called by normal iss_model_init_smp()
which is a function pointer for 'init_smp' which may be called by
first_lines_of_secondary() which also need be normal too.
The related warning (with allmodconfig):
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x5814): Section mismatch in reference from the function iss_model_init_smp() to the function .init.text:get_hw_config_num_irq()
The function iss_model_init_smp() references
the function __init get_hw_config_num_irq().
This is often because iss_model_init_smp lacks a __init
annotation or the annotation of get_hw_config_num_irq is wrong.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
first_lines_of_secondary() is a '__init' function, but it may be called
by __cpu_up() by _cpu_up() by cpu_up() which is a normal export symbol
function. So recommend to remove '__init'.
The related warning (with allmodconfig):
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x315c): Section mismatch in reference from the function __cpu_up() to the function .init.text:first_lines_of_secondary()
The function __cpu_up() references
the function __init first_lines_of_secondary().
This is often because __cpu_up lacks a __init
annotation or the annotation of first_lines_of_secondary is wrong.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
They haven't '__init' in definition, but has '__init' in declaration.
And normal function start_kernel_secondary() may call setup_processor()
which will call arc_init_IRQ().
So need remove '__init' for both of them. The related warning (with
allmodconfig):
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x3084): Section mismatch in reference from the function start_kernel_secondary() to the function .init.text:setup_processor()
The function start_kernel_secondary() references
the function __init setup_processor().
This is often because start_kernel_secondary lacks a __init
annotation or the annotation of setup_processor is wrong.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
arc supports kgdb, but need update -- add function kgdb_roundup_cpus(),
or can not pass compiling. At present, add the simple generic one just
like other architectures(e.g. tile, mips ...).
The related error (with allmodconfig):
kernel/built-in.o: In function `kgdb_cpu_enter':
kernel/debug/debug_core.c:580: undefined reference to `kgdb_roundup_cpus'
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
------------------>8----------------------
arch/arc/mm/tlb.c: In function ‘do_tlb_overlap_fault’:
arch/arc/mm/tlb.c:688:13: warning: array subscript is above array bounds
[-Warray-bounds]
(pd0[n] & PAGE_MASK)) {
^
------------------>8----------------------
While at it, remove the usless last iteration of outer loop when reading
a TLB SET for duplicate entries.
Suggested-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Lockdep required a small fix to stacktrace API which was incorrectly
unwindign out of __switch_to for the current call frame.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
In case bootloader has changed the priority of one/more IRQ lines
Reported-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
switch the args (address, pt_regs) to match with all the other "C"
exception handlers.
This removes the awkwardness in EV_ProtV for page fault vs. unaligned
access.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Line op needs vaddr (indexing) and paddr (tag match). For page sized
flushes (V-P const), each line op will need a different index, but the
tag bits wil remain constant, hence paddr can be setup once outside the
loop.
This improves select LMBench numbers for Aliasing dcache where we have
more "preventive" cache flushing.
Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.11-rc7- Linux 3.11.0- 80 4.66 8.88 69.7 112. 268. 8.60 28.0 3489 13.K 27.K # Non alias ARC700
3.11-rc7- Linux 3.11.0- 80 4.64 8.51 68.6 98.5 271. 8.58 28.1 4160 15.K 32.K # Aliasing
3.11-rc7- Linux 3.11.0- 80 4.64 8.51 69.8 99.4 270. 8.73 27.5 3880 15.K 31.K # PTAG loop Inv
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
With Line length being constant now, we can fold the 2 helpers into 1.
This allows applying any optimizations (forthcoming) to single place.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARC dcache supports 3 ops - Inv, Flush, Flush-n-Inv.
The programming model however provides 2 commands FLUSH, INV.
INV will either discard or flush-n-discard (based on DT_CTRL bit)
The leaf helper __dc_line_loop() used to take the AUX register
(corresponding to the 2 commands). Now we push that to within the
helper, paving way for code consolidations to follow.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
__get_cpu_var() is used for multiple purposes in the kernel source. One of them is
address calculation via the form &__get_cpu_var(x). This calculates the address for
the instance of the percpu variable of the current processor based on an offset.
Other use cases are for storing and retrieving data from the current processors percpu area.
__get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment.
__get_cpu_var() is defined as :
#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
__get_cpu_var() always only does an address determination. However, store and retrieve operations
could use a segment prefix (or global register on other platforms) to avoid the address calculation.
this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use
optimized assembly code to read and write per cpu variables.
This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr()
or into a use of this_cpu operations that use the offset. Thereby address calcualtions are avoided
and less registers are used when code is generated.
At the end of the patchset all uses of __get_cpu_var have been removed so the macro is removed too.
The patchset includes passes over all arches as well. Once these operations are used throughout then
specialized macros can be defined in non -x86 arches as well in order to optimize per cpu access by
f.e. using a global register that may be set to the per cpu base.
Transformations done to __get_cpu_var()
1. Determine the address of the percpu instance of the current processor.
DEFINE_PER_CPU(int, y);
int *x = &__get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(&y);
2. Same as #1 but this time an array structure is involved.
DEFINE_PER_CPU(int, y[20]);
int *x = __get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(y);
3. Retrieve the content of the current processors instance of a per cpu variable.
DEFINE_PER_CPU(int, u);
int x = __get_cpu_var(y)
Converts to
int x = __this_cpu_read(y);
4. Retrieve the content of a percpu struct
DEFINE_PER_CPU(struct mystruct, y);
struct mystruct x = __get_cpu_var(y);
Converts to
memcpy(this_cpu_ptr(&x), y, sizeof(x));
5. Assignment to a per cpu variable
DEFINE_PER_CPU(int, y)
__get_cpu_var(y) = x;
Converts to
this_cpu_write(y, x);
6. Increment/Decrement etc of a per cpu variable
DEFINE_PER_CPU(int, y);
__get_cpu_var(y)++
Converts to
this_cpu_inc(y)
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
A vmalloc fault needs to sync up PGD/PTE entry from init_mm to current
task's "active_mm". ARC vmalloc fault handler however was using mm.
A vmalloc fault for non user task context (actually pre-userland, from
init thread's open for /dev/console) caused the handler to deref NULL mm
(for mm->pgd)
The reasons it worked so far is amazing:
1. By default (!SMP), vmalloc fault handler uses a cached value of PGD.
In SMP that MMU register is repurposed hence need for mm pointer deref.
2. In pre-3.12 SMP kernel, the problem triggering vmalloc didn't exist in
pre-userland code path - it was introduced with commit 20bafb3d23
"n_tty: Move buffers into n_tty_data"
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: stable@vger.kernel.org #3.10 and 3.11
Cc: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Resolve cherry-picking conflicts:
Conflicts:
mm/huge_memory.c
mm/memory.c
mm/mprotect.c
See this upstream merge commit for more details:
52469b4fcd Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Signed-off-by: Ingo Molnar <mingo@kernel.org>
ARCompact TRAP_S insn used for breakpoints, commits before exception is
taken (updating architectural PC). So ptregs->ret contains next-PC and
not the breakpoint PC itself. This is different from other restartable
exceptions such as TLB Miss where ptregs->ret has exact faulting PC.
gdb needs to know exact-PC hence ARC ptrace GETREGSET provides for
@stop_pc which returns ptregs->ret vs. EFA depending on the
situation.
However, writing stop_pc (SETREGSET request), which updates ptregs->ret
doesn't makes sense stop_pc doesn't always correspond to that reg as
described above.
This was not an issue so far since user_regs->ret / user_regs->stop_pc
had same value and both writing to ptregs->ret was OK, needless, but NOT
broken, hence not observed.
With gdb "jump", they diverge, and user_regs->ret updating ptregs is
overwritten immediately with stop_pc, which this patch fixes.
Reported-by: Anton Kolesov <akolesov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Now that prom.h is optional, all the empty prom.h headers can be removed.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
HAVE_ARCH_DEVTREE_FIXUPS appears to always be needed except for sparc,
but it is only used for /proc/device-teee and sparc does not enable
/proc/device-tree. So this option is redundant. Remove the option and
always enable it. This has the side effect of fixing /proc/device-tree
on arches such as arm64 which failed to define this option.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Remove unnecessary prom.h includes in preparation to remove prom.h.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Convert arc to use the common of_flat_dt_match_machine function.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
All arches do essentially the same thing now for
early_init_dt_setup_initrd_arch, so it can now be removed.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
Use the common unflatten_and_copy_device_tree to copy the built-in FDT
out of init section.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Grant Likely <grant.likely@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQEcBAABAgAGBQJSUc9zAAoJEHm+PkMAQRiG9DMH/AtpuAF6LlMRPjrCeuJQ1pyh
T0IUO+CsLKO6qtM5IyweP8V6zaasNjIuW1+B6IwVIl8aOrM+M7CwRiKvpey26ldM
I8G2ron7hqSOSQqSQs20jN2yGAqQGpYIbTmpdGLAjQ350NNNvEKthbP5SZR5PAmE
UuIx5OGEkaOyZXvCZJXU9AZkCxbihlMSt2zFVxybq2pwnGezRUYgCigE81aeyE0I
QLwzzMVdkCxtZEpkdJMpLILAz22jN4RoVDbXRa2XC7dA9I2PEEXI9CcLzqCsx2Ii
8eYS+no2K5N2rrpER7JFUB2B/2X8FaVDE+aJBCkfbtwaYTV9UYLq3a/sKVpo1Cs=
=xSFJ
-----END PGP SIGNATURE-----
Merge tag 'v3.12-rc4' into sched/core
Merge Linux v3.12-rc4 to fix a conflict and also to refresh the tree
before applying more scheduler patches.
Conflicts:
arch/avr32/include/asm/Kbuild
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Previously, when a signal was registered with SA_SIGINFO, parameters 2
and 3 of the signal handler were written to registers r1 and r2 before
the register set was saved. This led to corruption of these two
registers after returning from the signal handler (the wrong values were
restored).
With this patch, registers are now saved before any parameters are
passed, thus maintaining the processor state from before signal entry.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
clockevents_config_and_register is more clever and correct than doing it
by hand; so use it.
[vgupta: fixed build failure due to missing ; in patch]
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Some ARC SMP systems lack native atomic R-M-W (LLOCK/SCOND) insns and
can only use atomic EX insn (reg with mem) to build higher level R-M-W
primitives. This includes a SystemC based SMP simulation model.
So rwlocks need to use a protecting spinlock for atomic cmp-n-exchange
operation to update reader(s)/writer count.
The spinlock operation itself looks as follows:
mov reg, 1 ; 1=locked, 0=unlocked
retry:
EX reg, [lock] ; load existing, store 1, atomically
BREQ reg, 1, rety ; if already locked, retry
In single-threaded simulation, SystemC alternates between the 2 cores
with "N" insn each based scheduling. Additionally for insn with global
side effect, such as EX writing to shared mem, a core switch is
enforced too.
Given that, 2 cores doing a repeated EX on same location, Linux often
got into a livelock e.g. when both cores were fiddling with tasklist
lock (gdbserver / hackbench) for read/write respectively as the
sequence diagram below shows:
core1 core2
-------- --------
1. spin lock [EX r=0, w=1] - LOCKED
2. rwlock(Read) - LOCKED
3. spin unlock [ST 0] - UNLOCKED
spin lock [EX r=0,w=1] - LOCKED
-- resched core 1----
5. spin lock [EX r=1] - ALREADY-LOCKED
-- resched core 2----
6. rwlock(Write) - READER-LOCKED
7. spin unlock [ST 0]
8. rwlock failed, retry again
9. spin lock [EX r=0, w=1]
-- resched core 1----
10 spinlock locked in #9, retry #5
11. spin lock [EX gets 1]
-- resched core 2----
...
...
The fix was to unlock using the EX insn too (step 7), to trigger another
SystemC scheduling pass which would let core1 proceed, eliding the
livelock.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Anton reported
| LTP tests syscalls/process_vm_readv01 and process_vm_writev01 fail
| similarly in one testcase test_iov_invalid -> lvec->iov_base.
| Testcase expects errno EFAULT and return code -1,
| but it gets return code 1 and ERRNO is 0 what means success.
Essentially test case was passing a pointer of -1 which access_ok()
was not catching. It was doing [@addr + @sz <= TASK_SIZE] which would
pass for @addr == -1
Fixed that by rewriting as [@addr <= TASK_SIZE - @sz]
Reported-by: Anton Kolesov <Anton.Kolesov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
If a load or store is the last instruction in a zero-overhead-loop, and
it's misaligned, the loop would execute only once.
This fixes that problem.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
In order to prepare to per-arch implementations of preempt_count move
the required bits into an asm-generic header and use this for all
archs.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
After the last architecture switched to generic hard irqs the config
options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code
for !CONFIG_GENERIC_HARDIRQS can be removed.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Merge more patches from Andrew Morton:
"The rest of MM. Plus one misc cleanup"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (35 commits)
mm/Kconfig: add MMU dependency for MIGRATION.
kernel: replace strict_strto*() with kstrto*()
mm, thp: count thp_fault_fallback anytime thp fault fails
thp: consolidate code between handle_mm_fault() and do_huge_pmd_anonymous_page()
thp: do_huge_pmd_anonymous_page() cleanup
thp: move maybe_pmd_mkwrite() out of mk_huge_pmd()
mm: cleanup add_to_page_cache_locked()
thp: account anon transparent huge pages into NR_ANON_PAGES
truncate: drop 'oldsize' truncate_pagecache() parameter
mm: make lru_add_drain_all() selective
memcg: document cgroup dirty/writeback memory statistics
memcg: add per cgroup writeback pages accounting
memcg: check for proper lock held in mem_cgroup_update_page_stat
memcg: remove MEMCG_NR_FILE_MAPPED
memcg: reduce function dereference
memcg: avoid overflow caused by PAGE_ALIGN
memcg: rename RESOURCE_MAX to RES_COUNTER_MAX
memcg: correct RESOURCE_MAX to ULLONG_MAX
mm: memcg: do not trap chargers with full callstack on OOM
mm: memcg: rework and document OOM waiting and wakeup
...
Unlike global OOM handling, memory cgroup code will invoke the OOM killer
in any OOM situation because it has no way of telling faults occuring in
kernel context - which could be handled more gracefully - from
user-triggered faults.
Pass a flag that identifies faults originating in user space from the
architecture-specific fault handlers to generic code so that memcg OOM
handling can be improved.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The memcg code can trap tasks in the context of the failing allocation
until an OOM situation is resolved. They can hold all kinds of locks
(fs, mm) at this point, which makes it prone to deadlocking.
This series converts memcg OOM handling into a two step process that is
started in the charge context, but any waiting is done after the fault
stack is fully unwound.
Patches 1-4 prepare architecture handlers to support the new memcg
requirements, but in doing so they also remove old cruft and unify
out-of-memory behavior across architectures.
Patch 5 disables the memcg OOM handling for syscalls, readahead, kernel
faults, because they can gracefully unwind the stack with -ENOMEM. OOM
handling is restricted to user triggered faults that have no other
option.
Patch 6 reworks memcg's hierarchical OOM locking to make it a little
more obvious wth is going on in there: reduce locked regions, rename
locking functions, reorder and document.
Patch 7 implements the two-part OOM handling such that tasks are never
trapped with the full charge stack in an OOM situation.
This patch:
Back before smart OOM killing, when faulting tasks were killed directly on
allocation failures, the arch-specific fault handlers needed special
protection for the init process.
Now that all fault handlers call into the generic OOM killer (see commit
609838cfed: "mm: invoke oom-killer from remaining unconverted page
fault handlers"), which already provides init protection, the
arch-specific leftovers can be removed.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: azurIt <azurit@pobox.sk>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc bits]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 05b016ecf5 "ARC: Setup Vector Table Base in early boot" moved
the Interrupt vector Table setup out of arc_init_IRQ() which is called
for all CPUs, to entry point of boot cpu only, breaking booting of others.
Fix by adding the same to entry point of non-boot CPUs too.
read_arc_build_cfg_regs() printing IVT Base Register didn't help the
casue since it prints a synthetic value if zero which is totally bogus,
so fix that to print the exact Register.
[vgupta: Remove the now stale comment from header of arc_init_IRQ and
also added the commentary for halt-on-reset]
Cc: Gilad Ben-Yossef <gilad@benyossef.com>
Cc: Cc: <stable@vger.kernel.org> #3.11
Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Generally minor changes. A bunch of bug fixes, particularly for
initialization and some refactoring. Most notable change if feeding the
entire flattened tree into the random pool at boot. May not be
significant, but shouldn't hurt either.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJSL12LAAoJEEFnBt12D9kB64gP/RBipnYbo3RPanHg+lE/J1V7
KSVFNGKWJHxTg47VVC1YJGIG21jqxAilpdS2MQL5FP7iyd+IzvtHpQiJgp+2G+pq
di06yrdyrYErxRgZgGQi8IpR538ZzOEVLCKJGdb09YelkRzPT5au7CC1MAsX3qco
yba7PHk0/Nc4hZE4aGbgR1DlRmn86ob7mM0KFE/LORaSN2BueMgWcwKhQXYNGyoh
assX4yNhAbUG6Bgw7paBLDGqHh8c5Ei5AppU8yPb+N094jgYHBJryUoDlzzUHD23
qqiEqHhUKT0TpgHNs8KH0WZFugcmjKvYEbzdzadBxqfXnJN4fKSEcdfF3iz4T14j
U6EZks89GoHwA523OghUZkKNOqlsUdWfdKz+8/grQqKisYwDcf3fCxEYk/4weDCQ
b6fFlOv6+AI3btjXp6F511ZKxyT4ZZzkHjp/ZSrhBygyamNZfax0ma0j+ZS9AZql
kPxQS0nOve6NKaP7vXxMmW5sGMnL19ER/Hm31wthGcWI43GVebUdklnzfGaEeSjs
pmP8oiCNemceqVpiPKxcOxiguf/eyIjP1SFXbguASygUmQeTDbbJ8n1FYznCitue
xJgWttKWsEf/aMR3eJtQ3aBmHR3rijAV4E28Wlq8XMkocwvpQm2zMocS2Z5BJ80S
hi1kQVy8+RxNX96tOSp1
=GSWl
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux
Pull device tree core updates from Grant Likely:
"Generally minor changes. A bunch of bug fixes, particularly for
initialization and some refactoring. Most notable change if feeding
the entire flattened tree into the random pool at boot. May not be
significant, but shouldn't hurt either"
Tim Bird questions whether the boot time cost of the random feeding may
be noticeable. And "add_device_randomness()" is definitely not some
speed deamon of a function.
* tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
of/platform: add error reporting to of_amba_device_create()
irq/of: Fix comment typo for irq_of_parse_and_map
of: Feed entire flattened device tree into the random pool
of/fdt: Clean up casting in unflattening path
of/fdt: Remove duplicate memory clearing on FDT unflattening
gpio: implement gpio-ranges binding document fix
of: call __of_parse_phandle_with_args from of_parse_phandle
of: introduce of_parse_phandle_with_fixed_args
of: move of_parse_phandle()
of: move documentation of of_parse_phandle_with_args
of: Fix missing memory initialization on FDT unflattening
of: consolidate definition of early_init_dt_alloc_memory_arch()
of: Make of_get_phy_mode() return int i.s.o. const int
include: dt-binding: input: create a DT header defining key codes.
of/platform: Staticize of_platform_device_create_pdata()
of: Specify initrd location using 64-bit
dt: Typo fix
OF: make of_property_for_each_{u32|string}() use parameters if OF is not enabled
--------------->8--------------------
WARNING: vmlinux.o(.text+0x708): Section mismatch in reference from the
function read_arc_build_cfg_regs() to the function
.init.text:read_decode_cache_bcr()
WARNING: vmlinux.o(.text+0x702): Section mismatch in reference from the
function read_arc_build_cfg_regs() to the function
.init.text:read_decode_mmu_bcr()
--------------->8--------------------
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Cast usecs to u64, to ensure that the (usecs * 4295 * HZ)
multiplication is 64 bit.
Initially, the (usecs * 4295 * HZ) part was done as a 32 bit
multiplication, with the result casted to 64 bit. This led to some bits
falling off, causing a "DMA initialization error" in the stmmac Ethernet
driver, due to a premature timeout.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
It prevents kernel parameters such as 'loglevel' from doing their job.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Some drivers require these, and ARC didn't had them yet.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This helps remove asid-to-mm reverse map
While mm->context.id contains the ASID assigned to a process, our ASID
allocator also used asid_mm_map[] reverse map. In a new allocation
cycle (mm->ASID >= @asid_cache), the Round Robin ASID allocator used this
to check if new @asid_cache belonged to some mm2 (from prev cycle).
If so, it could locate that mm using the ASID reverse map, and mark that
mm as unallocated ASID, to force it to refresh at the time of switch_mm()
However, for SMP, the reverse map has to be maintained per CPU, so
becomes 2 dimensional, hence got rid of it.
With reverse map gone, it is NOT possible to reach out to current
assignee. So we track the ASID allocation generation/cycle and
on every switch_mm(), check if the current generation of CPU ASID is
same as mm's ASID; If not it is refreshed.
(Based loosely on arch/sh implementation)
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ASID allocation changes/2
Use the fact that switch_mm() and activate_mm() are exactly same code
now while acknowledging the semantical difference in comment
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ASID allocation changes/1
This patch does 2 things:
(1) get_new_mmu_context() NOW moves mm->ASID to a new value ONLY if it
was from a prev allocation cycle/generation OR if mm had no ASID
allocated (vs. before would unconditionally moving to a new ASID)
Callers desiring unconditional update of ASID, e.g.local_flush_tlb_mm()
(for parent's address space invalidation at fork) need to first force
the parent to an unallocated ASID.
(2) get_new_mmu_context() always sets the MMU PID reg with unchanged/new
ASID value.
The gains are:
- consolidation of all asid alloc logic into get_new_mmu_context()
- avoiding code duplication in switch_mm() for PID reg setting
- Enables future change to fold activate_mm() into switch_mm()
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-Asm code already has values of SW and HW ASID values, so they can be
passed to the printing routine.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This reorganizes the current TLB operations into psuedo-ops to better
pair with MMUv4's native Insert/Delete operations
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
With previous commit freeing up PTE bits, reassign them so as to:
- Match the bit to H/w counterpart where possible
(e.g. MMUv2 GLOBAL/PRESENT, this avoids a shift in create_tlb())
- Avoid holes in _PAGE_xxx definitions
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The current ARC VM code has 13 flags in Page Table entry: some software
(accesed/dirty/non-linear-maps) and rest hardware specific. With 8k MMU
page, we need 19 bits for addressing page frame so remaining 13 bits is
just about enough to accomodate the current flags.
In MMUv4 there are 2 additional flags, SZ (normal or super page) and WT
(cache access mode write-thru) - and additionally PFN is 20 bits (vs. 19
before for 8k). Thus these can't be held in current PTE w/o making each
entry 64bit wide.
It seems there is some scope of compressing the current PTE flags (and
freeing up a few bits). Currently PTE contains fully orthogonal distinct
access permissions for kernel and user mode (Kr, Kw, Kx; Ur, Uw, Ux)
which can be folded into one set (R, W, X). The translation of 3 PTE
bits into 6 TLB bits (when programming the MMU) can be done based on
following pre-requites/assumptions:
1. For kernel-mode-only translations (vmalloc: 0x7000_0000 to
0x7FFF_FFFF), PTE additionally has PAGE_GLOBAL flag set (and user
space entries can never be global). Thus such a PTE can translate
to Kr, Kw, Kx (as appropriate) and zero for User mode counterparts.
2. For non global entries, the PTE flags can be used to create mirrored
K and U TLB bits. This is true after commit a950549c67
"ARC: copy_(to|from)_user() to honor usermode-access permissions"
which ensured that user-space translations _MUST_ have same access
permissions for both U/K mode accesses so that copy_{to,from}_user()
play fair with fault based CoW break and such...
There is no such thing as free lunch - the cost is slightly infalted
TLB-Miss Handlers.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* reduce editor lines taken by pt_regs
* ARCompact ISA specific part of TLB Miss handlers clubbed together
* cleanup some comments
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Most architectures use the same implementation. Collapse the common ones
into a single weak function that can be overridden.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
In the exception return path, for both U/K cases, intr are already
disabled (for various existing reasons). So when we drop down to
@restore_regs, we need not redo that.
There was subtle issue - when intr were NOT being disabled for
ret-to-kernel-but-no-preemption case - now fixed by moving the
IRQ_DISABLE further up in @resume_kernel_mode.
So what do we gain:
* Shaves off a few insn in return path.
* Eliminates the need for IRQ_DISABLE_SAVE assembler macro for ARCv2
hence allows for entry code sharing.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
After the recent cleanups, all the exception handlers now have same
boilerplate prologue code. Move that into common macro.
This reduces readability but helps greatly with sharing / duplicating
entry code with ARCv2 ISA where the handlers are pretty much the same,
just the entry prologue is different (due to hardware assist).
Also while at it, add the missing FAKE_RET_FROM_EXCPN calls in couple of
places to drop down to pure kernel mode (from exception mode) before
jumping off into "C" code.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
On some PAE architectures, the entire range of physical memory could reside
outside the 32-bit limit. These systems need the ability to specify the
initrd location using 64-bit numbers.
This patch globally modifies the early_init_dt_setup_initrd_arch() function to
use 64-bit numbers instead of the current unsigned long.
There has been quite a bit of debate about whether to use u64 or phys_addr_t.
It was concluded to stick to u64 to be consistent with rest of the device
tree code. As summarized by Geert, "The address to load the initrd is decided
by the bootloader/user and set at that point later in time. The dtb should not
be tied to the kernel you are booting"
More details on the discussion can be found here:
https://lkml.org/lkml/2013/6/20/690https://lkml.org/lkml/2012/9/13/544
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
corresponding drivers (net/ethernet/arc/*, irqctl/irq-tb10x.c) have now
been merged into your tree.
Ideally these shd have been part of same submissions, oh well...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJR3OKpAAoJEGnX8d3iisJeB5EP/i4HlYQXrZWpT/6Bm7gdyzqA
hvA0oObRvGfgVbxG7R/9sKHU/FDx479pwrZbXPU4I+WM34QRnueXtAodBzrK7jQp
FdcBVN+YCcKUJh4Gxs+a5jtKj8v6u8+NaJc3L/S2+4vJlznoAY6CxnlQX3SK0SAC
d7irz9ZlP58q8eFHBJXa96HMIf7A2EACmwG3Uo6oDj15LVx+XW5i2o6rABvfUNDx
eKnSekoav1CRJyW4RJYI/hJdUM3vcbbRcz/2IyqSquWn8EqyZWY9iijdIwoUha3c
6geZ2YXi4rikU5kA/3cVuEQa68gj7gTaceEE7RT8y5H0DQjpgB07Tkyr9kFLpnRI
b+SGpQVA6FsLtDFNiRjsu2Ft/411UnRfhgv5d7SsuCqeVVPkCEd13Ttj4g49EJjw
Z3FWXTY0f/zYOSzU/6UQ6KZ/ZEvnnasLSCzDPrEK7WVu4lFuMz9BJglzSAbMMUQI
XZaWsm8ForzuYYUDFXQ79ORGsGKjhq3Pl1kF4CQ4TYjL75DwWgEAi9TI6ENuLH8+
+Ox4l385V51/YgcasawgpmGG1APgLtqw5ZFu6GUu6mytQs2huenaoT9+ov2FSu2N
B0o04nmm6h1rgSdiq7ZKDHv5lS1RXYYgoIDYYLqp5Lxk2JcMxvZYfMhbb/4KV+7/
y6w3ui2SNA5ttV7v41zW
=QgUU
-----END PGP SIGNATURE-----
Merge tag 'arc-v3.11-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull second set of ARC architecture updates from Vineet Gupta:
"Couple of Platform updates (Device Tree files primarily) given that
the corresponding drivers (net/ethernet/arc/*, irqctl/irq-tb10x.c)
have now been merged into your tree.
Ideally these shd have been part of same submissions, oh well..."
* tag 'arc-v3.11-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: [TB10x] Updates for irqchip driver
ARC: [plat-arcfpga] Enable arc_emac for ARCAngle4 Board
A few remaining architectures directly kill the page faulting task in an
out of memory situation. This is usually not a good idea since that
task might not even use a significant amount of memory and so may not be
the optimal victim to resolve the situation.
Since 2.6.29's 1c0fe6e ("mm: invoke oom-killer from page fault") there
is a hook that architecture page fault handlers are supposed to call to
invoke the OOM killer and let it pick the right task to kill. Convert
the remaining architectures over to this hook.
To have the previous behavior of simply taking out the faulting task the
vm.oom_kill_allocating_task sysctl can be set to 1.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc bits]
Cc: James Hogan <james.hogan@imgtec.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull i2c updates from Wolfram Sang:
- new drivers: Kontron PLD, Wondermedia VT
- mv64xxx driver gained sun4i support and a bigger cleanup
- duplicate driver 'intel-mid' removed
- added generic device tree binding for sda holding time (and
designware driver already uses it)
- we tried to allow driver probing with only device tree and no i2c
ids, but I had to revert it because of side effects. Needs some
rethinking.
- driver bugfixes, cleanups...
* 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (34 commits)
i2c-designware: use div_u64 to fix link
i2c: Kontron PLD i2c bus driver
i2c: iop3xxx: fix build failure after waitqueue changes
i2c-designware: make SDA hold time configurable
i2c: mv64xxx: Set bus frequency to 100kHz if clock-frequency is not provided
i2c: imx: allow autoloading on dt ids
i2c: mv64xxx: Fix transfer error code
i2c: i801: SMBus patch for Intel Coleto Creek DeviceIDs
i2c: omap: correct usage of the interrupt enable register
i2c-pxa: prepare clock before use
Revert "i2c: core: make it possible to match a pure device tree driver"
i2c: nomadik: allocate adapter number dynamically
i2c: nomadik: support elder Nomadiks
i2c: mv64xxx: Add Allwinner sun4i compatible
i2c: mv64xxx: make the registers offset configurable
i2c: mv64xxx: Add macros to access parts of registers
i2c: vt8500: Add support for I2C bus on Wondermedia SoCs
i2c: designware: fix race between subsequent xfers
i2c: bfin-twi: Read and write the FIFO in loop
i2c: core: make it possible to match a pure device tree driver
...
Merge Kconfig menu diet patches from Dave Hansen:
"I think the "Kernel Hacking" menu has gotten a bit out of hand. It is
over 120 lines long on my system with everything enabled and options
are scattered around it haphazardly.
http://sr71.net/~dave/linux/kconfig-horror.png
Let's try to introduce some sanity. This set takes that 120 lines
down to 55 and makes it vastly easier to find some things. It's a
start.
This set stands on its own, but there is plenty of room for follow-up
patches. The arch-specific debug options still end up getting stuck
in the top-level "kernel hacking" menu. OPTIMIZE_INLINING, for
instance, could obviously go in to the "compiler options" menu, but
the fact that it is defined in arch/ in a separate Kconfig file keeps
it on its own for the moment.
The Signed-off-by's in here look funky. I changed employers while
working on this set, so I have signoffs from both email addresses"
* emailed patches from Dave Hansen <dave@sr71.net>:
hang and lockup detection menu
kconfig: consolidate printk options
group locking debugging options
consolidate compilation option configs
consolidate runtime testing configs
order memory debugging Kconfig options
consolidate per-arch stack overflow debugging options
Original posting:
http://lkml.kernel.org/r/20121214184202.F54094D9@kernel.stglabs.ibm.com
Several architectures have similar stack debugging config options.
They all pretty much do the same thing, some with slightly
differing help text.
This patch changes the architectures to instead enable a Kconfig
boolean, and then use that boolean in the generic Kconfig.debug
to present the actual menu option. This removes a bunch of
duplication and adds consistency across arches.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com> [for tile]
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge first patch-bomb from Andrew Morton:
- various misc bits
- I'm been patchmonkeying ocfs2 for a while, as Joel and Mark have been
distracted. There has been quite a bit of activity.
- About half the MM queue
- Some backlight bits
- Various lib/ updates
- checkpatch updates
- zillions more little rtc patches
- ptrace
- signals
- exec
- procfs
- rapidio
- nbd
- aoe
- pps
- memstick
- tools/testing/selftests updates
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (445 commits)
tools/testing/selftests: don't assume the x bit is set on scripts
selftests: add .gitignore for kcmp
selftests: fix clean target in kcmp Makefile
selftests: add .gitignore for vm
selftests: add hugetlbfstest
self-test: fix make clean
selftests: exit 1 on failure
kernel/resource.c: remove the unneeded assignment in function __find_resource
aio: fix wrong comment in aio_complete()
drivers/w1/slaves/w1_ds2408.c: add magic sequence to disable P0 test mode
drivers/memstick/host/r592.c: convert to module_pci_driver
drivers/memstick/host/jmb38x_ms: convert to module_pci_driver
pps-gpio: add device-tree binding and support
drivers/pps/clients/pps-gpio.c: convert to module_platform_driver
drivers/pps/clients/pps-gpio.c: convert to devm_* helpers
drivers/parport/share.c: use kzalloc
Documentation/accounting/getdelays.c: avoid strncpy in accounting tool
aoe: update internal version number to v83
aoe: update copyright date
aoe: perform I/O completions in parallel
...
Prepare for removing num_physpages and simplify mem_init().
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com> # for arch/arc
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Concentrate code to modify totalram_pages into the mm core, so the arch
memory initialized code doesn't need to take care of it. With these
changes applied, only following functions from mm core modify global
variable totalram_pages: free_bootmem_late(), free_all_bootmem(),
free_all_bootmem_node(), adjust_managed_page_count().
With this patch applied, it will be much more easier for us to keep
totalram_pages and zone->managed_pages in consistence.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Address more review comments from last round of code review.
1) Enhance free_reserved_area() to support poisoning freed memory with
pattern '0'. This could be used to get rid of poison_init_mem()
on ARM64.
2) A previous patch has disabled memory poison for initmem on s390
by mistake, so restore to the original behavior.
3) Remove redundant PAGE_ALIGN() when calling free_reserved_area().
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change signature of free_reserved_area() according to Russell King's
suggestion to fix following build warnings:
arch/arm/mm/init.c: In function 'mem_init':
arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
^
In file included from include/linux/mman.h:4:0,
from arch/arm/mm/init.c:15:
include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
mm/page_alloc.c: In function 'free_reserved_area':
>> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
In file included from arch/mips/include/asm/page.h:49:0,
from include/linux/mmzone.h:20,
from include/linux/gfp.h:4,
from include/linux/mm.h:8,
from mm/page_alloc.c:18:
arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
mm/page_alloc.c: In function 'free_area_init_nodes':
mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]
Also address some minor code review comments.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Highlights of changes:
-Continuation of ARC MM changes from 3.10 including
zero page optimization;
Setting pagecache pages dirty by default;
Non executable stack by default;
Reducing dcache flushes for aliasing VIPT config
-Long overdue rework of pt_regs machinery - removing the unused word gutters
and adding ECR register to baseline (helps cleanup lot of low level code)
-Support for ARC gcc 4.8
-Few other preventive fixes, cosmetics, usage of Kconfig helper..
The diffstat is larger than normal primarily because of arcregs.h header split
as well as beautification of macros in entry.h
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJR0mEbAAoJEGnX8d3iisJe0lEP+gPtakLaOBprpxrVttK3DCLw
E28LMVdR+33jfqugqk8xMVQh9gc5mvSGGJkZFGPoQX9gumdDjJP4vRxqlZxLgpNH
siNr+LvyzUOUa3bQrB6q639jlqlvPcKO19Yjzx1vqKRkMBsV8+YG4+8tdBNvh2K4
dr6nB6gqmZiCWmuUpTQ+D2J7vjlNwi5PWHBmlP0uFH433Z78Wk2tT2LWUHihl6Do
+Uhp/Ldu7EDFTVLDlqc5pv0EUNnGzuR9dpN3XuxfZLknKMtJKpwqLq0/6bnfWqck
g+Cmh8jvj5SAutbzZiG/xggZMtSHOXsqY06+MALquQ7tcoXrm7U0Ubbplt1B55y/
OW8yWD/b1S+2pVVCmbIWV2EinNmSMpdX18nVUmWSggQYBjKQnMq611Y7q1EGBmen
cTCY9u7jv7tD+Qem4H/rze7b+EKlcSFDhzL/Df/iHHnvAmEjb+trmD3qJ6PtA93w
24b+fVaBi5j5dHZx4NEggDKTkmUVcqkD2qfg4Ckg4mqq7cBHfJponuoDJX4KM7t1
qozSpKLMMSzRd3L7DQhePqNJfrlOD2rhD4ELRCJA131dUANr4Wpi2yvIASro135A
4gmpYX5WtjohWjL73aRCwsRcLTnyNmqjPfibT35ZaAygW7wXII7zNVzKY09oEPN3
keol0YTXDokCzGN432NI
=65Fq
-----END PGP SIGNATURE-----
Merge tag 'arc-v3.11-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull first batch of ARC changes from Vineet Gupta:
"There's a second bunch to follow next week - which depends on commits
on other trees (irq/net). I'd have preferred the accompanying ARC
change via respective trees, but it didn't workout somehow.
Highlights of changes:
- Continuation of ARC MM changes from 3.10 including
zero page optimization
Setting pagecache pages dirty by default
Non executable stack by default
Reducing dcache flushes for aliasing VIPT config
- Long overdue rework of pt_regs machinery - removing the unused word
gutters and adding ECR register to baseline (helps cleanup lot of
low level code)
- Support for ARC gcc 4.8
- Few other preventive fixes, cosmetics, usage of Kconfig helper..
The diffstat is larger than normal primarily because of arcregs.h
header split as well as beautification of macros in entry.h"
* tag 'arc-v3.11-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (32 commits)
ARC: warn on improper stack unwind FDE entries
arc: delete __cpuinit usage from all arc files
ARC: [tlb-miss] Fix bug with CONFIG_ARC_DBG_TLB_MISS_COUNT
ARC: [tlb-miss] Extraneous PTE bit testing/setting
ARC: Adjustments for gcc 4.8
ARC: Setup Vector Table Base in early boot
ARC: Remove explicit passing around of ECR
ARC: pt_regs update #5: Use real ECR for pt_regs->event vs. synth values
ARC: stop using pt_regs->orig_r8
ARC: pt_regs update #4: r25 saved/restored unconditionally
ARC: K/U SP saved from one location in stack switching macro
ARC: Entry Handler tweaks: Simplify branch for in-kernel preemption
ARC: Entry Handler tweaks: Avoid hardcoded LIMMS for ECR values
ARC: Increase readability of entry handlers
ARC: pt_regs update #3: Remove unused gutter at start of callee_regs
ARC: pt_regs update #2: Remove unused gutter at start of pt_regs
ARC: pt_regs update #1: Align pt_regs end with end of kernel stack page
ARC: pt_regs update #0: remove kernel stack canary
ARC: [mm] Remove @write argument to do_page_fault()
ARC: [mm] Make stack/heap Non-executable by default
...
Device tree and Kconfig updates for irqchip driver.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.
After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.
Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
are flagged as __cpuinit -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings. In any case, they are temporary and harmless.
This removes all the arch/arc uses of the __cpuinit macros from
all C files. Currently arc does not have any __CPUINIT used in
assembly files.
[1] https://lkml.org/lkml/2013/5/20/589
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
LOAD_FAULT_PTE macro is expected to set r2 with faulting vaddr.
However in case of CONFIG_ARC_DBG_TLB_MISS_COUNT, it was getting
clobbered with statistics collection code.
Fix latter by using a different register.
Note that only I-TLB Miss handler was potentially affected.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* No need to check for READ access in I-TLB Miss handler
* Redundant PAGE_PRESENT update in PTE
Post TLB entry installation, in updating PTE for software accessed/dity
bits, no need to update PAGE_PRESENT since it will already be set.
Infact the entry won't have installed if !PAGE_PRESENT.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* DWARF unwinder related
+ Force DWARF2 compliant .debug_frame (gcc 4.8 defaults to DWARF4
which kernel unwinder can't grok).
+ Discard the additional .eh_frame generated
+ Discard the dwarf4 debug info generated by -gdwarf-2 for normal
no debug case
* 4.8 already uses arc600 multilibs for -mno-mpy
* switch to using uclibc compiler (to get -mmedium-calls and -mno-sdata)
and also since buildroot can only use 1 toolchain
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This patch makes the SDA hold time configurable through device tree.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Pierrick Hascoet <pierrick.hascoet@abilis.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com> for arch/arc bits
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Otherwise early boot exceptions such as instructions errors due to
configuration mismatch between kernel and hardware go off to la-la land,
as opposed to hitting the handler and panic()'ing properly.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
With ECR now part of pt_regs
* No need to propagate from lowest asm handlers as arg
* No need to save it in tsk->thread.cause_code
* Avoid bit chopping to access the bit-fields
More code consolidation, cleanup
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
pt_regs->event was set with artificial values to identify the low level
system event (syscall trap / breakpoint trap / exceptions / interrupts)
With r8 saving out of the way, the full word can be used to save real
ECR (Exception Cause Register) which helps idenify the event naturally,
including additional info such as cause code, param.
Only for Interrupts, where ECR is not applicable, do we resort to
synthetic non ECR values.
SAVE_ALL_TRAP/EXCEPTIONS can now be merged as they both use ECR with
different runtime values.
The ptrace helpers now use the sub-fields of ECR to distinguish the
events (e.g. vector 0x25 is trap, param 0 is syscall...)
The following benefits will follow:
(1) This centralizes the location of where ECR is saved and will allow
the cleanup of task->thread.cause_code ECR placeholder which is set
in non-uniform way. Then ARC VM code can safely rely on it being
there for purpose of finer grained VM_EXEC dcache flush (based on
exec fault: I-TLB Miss)
(2) Further, ECR being passed around from low level handlers as arg can
be eliminated as it is part of standard reg-file in pt_regs
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Historically, pt_regs have had orig_r8, an overloaded container for
(1) backup copy of r8 (syscall number Trap Exceptions)
(2) additional system state: (syscall/Exception/Interrupt)
There is no point in keeping (1) since syscall number is never clobbered
in-place, in pt_regs, unlike r0 which duals as first syscall arg as well
as syscall return value and in case of syscall restart, the orig arg0
needs restoring (from orig_r0) after having been updated in-place with
syscall ret value.
This further paves way to convert (2) to contain ECR itself (rather than
current madeup values)
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
(This is a VERY IMP change for low level interrupt/exception handling)
-----------------------------------------------------------------------
WHAT
-----------------------------------------------------------------------
* User 25 now saved in pt_regs->user_r25 (vs. tsk->thread_info.user_r25)
* This allows Low level interrupt code to unconditionally save r25
(vs. the prev version which would only do it for U->K transition).
Ofcourse for nested interrupts, only the pt_regs->user_r25 of
bottom-most frame is useful.
* simplifies the interrupt prologue/epilogue
* Needed for ARCv2 ISA code and done here to keep design similar with
ARCompact event handling
-----------------------------------------------------------------------
WHY
-------------------------------------------------------------------------
With CONFIG_ARC_CURR_IN_REG, r25 is used to cache "current" task pointer
in kernel mode. So when entering kernel mode from User Mode
- user r25 is specially safe-kept (it being a callee reg is NOT part of
pt_regs which are saved by default on each interrupt/trap/exception)
- r25 loaded with current task pointer.
Further, if interrupt was taken in kernel mode, this is skipped since we
know that r25 already has valid "current" pointer.
With 2 level of interrupts in ARCompact ISA, detecting this is difficult
but still possible, since we could be in kernel mode but r25 not already saved
(in fact the stack itself might not have been switched).
A. User mode
B. L1 IRQ taken
C. L2 IRQ taken (while on 1st line of L1 ISR)
So in #C, although in kernel mode, r25 not saved (infact SP not
switched at all)
Given that ARcompact has manual stack switching, we could use a bit of
trickey - The low level code would make sure that SP is only set to kernel
mode value at the very end (after saving r25). So a non kernel mode SP,
even if in kernel mode, meant r25 was NOT saved.
The same paradigm won't work in ARCv2 ISA since SP is auto-switched so
it's setting can't be delayed/constrained.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This paves way for further simplifications.
There's an overhead of 1 insn for the non-common case of interrupt taken
from kernel mode.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* use artificial PUSH/POP contructs for CORE Reg save/restore to stack
* use artificial PUSHAX/POPAX contructs for Auxiliary Space regs
* macro'ize multiple copies of callee-reg-save/restore (SAVE_R13_TO_R24)
* use BIC insn for inverse-and operation
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This is trickier than prev two:
* context switching code saves kernel mode callee regs in the format of
struct callee_regs thus needs adjustment. This also reduces the height
of topmost kernel stack frame by 1 word.
* Since kernel stack unwinder is sensitive to height of topmost kernel
stack frame, that needs a word of adjustment too.
ptrace needs a bit of updating since pt_regs now diverges from
user_regs_struct.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Historically, pt_regs would end at offset of 1 word from end of stack
page.
----------------- -> START of page (task->stack)
| |
| thread_info |
-----------------
| |
^ ~ ~
| ~ ~
| | |
| | | <---- pt_regs used to END here
-----------------
| 1 word GUTTER |
----------------- -> End of page (START of kernel stack)
This required special "one-off" considerations in low level code.
The root cause is very likely assumption of "empty" SP by the original
ARC kernel hackers, despite ARC700 always been "full" SP.
So finally RIP one word gutter !
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This can be ascertained within do_page_fault() since it gets the full
ECR (Exception Cause Register).
Further, for both the callers of do_page_fault(): Prot-V / D-TLB-Miss,
the cause sub-fields in ECR are same for same type of access, making the
code much more simpler.
D-TLB-Miss [LD] 0x00_21_01_00
Prot-V [LD] 0x00_23_01_00
^^
D-TLB-Miss [ST] 0x00_21_02_00
Prot-V [ST] 0x00_23_02_00
^^
D-TLB-Miss [EX] 0x00_21_03_00
Prot-V [EX] 0x00_23_03_00
^^
This helps code consolidation, which is even better when moving code from
assembler to "C".
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
1. For VM_EXEC based delayed dcache/icache flush, reduces the number of
flushes.
2. Makes this security feature ON by default rather than OFF before.
3. Applications can use mprotect() to selectively override this.
4. ELF binaries have a GNU_STACK segment which can easily override the
kernel default permissions.
For nested-functions/trampolines, gcc already auto-enables executable
stack in elf. Others needing this can use -Wl,-z,execstack option.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Non-congruent SRC page in copy_user_page() is dcache clean in the end -
so record that fact, to avoid a subsequent extraneous flush.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* Number of (i|d)cache ways can be retrieved from BCRs and hence no need
to cross check with with built-in constants
* Use of IS_ENABLED() to check for a Kconfig option
* is_not_cache_aligned() not used anymore
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* Move the various sub-system defines/types into relevant files/functions
(reduces compilation time)
* move CPU specific stuff out of asm/tlb.h into asm/mmu.h
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This fixes the following:
- CONFIG_ARC_SERIAL_BAUD is only defined when CONFIG_SERIAL_ARC is defined.
Make sure that it isn't referenced otherwise.
- There is no use for initializing arc_uart_info[] when CONFIG_SERIAL_ARC is
not defined.
[vgupta: tweaked changelog title, used IS_ENABLED() kconfig helper]
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
gdbserver inserting a breakpoint ends up calling copy_user_page() for a
code page. The generic version of which (non-aliasing config) didn't set
the PG_arch_1 bit hence update_mmu_cache() didn't sync dcache/icache for
corresponding dynamic loader code page - causing garbade to be executed.
So now aliasing versions of copy_user_highpage()/clear_page() are made
default. There is no significant overhead since all of special alias
handling code is compiled out for non-aliasing build
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The current code uses 2 bits for determining page's dcache color, thus
sorting pages into 4 bins, whereas the aliasing dcache really has 2 bins
(8k page, 64k dcache - 4 way-set-assoc).
This can cause extraneous flushes - e.g. color 0 and 2.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The VM_EXEC check in update_mmu_cache() was getting optimized away
because of a stupid error in definition of macro addr_not_cache_congruent()
The intention was to have the equivalent of following:
if (a || (1 ? b : 0))
but we ended up with following:
if (a || 1 ? b : 0)
And because precedence of '||' is more that that of '?', gcc was optimizing
away evaluation of <a>
Nasty Repercussions:
1. For non-aliasing configs it would mean some extraneous dcache flushes
for non-code pages if U/K mappings were not congruent.
2. For aliasing config, some needed dcache flush for code pages might
be missed if U/K mappings were congruent.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This manifested as grep failing psuedo-randomly:
-------------->8---------------------
[ARCLinux]$ ip address show lo | grep inet
[ARCLinux]$ ip address show lo | grep inet
[ARCLinux]$ ip address show lo | grep inet
[ARCLinux]$
[ARCLinux]$ ip address show lo | grep inet
inet 127.0.0.1/8 scope host lo
-------------->8---------------------
ARC700 MMU provides fully orthogonal permission bits per page:
Ur, Uw, Ux, Kr, Kw, Kx
The user mode page permission templates used to have all Kernel mode
access bits enabled.
This caused a tricky race condition observed with uClibc buffered file
read and UNIX pipes.
1. Read access to an anon mapped page in libc .bss: write-protected
zero_page mapped: TLB Entry installed with Ur + K[rwx]
2. grep calls libc:getc() -> buffered read layer calls read(2) with the
internal read buffer in same .bss page.
The read() call is on STDIN which has been redirected to a pipe.
read(2) => sys_read() => pipe_read() => copy_to_user()
3. Since page has Kernel-write permission (despite being user-mode
write-protected), copy_to_user() suceeds w/o taking a MMU TLB-Miss
Exception (page-fault for ARC). core-MM is unaware that kernel
erroneously wrote to the reserved read-only zero-page (BUG #1)
4. Control returns to userspace which now does a write to same .bss page
Since Linux MM is not aware that page has been modified by kernel, it
simply reassigns a new writable zero-init page to mapping, loosing the
prior write by kernel - effectively zero'ing out the libc read buffer
under the hood - hence grep doesn't see right data (BUG #2)
The fix is to make all kernel-mode access permissions mirror the
user-mode ones. Note that the kernel still has full access to pages,
when accessed directly (w/o MMU) - this fix ensures that kernel-mode
access in copy_to_from() path uses the same faulting access model as for
pure user accesses to keep MM fully aware of page state.
The issue is peudo-random because it only shows up if the TLB entry
installed in #1 is present at the time of #3. If it is evicted out, due
to TLB pressure or some-such, then copy_to_user() does take a TLB Miss
Exception, with a routine write-to-anon COW processing installing a
fresh page for kernel writes and also usable as it is in userspace.
Further the issue was dormant for so long as it depends on where the
libc internal read buffer (in .bss) is mapped at runtime.
If it happens to reside in file-backed data mapping of libc (in the
page-aligned slack space trailing the file backed data), loader zero
padding the slack space, does the early cow page replacement, setting
things up at the very beginning itself.
With gcc 4.8 based builds, the libc buffer got pushed out to a real
anon mapping which triggers the issue.
Reported-by: Anton Kolesov <akolesov@synopsys.com>
Cc: <stable@vger.kernel.org> # 3.9
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Flush and INVALIDATE the dcache page.
This helper is only used for writeback of CODE pages to memory. So
there's no value in keeping the dcache lines around. Infact it is risky
as a writeback on natural eviction under pressure can cause un-needed
writeback with weird issues on aliasing dcache configurations.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The TB10x platform port includes a custom mechanism using to set up
default pin controller configurations using abilis,simple-default
pin configurations of nodes compatible with abilis,simple-pinctrl. This
mechanism is redundant with the Linux standard "default" pin
configuration, see commit ab78029ecc
"drivers/pinctrl: grab default handles from device core".
This patch removes the TB10x custom mechanism in favour of the Linux
standard.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Reviewed-by: Stephen Warren <swarren@nvidia.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
I'm satisified with testing, specially with fuse which has historically given
grief to VIPT arches (ARM/PARISC...)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRjPVlAAoJEGnX8d3iisJeF60P/36JELOrdTOJB8GDn/i22yu4
zxFrroH4EpXCobe4JzrmhXU0vdMKmyhaRsdmQt2pV474LX1ajQeoRi6n7xuiYymr
p0H8/jde7ZDHGxJRP9OIuIIU57N+sVwQvrXvfnz/dc4c92MD9EWEwvoytnCVuKSx
k/YIQ5oDj6hFH13V68eXXs+KBrXyPiYO9UIWYmwRZv/0Bm6P1EpXQSMWzeCP0X/y
FHVEc4i92bIkqpOadUnhCBHGZqjkrhwFL2FCI35/12TgSwjPU1Q6IJCtH7Lg9ygI
oFqut5wN+dhkxlfiyvgTXErmnvhDOHLbMQJYC9svHin8TtfznQhJDAWYeSH/6Mde
/hE/bXV55ucdJ48ZT6JRvxHeJQgTAvbXDIIQYe/wAJEvXidWEo4Y/qRTIXP03Ixm
Ie69d9WSmUlx4FmYxBQodDQFK9slnc+0UfsWcCG+OrT0wwrKBRr3To2WYEFowQer
fpIk6/68+LiQK+IzFAhPtBppSgcQoEBsXAD/MDKsQcRk84W1nKPt6SiT/wCAzOZ7
ezTL1eSlfzWXNbtohdm8xN8frt/4fZLZTaZkqtnMPBi0iIC5DAsJy27YlBQDaRd/
Hzd0y6bJQDcsjjWmZuUkFZVbCyYlE0CIW6sbHC91i51egKReYHy7T6Ef/n+qn+ho
jxohq5/JvG+0Vha0Q2h6
=r4lw
-----END PGP SIGNATURE-----
Merge tag 'arc-v3.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull second set of arc arch updates from Vineet Gupta:
"Aliasing VIPT dcache support for ARC
I'm satisified with testing, specially with fuse which has
historically given grief to VIPT arches (ARM/PARISC...)"
* tag 'arc-v3.10-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: [TB10x] Remove GENERIC_GPIO
ARC: [mm] Aliasing VIPT dcache support 4/4
ARC: [mm] Aliasing VIPT dcache support 3/4
ARC: [mm] Aliasing VIPT dcache support 2/4
ARC: [mm] Aliasing VIPT dcache support 1/4
ARC: [mm] refactor the core (i|d)cache line ops loops
ARC: [mm] serious bug in vaddr based icache flush
* Support for two new platforms based on ARC700
- Abilis TB10x SoC [Chritisian/Pierrick]
- Simulator only System-C Model [Mischa]
* ARC specific MM improvements
- Avoid full TLB flush (ASID increment) on munmap (even single page)
- VIPT Cache Flushing improvements
+ Delayed dcache flush for non-aliasing dcache (big performance boost)
+ icache flush aliasing agnostic (no need to kill all possible aliases)
* Others
- Avoid needless rebuild of DTB files for every kernel build
- Remove builtin cmdline as that is already provided by DeviceTree/bootargs
- Fixing unaligned access emulation corner case
- checkpatch fixes [Sachin]
- Various fixlets [Noam]
- Minor build failures/cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRiydEAAoJEGnX8d3iisJewBkQAJ/cvIjrIuMMdeDo0bokzZN2
bfsG8U+V7S0CKjqLyUD1bMRXo7rTgus8hp/klVORRXoAwSKiWhkj0p6dqJjCGUGc
LqtEPlxaLR1X+pe5wKtpB3j6kTpRwictVhFUqkFACUxGZx1GbYFxdeL8+3oDsepW
lwidBYgoya+Q4puRfmY/sCTtVhJlwTUi6g0lmpaEPWc3T2s83/u1GnlVBYd9B7yA
fCYkUceC2ZnuRMfH00sRsjQ3USyyYptGpY8U1nv9STFJ3fC+EestBPppAAwAzcm0
iiRgo3s314EzfeNRgzjjHrbULnVUJWkrdFXPisWepyPyVjsgnVr84YkO3kiEN6l7
JM1cUCvJUbKZaOb2iUwrlPOqISNjCyY9QZWwqW6PCG3TBpfQsJM2mnXKnPLvV2v2
5Ttba4+ETZ3rn4pPIuTeC6REr0aV/Zl1LKeWuk8PXz4aljWrlOrifBN6QkXWr4HU
4z6X3j2nw4T2LzqwbNYF+xLaDaZZpV8UdvQwOkliCxZ04myx135ImvgOim7Hh9j+
Ow1Jp9mRZha/44qM3jXdZ6Cv55pEOR4oQJsl6OBNUXgb5bnDcFHz1UpZtzoZ2V+s
RPozsfnleNXxDIJlCGK96+PG5qVTRQvklowsz6aTP8r+EEbQnrRlnrhi82cQWqnM
sMxzN320zrt/MWXxec89
=WeDp
-----END PGP SIGNATURE-----
Merge tag 'arc-v3.10-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC port updates from Vineet Gupta:
"Support for two new platforms based on ARC700:
- Abilis TB10x SoC [Chritisian/Pierrick]
- Simulator only System-C Model [Mischa]
ARC specific MM improvements:
- Avoid full TLB flush (ASID increment) on munmap (even single page)
- VIPT Cache Flushing improvements
+ Delayed dcache flush for non-aliasing dcache (big performance boost)
+ icache flush aliasing agnostic (no need to kill all possible aliases)
Others:
- Avoid needless rebuild of DTB files for every kernel build
- Remove builtin cmdline as that is already provided by DeviceTree/bootargs
- Fixing unaligned access emulation corner case
- checkpatch fixes [Sachin]
- Various fixlets [Noam]
- Minor build failures/cleanups"
* tag 'arc-v3.10-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (35 commits)
ARC: [mm] Lazy D-cache flush (non aliasing VIPT)
ARC: [mm] micro-optimize page size icache invalidate
ARC: [mm] remove the pessimistic all-alias-invalidate icache helpers
ARC: [mm] consolidate icache/dcache sync code
ARC: [mm] optimise icache flush for kernel mappings
ARC: [mm] optimise icache flush for user mappings
ARC: [mm] optimize needless full mm TLB flush on munmap
ARC: Add support for nSIM OSCI System C model
ARC: [TB10x] Adapt device tree to new compatible string
ARC: [TB10x] Add support for TB10x platform
ARC: [TB10x] Device tree of TB100 and TB101 Development Kits
ARC: Prepare interrupt code for external controllers
ARC: Allow embedded arc-intc to be properly placed in DT intc hierarchy
ARC: [cmdline] Don't overwrite u-boot provided bootargs
ARC: [cmdline] Remove CONFIG_CMDLINE
ARC: [plat-arcfpga] defconfig update
ARC: unaligned access emulation broken if callee-reg dest of LD/ST
ARC: unaligned access emulation error handling consolidation
ARC: Debug/crash-printing Improvements
ARC: fix typo with clock speed
...
This is the meat of the series which prevents any dcache alias creation
by always keeping the U and K mapping of a page congruent.
If a mapping already exists, and other tries to access the page, prev
one is flushed to physical page (wback+inv)
Essentially flush_dcache_page()/copy_user_highpage() create K-mapping
of a page, but try to defer flushing, unless U-mapping exist.
When page is actually mapped to userspace, update_mmu_cache() flushes
the K-mapping (in certain cases this can be optimised out)
Additonally flush_cache_mm(), flush_cache_range(), flush_cache_page()
handle the puring of stale userspace mappings on exit/munmap...
flush_anon_page() handles the existing U-mapping for anon page before
kernel reads it via the GUP path.
Note that while not complete, this is enough to boot a simple
dynamically linked Busybox based rootfs
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This preps the low level dcache flush helpers to take vaddr argument in
addition to the existing paddr to properly flush the VIPT dcache
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Nothing semantical
* simplify the alignement code by using & operation only
* rename variables clearly as paddr
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
vaddr used to index the cache was clipped from the wrong end, and thus
would potentially fail to flush the correct lines.
The problem was dorment for so long because up until the recent
optimizations it was only used for ptrace break-point only flushes.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
flush_dcache_page( ) is MM hook to ensure that a page has consistent
views between kernel and userspace. Thus it is called when
* kernel writes to a page which at some later point could get mapped to
userspace (so kernel mapping needs to be flushed-n-inv)
* kernel is about to read from a page with possible userspace mappings
(so userspace mappings needs to be made coherent with kernel ones)
However for Non aliasing VIPT dcache, any userspace mapping will always
be congruent to kernel mapping. Thus d-cache need need not be flushed at
all (or delayed indefinitely).
The only reason it does need to be flushed is when mapping code pages.
Since icache doesn't snoop dcache, those dirty dcache lines need to be
written back to memory and icache line invalidated so that icache lines
fetch will get the right data.
Decent gains on LMBench fork/exec/sh and File I/O micro-benchmarks.
(1) FPGA @ 80 MHZ
Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.9-rc6-a Linux 3.9.0-r 80 4.79 8.72 66.7 116. 239. 8.39 30.4 4798 14.K 34.K
3.9-rc6-b Linux 3.9.0-r 80 4.79 8.62 65.4 111. 239. 8.35 29.0 3995 12.K 30.K
3.9-rc7-c Linux 3.9.0-r 80 4.79 9.00 66.1 106. 239. 8.61 30.4 2858 10.K 24.K
^^^^ ^^^^ ^^^
File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host OS 0K File 10K File Mmap Prot Page 100fd
Create Delete Create Delete Latency Fault Fault selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
3.9-rc6-a Linux 3.9.0-r 317.8 204.2 1122.3 375.1 3522.0 4.288 20.7 126.8
3.9-rc6-b Linux 3.9.0-r 298.7 223.0 1141.6 367.8 3531.0 4.866 20.9 126.4
3.9-rc7-c Linux 3.9.0-r 278.4 179.2 862.1 339.3 3705.0 3.223 20.3 126.6
^^^^^ ^^^^^ ^^^^^ ^^^^
(2) Customer Silicon @ 500 MHz (166 MHz mem)
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
abilis-ba Linux 3.9.0-r 497 0.71 1.38 4.58 12.0 35.5 1.40 3.89 2070 5525 13.K
abilis-ca Linux 3.9.0-r 497 0.71 1.40 4.61 11.8 35.6 1.37 3.92 1411 4317 10.K
^^^^ ^^^^ ^^^
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
start address is already page aligned and size is const PAGE_SIZE,
thus fixups for alignment not needed in generated code.
bloat-o-meter vmlinux-mm5 vmlinux
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-32 (-32)
function old new delta
__inv_icache_page 82 50 -32
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Now that we have same helper used for all icache invalidates (i.e.
vaddr+paddr based exact line invalidate), consolidate the open coded
calls into one place.
Also rename flush_icache_range_vaddr => __sync_icache_dcache
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This change continues the theme from prev commit - this time icache
handling for kernel's own code modification (vmalloc: loadable modules,
breakpoints for kprobes/kgdb...)
flush_icache_range() calls the CDU icache helper with vaddr to enable
exact line invalidate.
For a true kernel-virtual mapping, the vaddr is actually virtual hence
valid as index into cache. For kprobes breakpoint however, the vaddr arg
is actually paddr - since that's how normal kernel is mapped in ARC
memory map. This implies that CDU will use the same addr for
indexing as for tag match - which is fine since kernel code would only
have that "implicit" mapping and none other.
This should speed up module loading significantly - specially on default
ARC700 icache configurations (32k) which alias.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
ARC icache doesn't snoop dcache thus executable pages need to be made
coherent before mapping into userspace in flush_icache_page().
However ARC700 CDU (hardware cache flush module) requires both vaddr
(index in cache) as well as paddr (tag match) to correctly identify a
line in the VIPT cache. A typical ARC700 SoC has aliasing icache, thus
the paddr only based flush_icache_page() API couldn't be implemented
efficiently. It had to loop thru all possible alias indexes and perform
the invalidate operation (ofcourse the cache op would only succeed at
the index(es) where tag matches - typically only 1, but the cost of
visiting all the cache-bins needs to paid nevertheless).
Turns out however that the vaddr (along with paddr) is available in
update_mmu_cache() hence better suits ARC icache flush semantics.
With both vaddr+paddr, exactly one flush operation per line is done.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
munmap ends up calling tlb_flush() which for ARC was flushing the entire
TLB unconditionally (by moving the MMU to a new ASID)
do_munmap
unmap_region
unmap_vmas
unmap_single_vma
unmap_page_range
tlb_start_vma
zap_pud_range
tlb_end_vma()
tlb_finish_mmu
tlb_flush() ---> unconditional flush_tlb_mm()
So even a single page munmap, a frequent operation when uClibc dynamic
linker (ldso) is loading the dependent shared libraries, would move the
the ASID multiple times - needlessly invalidating the pre-faulted TLB
entries (and increasing the rate of ASID wraparound + full TLB flush).
This is now optimised to only be called if tlb->full_mm (which means
for exit/execve) cases only. And for those cases, flush_tlb_mm() is
already optimised to be a no-op for mm->mm_users == 0.
So essentially there are no mmore full mm flushes - except for fork which
anyhow needs it for properly COW'ing parent address space.
munmap now needs to do TLB range flush, which is implemented with
tlb_end_vma()
Results
-------
1. ASID now consistenly moves by 4 during a simple ls (as opposed to 5 or
7 before).
2. LMBench microbenchmark also shows improvements
Basic system parameters
------------------------------------------------------------------------------
Host OS Description Mhz tlb cache mem scal
pages line par load
bytes
--------- ------------- ----------------------- ---- ----- ----- ------ ----
3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0404-gcc-4.4-ba 80 8 64 1.1000 1
3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0405-avoid-full 80 8 64 1.1200 1
Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host OS Mhz null null open slct sig sig fork exec sh
call I/O stat clos TCP inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
3.9-rc5-0 Linux 3.9.0-r 80 4.81 8.69 68.6 118. 239. 8.53 31.6 4839 13.K 34.K
3.9-rc5-0 Linux 3.9.0-r 80 4.46 8.36 53.8 91.3 223. 8.12 24.2 4725 13.K 33.K
File & VM system latencies in microseconds - smaller is better
-------------------------------------------------------------------------------
Host OS 0K File 10K File Mmap Prot Page 100fd
Create Delete Create Delete Latency Fault Fault selct
--------- ------------- ------ ------ ------ ------ ------- ----- ------- -----
3.9-rc5-0 Linux 3.9.0-r 314.7 223.2 1054.9 390.2 3615.0 1.590 20.1 126.6
3.9-rc5-0 Linux 3.9.0-r 265.8 183.8 1014.2 314.1 3193.0 6.910 18.8 110.4
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This adds support for an ARC Virtual Platform. This platform is based on the
System C standard promoted by the OSCI (Open System C Initiative) and uses
nSIM to simulate the ARC CPU core itself.
Users can build a virtual SoC by combining System C models of peripherals
and CPU cores.
Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The original device tree was written using a slightly different
implementation of the fixed-factor-clock device tree binding. The
compatible string must be modified in order to be compatible with the
new implementation.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Infrastructure required to make the Linux kernel compile and boot on the
Abilis Systems TB10x series of SOCs based on ARC700 CPUs:
- Kmake related files (Kconfig, Makefile, tb10x_defconfig)
- TB10x platform initialisation
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Pierrick Hascoet <pierrick.hascoet@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
These are the device tree files for the Abilis Systems TB100 and TB101 ICs and
their respective development kit PCBs. These files are committed in preparation
of the following patch set which adds support for these chips to the ARC
platform.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Pierrick Hascoet <pierrick.hascoet@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This patch adds some room for CPU-external interrupt controllers in the
Linux interrupt space. Until now, only the 32 CPU internal interrupt lines
were supported which does not allow for external interrupt controllers such
as GPIO modules etc.
Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com>
Signed-off-by: Pierrick Hascoet <pierrick.hascoet@abilis.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
arc-intc is initialized in arc common code as it is applicable to all
platforms. However platforms with their own external intc still need to
refer to it for correct DT interrupt tree hierarchy setup,
e.g.
static struct of_device_id __initdata tb10x_irq_ids[] = {
{ .compatible = "snps,arc700-intc", .data = dummy_init_irq },
{ .compatible = "abilis,tb10x_ictl", .data = tb10x_init_irq },
{},
};
The fix is to use the generic irqchip framework to tie all irqchips in
a special linker section and then call irqchip_init() which calls the
DT of_irq_init() for all the intc in one go.
That way the platform code need not be aware of arc-intc at all.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The existing code was wrong on several counts:
* uboot provided bootargs were copied into @boot_command_line, only to
be over-written by setup_machine_fdt(), effectively lost
* @cmdline_p returned by setup_arch() to start_kernel() didn't include
the DT /bootargs
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Given that DeviceTree /bootargs can provide similar functionality,
no point in providing duplicate infrastructure.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
The fixup code correctly updates the callee-regs on stack, but
fails to unwind it into actual register file. Thus userspace won't see
the update.
Reported-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
If CONFIG_ARC_MISALIGN_ACCESS is not enabled, or if the fixup fails,
call the same error handler: same signal/si_code to user (SIGBUS)
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
* Remove the line-break between scratch/callee-regs (sneaked in when we
converted from printk to pr_*
* Use %pS to print the symbol names of faulting PC (ret pseudo register)
and BLINK (call return register)
* Don't print user-vma for a kernel crash (only do it for
print-fatal-signals based regfile dump)
* Verbose print the Interrupt/Exception Enable/Active state
* for main executable link address is 0x10000 based (vs. 0) thus offset
of faulting PC needs to be adjusted
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
This tracks mainline commit ae903caae2 "Bury the conditionals from
kernel_thread/kernel_execve series" which we missed out as ARC port was
not yet mainline.
[vgupta: commit log modified]
Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
show_regs() is inherently arch-dependent but it does make sense to print
generic debug information and some archs already do albeit in slightly
different forms. This patch introduces a generic function to print debug
information from show_regs() so that different archs print out the same
information and it's much easier to modify what's printed.
show_regs_print_info() prints out the same debug info as dump_stack()
does plus task and thread_info pointers.
* Archs which didn't print debug info now do.
alpha, arc, blackfin, c6x, cris, frv, h8300, hexagon, ia64, m32r,
metag, microblaze, mn10300, openrisc, parisc, score, sh64, sparc,
um, xtensa
* Already prints debug info. Replaced with show_regs_print_info().
The printed information is superset of what used to be there.
arm, arm64, avr32, mips, powerpc, sh32, tile, unicore32, x86
* s390 is special in that it used to print arch-specific information
along with generic debug info. Heiko and Martin think that the
arch-specific extra isn't worth keeping s390 specfic implementation.
Converted to use the generic version.
Note that now all archs print the debug info before actual register
dumps.
An example BUG() dump follows.
kernel BUG at /work/os/work/kernel/workqueue.c:4841!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #7
Hardware name: empty empty/S3992, BIOS 080011 10/26/2007
task: ffff88007c85e040 ti: ffff88007c860000 task.ti: ffff88007c860000
RIP: 0010:[<ffffffff8234a07e>] [<ffffffff8234a07e>] init_workqueues+0x4/0x6
RSP: 0000:ffff88007c861ec8 EFLAGS: 00010246
RAX: ffff88007c861fd8 RBX: ffffffff824466a8 RCX: 0000000000000001
RDX: 0000000000000046 RSI: 0000000000000001 RDI: ffffffff8234a07a
RBP: ffff88007c861ec8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff8234a07a
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88007dc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff88015f7ff000 CR3: 00000000021f1000 CR4: 00000000000007f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffff88007c861ef8 ffffffff81000312 ffffffff824466a8 ffff88007c85e650
0000000000000003 0000000000000000 ffff88007c861f38 ffffffff82335e5d
ffff88007c862080 ffffffff8223d8c0 ffff88007c862080 ffffffff81c47760
Call Trace:
[<ffffffff81000312>] do_one_initcall+0x122/0x170
[<ffffffff82335e5d>] kernel_init_freeable+0x9b/0x1c8
[<ffffffff81c47760>] ? rest_init+0x140/0x140
[<ffffffff81c4776e>] kernel_init+0xe/0xf0
[<ffffffff81c6be9c>] ret_from_fork+0x7c/0xb0
[<ffffffff81c47760>] ? rest_init+0x140/0x140
...
v2: Typo fix in x86-32.
v3: CPU number dropped from show_regs_print_info() as
dump_stack_print_info() has been updated to print it. s390
specific implementation dropped as requested by s390 maintainers.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com> [tile bits]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Both dump_stack() and show_stack() are currently implemented by each
architecture. show_stack(NULL, NULL) dumps the backtrace for the
current task as does dump_stack(). On some archs, dump_stack() prints
extra information - pid, utsname and so on - in addition to the
backtrace while the two are identical on other archs.
The usages in arch-independent code of the two functions indicate
show_stack(NULL, NULL) should print out bare backtrace while
dump_stack() is used for debugging purposes when something went wrong,
so it does make sense to print additional information on the task which
triggered dump_stack().
There's no reason to require archs to implement two separate but mostly
identical functions. It leads to unnecessary subtle information.
This patch expands the dummy fallback dump_stack() implementation in
lib/dump_stack.c such that it prints out debug information (taken from
x86) and invokes show_stack(NULL, NULL) and drops arch-specific
dump_stack() implementations in all archs except blackfin. Blackfin's
dump_stack() does something wonky that I don't understand.
Debug information can be printed separately by calling
dump_stack_print_info() so that arch-specific dump_stack()
implementation can still emit the same debug information. This is used
in blackfin.
This patch brings the following behavior changes.
* On some archs, an extra level in backtrace for show_stack() could be
printed. This is because the top frame was determined in
dump_stack() on those archs while generic dump_stack() can't do that
reliably. It can be compensated by inlining dump_stack() but not
sure whether that'd be necessary.
* Most archs didn't use to print debug info on dump_stack(). They do
now.
An example WARN dump follows.
WARNING: at kernel/workqueue.c:4841 init_workqueues+0x35/0x505()
Hardware name: empty
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.9.0-rc1-work+ #9
0000000000000009 ffff88007c861e08 ffffffff81c614dc ffff88007c861e48
ffffffff8108f50f ffffffff82228240 0000000000000040 ffffffff8234a03c
0000000000000000 0000000000000000 0000000000000000 ffff88007c861e58
Call Trace:
[<ffffffff81c614dc>] dump_stack+0x19/0x1b
[<ffffffff8108f50f>] warn_slowpath_common+0x7f/0xc0
[<ffffffff8108f56a>] warn_slowpath_null+0x1a/0x20
[<ffffffff8234a071>] init_workqueues+0x35/0x505
...
v2: CPU number added to the generic debug info as requested by s390
folks and dropped the s390 specific dump_stack(). This loses %ksp
from the debug message which the maintainers think isn't important
enough to keep the s390-specific dump_stack() implementation.
dump_stack_print_info() is moved to kernel/printk.c from
lib/dump_stack.c. Because linkage is per objecct file,
dump_stack_print_info() living in the same lib file as generic
dump_stack() means that archs which implement custom dump_stack()
- at this point, only blackfin - can't use dump_stack_print_info()
as that will bring in the generic version of dump_stack() too. v1
The v1 patch broke build on blackfin due to this issue. The build
breakage was reported by Fengguang Wu.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [s390 bits]
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon bits]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull SMP/hotplug changes from Ingo Molnar:
"This is a pretty large, multi-arch series unifying and generalizing
the various disjunct pieces of idle routines that architectures have
historically copied from each other and have grown in random, wildly
inconsistent and sometimes buggy directions:
101 files changed, 455 insertions(+), 1328 deletions(-)
this went through a number of review and test iterations before it was
committed, it was tested on various architectures, was exposed to
linux-next for quite some time - nevertheless it might cause problems
on architectures that don't read the mailing lists and don't regularly
test linux-next.
This cat herding excercise was motivated by the -rt kernel, and was
brought to you by Thomas "the Whip" Gleixner."
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
idle: Remove GENERIC_IDLE_LOOP config switch
um: Use generic idle loop
ia64: Make sure interrupts enabled when we "safe_halt()"
sparc: Use generic idle loop
idle: Remove unused ARCH_HAS_DEFAULT_IDLE
bfin: Fix typo in arch_cpu_idle()
xtensa: Use generic idle loop
x86: Use generic idle loop
unicore: Use generic idle loop
tile: Use generic idle loop
tile: Enter idle with preemption disabled
sh: Use generic idle loop
score: Use generic idle loop
s390: Use generic idle loop
powerpc: Use generic idle loop
parisc: Use generic idle loop
openrisc: Use generic idle loop
mn10300: Use generic idle loop
mips: Use generic idle loop
microblaze: Use generic idle loop
...
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>