linux

mirror of https://github.com/torvalds/linux.git synced 2024-12-11 13:41:55 +00:00

Author	SHA1	Message	Date
Linus Torvalds	d895cb1af1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs pile (part one) from Al Viro: "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent locking violations, etc. The most visible changes here are death of FS_REVAL_DOT (replaced with "has ->d_weak_revalidate()") and a new helper getting from struct file to inode. Some bits of preparation to xattr method interface changes. Misc patches by various people sent this cycle and ocfs2 fixes from several cycles ago that should've been upstream right then. PS: the next vfs pile will be xattr stuff." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits) saner proc_get_inode() calling conventions proc: avoid extra pde_put() in proc_fill_super() fs: change return values from -EACCES to -EPERM fs/exec.c: make bprm_mm_init() static ocfs2/dlm: use GFP_ATOMIC inside a spin_lock ocfs2: fix possible use-after-free with AIO ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero target: writev() on single-element vector is pointless export kernel_write(), convert open-coded instances fs: encode_fh: return FILEID_INVALID if invalid fid_type kill f_vfsmnt vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op nfsd: handle vfs_getattr errors in acl protocol switch vfs_getattr() to struct path default SET_PERSONALITY() in linux/elf.h ceph: prepopulate inodes only when request is aborted d_hash_and_lookup(): export, switch open-coded instances 9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate() 9p: split dropping the acls from v9fs_set_create_acl() ...	2013-02-26 20:16:07 -08:00
Wen Congyang	24d335ca36	memory-hotplug: introduce new arch_remove_memory() for removing page table For removing memory, we need to remove page tables. But it depends on architecture. So the patch introduce arch_remove_memory() for removing page table. Now it only calls __remove_pages(). Note: __remove_pages() for some archtecuture is not implemented (I don't know how to implement it for s390). Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Wu Jianguo <wujianguo@huawei.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-02-23 17:50:12 -08:00
Al Viro	496ad9aa8e	new helper: file_inode(file) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-02-22 23:31:31 -05:00
Kees Cook	0d57af1ec7	arch/sh: remove depends on CONFIG_EXPERIMENTAL The CONFIG_EXPERIMENTAL config item has not carried much meaning for a while now and is almost always enabled by default. As agreed during the Linux kernel summit, remove it from any "depends on" lines in Kconfigs. CC: Paul Mundt <lethal@linux-sh.org> CC: Tejun Heo <tj@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-21 14:43:13 -08:00
Linus Torvalds	3d59eebc5e	Automatic NUMA Balancing V11 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQIcBAABAgAGBQJQx0kQAAoJEHzG/DNEskfi4fQP/R5PRovayroZALBMLnVJDaLD Ttr9p40VNXbiJ+MfRgatJjSSJZ4Jl+fC3NEqBhcwVZhckZZb9R2s0WtrSQo5+ZbB vdRfiuKoCaKM4cSZ08C12uTvsF6xjhjd27CTUlMkyOcDoKxMEFKelv0hocSxe4Wo xqlv3eF+VsY7kE1BNbgBP06SX4tDpIHRxXfqJPMHaSKQmre+cU0xG2GcEu3QGbHT DEDTI788YSaWLmBfMC+kWoaQl1+bV/FYvavIAS8/o4K9IKvgR42VzrXmaFaqrbgb 72ksa6xfAi57yTmZHqyGmts06qYeBbPpKI+yIhCMInxA9CY3lPbvHppRf0RQOyzj YOi4hovGEMJKE+BCILukhJcZ9jCTtS3zut6v1rdvR88f4y7uhR9RfmRfsxuW7PNj 3Rmh191+n0lVWDmhOs2psXuCLJr3LEiA0dFffN1z8REUTtTAZMsj8Rz+SvBNAZDR hsJhERVeXB6X5uQ5rkLDzbn1Zic60LjVw7LIp6SF2OYf/YKaF8vhyWOA8dyCEu8W CGo7AoG0BO8tIIr8+LvFe8CweypysZImx4AjCfIs4u9pu/v11zmBvO9NO5yfuObF BreEERYgTes/UITxn1qdIW4/q+Nr0iKO3CTqsmu6L1GfCz3/XzPGs3U26fUhllqi Ka0JKgnWvsa6ez6FSzKI =ivQa -----END PGP SIGNATURE----- Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma Pull Automatic NUMA Balancing bare-bones from Mel Gorman: "There are three implementations for NUMA balancing, this tree (balancenuma), numacore which has been developed in tip/master and autonuma which is in aa.git. In almost all respects balancenuma is the dumbest of the three because its main impact is on the VM side with no attempt to be smart about scheduling. In the interest of getting the ball rolling, it would be desirable to see this much merged for 3.8 with the view to building scheduler smarts on top and adapting the VM where required for 3.9. The most recent set of comparisons available from different people are mel: https://lkml.org/lkml/2012/12/9/108 mingo: https://lkml.org/lkml/2012/12/7/331 tglx: https://lkml.org/lkml/2012/12/10/437 srikar: https://lkml.org/lkml/2012/12/10/397 The results are a mixed bag. In my own tests, balancenuma does reasonably well. It's dumb as rocks and does not regress against mainline. On the other hand, Ingo's tests shows that balancenuma is incapable of converging for this workloads driven by perf which is bad but is potentially explained by the lack of scheduler smarts. Thomas' results show balancenuma improves on mainline but falls far short of numacore or autonuma. Srikar's results indicate we all suffer on a large machine with imbalanced node sizes. My own testing showed that recent numacore results have improved dramatically, particularly in the last week but not universally. We've butted heads heavily on system CPU usage and high levels of migration even when it shows that overall performance is better. There are also cases where it regresses. Of interest is that for specjbb in some configurations it will regress for lower numbers of warehouses and show gains for higher numbers which is not reported by the tool by default and sometimes missed in treports. Recently I reported for numacore that the JVM was crashing with NullPointerExceptions but currently it's unclear what the source of this problem is. Initially I thought it was in how numacore batch handles PTEs but I'm no longer think this is the case. It's possible numacore is just able to trigger it due to higher rates of migration. These reports were quite late in the cycle so I/we would like to start with this tree as it contains much of the code we can agree on and has not changed significantly over the last 2-3 weeks." * tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits) mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable mm/rmap: Convert the struct anon_vma::mutex to an rwsem mm: migrate: Account a transhuge page properly when rate limiting mm: numa: Account for failed allocations and isolations as migration failures mm: numa: Add THP migration for the NUMA working set scanning fault case build fix mm: numa: Add THP migration for the NUMA working set scanning fault case. mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG mm: sched: numa: Control enabling and disabling of NUMA balancing mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task<->node relationships mm: numa: migrate: Set last_nid on newly allocated page mm: numa: split_huge_page: Transfer last_nid on tail page mm: numa: Introduce last_nid to the page frame sched: numa: Slowly increase the scanning period as NUMA faults are handled mm: numa: Rate limit setting of pte_numa if node is saturated mm: numa: Rate limit the amount of memory that is migrated between nodes mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting mm: numa: Migrate pages handled during a pmd_numa hinting fault mm: numa: Migrate on reference policy ...	2012-12-16 15:18:08 -08:00
David Rientjes	c2d23f919b	mm, oom: remove statically defined arch functions of same name out_of_memory() is a globally defined function to call the oom killer. x86, sh, and powerpc all use a function of the same name within file scope in their respective fault.c unnecessarily. Inline the functions into the pagefault handlers to clean the code up. Signed-off-by: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Reviewed-by: Michal Hocko <mhocko@suse.cz> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-12 17:38:34 -08:00
Linus Torvalds	608ff1a210	Merge branch 'akpm' (Andrew's patchbomb) Merge misc updates from Andrew Morton: "About half of most of MM. Going very early this time due to uncertainty over the coreautounifiednumasched things. I'll send the other half of most of MM tomorrow. The rest of MM awaits a slab merge from Pekka." * emailed patches from Andrew Morton: (71 commits) memory_hotplug: ensure every online node has NORMAL memory memory_hotplug: handle empty zone when online_movable/online_kernel mm, memory-hotplug: dynamic configure movable memory and portion memory drivers/base/node.c: cleanup node_state_attr[] bootmem: fix wrong call parameter for free_bootmem() avr32, kconfig: remove HAVE_ARCH_BOOTMEM mm: cma: remove watermark hacks mm: cma: skip watermarks check for already isolated blocks in split_free_page() mm, oom: fix race when specifying a thread as the oom origin mm, oom: change type of oom_score_adj to short mm: cleanup register_node() mm, mempolicy: remove duplicate code mm/vmscan.c: try_to_freeze() returns boolean mm: introduce putback_movable_pages() virtio_balloon: introduce migration primitives to balloon pages mm: introduce compaction and migration for ballooned pages mm: introduce a common interface for balloon pages mobility mm: redefine address_space.assoc_mapping mm: adjust address_space_operations.migratepage() return code arch/sparc/kernel/sys_sparc_64.c: s/COLOUR/COLOR/ ...	2012-12-11 18:05:37 -08:00
Michel Lespinasse	b4265f1234	mm: use vm_unmapped_area() on sh architecture Update the sh arch_get_unmapped_area[_topdown] functions to make use of vm_unmapped_area() instead of implementing a brute force search. [akpm@linux-foundation.org: remove now-unused COLOUR_ALIGN_DOWN()] Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-11 17:22:26 -08:00
Peter Zijlstra	cbee9f88ec	mm: numa: Add fault driven placement and migration NOTE: This patch is based on "sched, numa, mm: Add fault driven placement and migration policy" but as it throws away all the policy to just leave a basic foundation I had to drop the signed-offs-by. This patch creates a bare-bones method for setting PTEs pte_numa in the context of the scheduler that when faulted later will be faulted onto the node the CPU is running on. In itself this does nothing useful but any placement policy will fundamentally depend on receiving hints on placement from fault context and doing something intelligent about it. Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Rik van Riel <riel@redhat.com>	2012-12-11 14:42:45 +00:00
Cyril Chemparathy	7e6735c357	/dev/mem: use phys_addr_t for physical addresses This patch fixes the /dev/mem driver to use phys_addr_t for physical addresses. This is required on PAE systems, especially those that run entirely out of >4G physical memory space. Signed-off-by: Cyril Chemparathy <cyril@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-10-24 15:32:50 -07:00
Shaohua Li	45cac65b0f	readahead: fault retry breaks mmap file read random detection .fault now can retry. The retry can break state machine of .fault. In filemap_fault, if page is miss, ra->mmap_miss is increased. In the second try, since the page is in page cache now, ra->mmap_miss is decreased. And these are done in one fault, so we can't detect random mmap file access. Add a new flag to indicate .fault is tried once. In the second try, skip ra->mmap_miss decreasing. The filemap_fault state machine is ok with it. I only tested x86, didn't test other archs, but looks the change for other archs is obvious, but who knows :) Signed-off-by: Shaohua Li <shaohua.li@fusionio.com> Cc: Rik van Riel <riel@redhat.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:47 +09:00
Paul Mundt	90eed7d87b	sh: Fix up recursive fault in oops with unset TTB. Presently the oops code looks for the pgd either from the mm context or the cached TTB value. There are presently cases where the TTB can be unset or otherwise cleared by hardware, which we weren't handling, resulting in recursive faults on the NULL pgd. In these cases we can simply reload from swapper_pg_dir and continue on as normal. Cc: stable@vger.kernel.org Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-07-25 13:11:13 +09:00
Paul Mundt	0412ddc822	sh64: Fix up section mismatch warnings. WARNING: vmlinux.o(.cpuinit.text+0x280): Section mismatch in reference from the function cpu_probe() to the function .init.text:sh64_tlb_init() The function __cpuinit cpu_probe() references a function __init sh64_tlb_init(). If sh64_tlb_init is only used by cpu_probe then annotate sh64_tlb_init with a matching annotation. sh64_tlb_init() simply needs to be __cpuinit annotated, so fix that up. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-06-14 15:05:53 +09:00
Paul Mundt	d8fd35fc58	sh64: Fix up vmalloc fault range check. With the previous attempt reverted this switches to conditionalizing the end address. Nominally VMALLOC_END, but extended for P3_ADDR_MAX in the store queue case. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-05-18 20:01:16 +09:00
Paul Mundt	c3e0af9879	Revert "sh: Ensure fixmap and store queue space can co-exist." This reverts commit `20e7c297ef`. With store queues enabled the area above P4SEG has special properties from the MMU's point of view, which was causing fixmap failure. We'll have to do something else to satisfy the vmalloc range check. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-05-18 19:30:05 +09:00
Paul Mundt	fd37e75ed5	sh64: Set additional fault code values. The SSR.MD status amongst other things are already made available, which can be used for encoding a more precise fault code value. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-05-14 17:46:49 +09:00
Paul Mundt	392c3822a6	sh64: Tidy up and consolidate the TLB miss fast path. This unifies the fast-path TLB miss handler, allowing for further cleanup and eventual utilization of a shared _32/_64 handler. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-05-14 17:24:21 +09:00
Paul Mundt	2ec08e141f	sh64: Fix up caller-save register settings for fast-path. Now that the fast-path handler has been moved, we also need to update the Makefile to ensure that the same restrictions for caller-save registers are observed. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-05-14 16:46:07 +09:00
Paul Mundt	4de5185629	sh64: Invert page fault fast-path error path values. This brings the sh64 version in line with the sh32 one with regards to how errors are handled. Base work for further unification of the implementations. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-05-14 16:44:45 +09:00
Paul Mundt	c06fd28387	sh64: Migrate to __update_tlb() API. Now that we have a method for finding out if we're handling an ITLB fault or not without passing it all the way down the chain, it's possible to use the __update_tlb() interface in place of a special __do_tlb_refill(). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-05-14 15:52:28 +09:00
Paul Mundt	28080329ed	sh: Enable shared page fault handler for _32/_64. This moves the now generic _32 page fault handling code to a shared place and adapts the _64 implementation to make use of it. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-05-14 15:33:28 +09:00
Paul Mundt	e45af0e083	sh64: Kill off unused fixed I/O mapping window. This was reworked some time ago to go through fixmaps instead, leaving the range itself unused. As such, kill off the remaining references and hand over the remaining space for fixmaps directly. This also makes it possible to simplify the vmalloc fault case as we no longer have to care about the special section. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-05-14 15:16:11 +09:00
Paul Mundt	20e7c297ef	sh: Ensure fixmap and store queue space can co-exist. At the moment the top of the fixmap space is calculated from P4SEG, which places it at the end of the store queue space when that API is enabled. Make sure we use P3_ADDR_MAX here instead to find the proper address limit. With this done, it's also possible to switch to the generic vmalloc address range check now that VMALLOC_START/END encapsulate the translatable areas that we care about. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-05-14 15:11:35 +09:00
Paul Mundt	9a7b7739f9	sh64: Utilize thread fault code encoding. This plugs in fault code encoding for the sh64 page fault, too. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-05-14 15:07:52 +09:00
Paul Mundt	5a1dc78a38	sh: Support thread fault code encoding. This provides a simple interface modelled after sparc64/m32r to encode the error code in the upper byte of thread_info for finer-grained handling in the page fault path. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-05-14 14:57:28 +09:00
Paul Mundt	dbdb4e9f3f	sh: Tidy up and generalize page fault error paths. This follows the x86 changes for tidying up the page fault error paths. We'll build on top of this for _32/_64 unification. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-05-14 10:27:34 +09:00
Paul Mundt	b2212ea41d	sh64: Kill off unused trap_no/error_code from thread_struct. While the trap number and error code are passed around for debugging purposes, this occurs wholly independently of the thread struct values. These values were never part of the sigcontext ABI and are thus never passed anywhere, so we can just kill them off across the board. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-04-19 17:52:20 +09:00
Paul Mundt	fb56a91922	Merge branches 'sh/st-integration' and 'sh/stackprotector' into sh-latest	2012-04-19 17:31:59 +09:00
Stuart Menefy	45c0e0e25e	sh: Improve oops error reporting In some cases the opps error reporting doesn't give enough information to diagnose the problem, only printing information if it is thought to be valid. Replace the current code with more detailed output. This code is based on the ARM reporting, with minor changes for the SH. [lethal@linux-sh.org: fixed up for 64-bit PTEs and pte_offset_kernel()] Signed-off-by: Stuart Menefy <stuart.menefy@st.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-04-19 17:25:03 +09:00
Stuart Menefy	8d9a784d1e	sh: Fix error synchronising kernel page tables The problem is caused by the interaction of two features in the Linux memory management code. A processes address space is described by a struct mm_struct, and every thread has a pointer to the mm it should run in. The exception to this are kernel threads, which don't have an mm, and so borrow the mm from the last thread which ran. The system is bootstrapped by the initial kernel thread using init's mm (even though init hasn't been created yet, its mm is the static init_mm). The other feature is how the kernel handles the page table which describes the portion of the address space which is only visible when executing inside the kernel, and which is shared by all threads. On the SH4 the only portion of the kernel's address space which described using the page table is called P3, from 0xc0000000 to 0xdfffffff. This portion of the address space is divided into three: - mappings for dma_alloc_coherent() - mappings for vmalloc() and ioremap() - fixmap mappings, primarily used in copy_user_pages() to create kernel mappings of user pages with the correct cache colour. To optimise the TLB miss handler we don't want to add an additional condition which checks whether the faulting address is in the user or the kernel portion of the address space, and so all page tables have a common portion which describes the kernel part of the address space. As the SH4 uses a two level page table, only the kernel portion of first level page table (the pgd entries) is duplicated. These all point to the same second level entries (the pte's), and so no memory is wasted. The reference page table for the kernel is called the swapper_pg_dir, and when a new page table is created for a new process the kernel portion of the page table is copied from swapper_pg_dir. This works fine when changes only occur in the second level of the kernel's page table, or the first level entries are created before any new user processes. However if a change occurs to the first level of the page table, and there are existing processes which don't have this entry in their page table, this new entry needs to be added. This is done on demand, when the kernel accesses a P3 address which isn't mapped using the current page table, the code in vmalloc_fault() copies the entry from the reference page table (swapper_pg_dir) into the current processes page table. The bug which this patch addresses is that the code in vmalloc_fault() was not copying addresses which fell in the dma_alloc_coherent() portion of the address space, and it should have been copying any P3 address. Why we hadn't seen this before, and what made this hard to reproduce, is that normally the kernel will have called dma_alloc_coherent(), and accessed the memory mapping created, before any user process runs. Typically drivers such as USB or SATA will have created and used mappings of this type during the kernel initialisation, when probing for the attached devices, before init runs. Ethernet is slightly different, as it normally only creates and accesses dma_alloc_coherent() mappings when the network is brought up, but if kernel level IP configuration is used this will also occur before any user space process runs. So the first reproduction of this problem which we saw was occurred when USB and SATA were removed from the kernel, and then bring up Ethernet from user space using ifconfig. I'd like to thank Joseph Bormolini who did the hard work reducing the problem to this simple to reproduce criteria. In your case the situation is slightly different, and turns out to depends on the exact kernel configuration (which we had) and your ramdisk contents (which we didn't - hence the need for some assumptions). In this case the problem is a side effect of kernel level module loading. Kernel subsystems sometimes trigger the load of kernel modules directly, for example the crypto subsystem tries to load the cryptomgr and MTD tries to load modules for Flash partitioning if these are not built into the kernel. This is done by the kernel creating a user process which runs insmod to try and load the appropriate module. In order for this to cause problems the system must be running with a initrd or initramfs, which contains an insmod executable - if the kernel can't find an insmod to run, no user process is created, and the problem doesn't occur. If an insmod is found, a process is created to run it, which will inherit the kernel portion of the swapper_pg_dir first level page table. It doesn't matter whether the inmod is successful or not, but when the the kernel scheduler context switches back to the kernel initialisation thread, the insmod's mm is 'borrowed' by the kernel thread, as it doesn't have an address space of its own. (Reference counting is used to ensure this mm is not destroyed, even though the user process which caused its creation may no longer exist.) If this address space doesn't have a first level page table entry for the consistent mappings, and a driver tries to access such a mapping, we are in the same situation as described above, except this time in a kernel thread rather than a user thread executing inside the kernel. See bugzilla: 15425, 15836, 15862, 16106, 16793 Signed-off-by: Stuart Menefy <stuart.menefy@st.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-04-19 15:57:44 +09:00
Paul Mundt	ba2a3cdf76	sh64: Kill off dead page fault debug cruft. In the future we'll be unifying some of the 32/64 page fault path, so start to tidy up the _64 one by killing off some of the unused debug cruft. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-04-11 12:53:06 +09:00
Paul Mundt	a1e2030122	sh64: Port OOM changes to do_page_fault Reflect the sh32 OOM changes for the sh64 page fault handler, too. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-04-11 12:44:50 +09:00
Kautuk Consul	11fd982400	sh/mm/fault_32.c: Port OOM changes to do_page_fault Commit `d065bd810b` (mm: retry page fault when blocking on disk transfer) and commit `37b23e0525` (x86,mm: make pagefault killable) The above commits introduced changes into the x86 pagefault handler for making the page fault handler retryable as well as killable. These changes reduce the mmap_sem hold time, which is crucial during OOM killer invocation. Port these changes to the 32-bit SH platform. Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-04-11 12:37:54 +09:00
Linus Torvalds	664481ed45	SuperH updates for 3.4-rc1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iEYEABECAAYFAk99uBgACgkQGkmNcg7/o7hglwCgqi6CE7i5gyneNYBn2ocRps4O y1UAoMSIscO6YWcHPuxOiNBbJYUy/jMI =SEO8 -----END PGP SIGNATURE----- Merge tag 'sh-for-linus' of git://github.com/pmundt/linux-sh Pull SuperH fixes from Paul Mundt. * tag 'sh-for-linus' of git://github.com/pmundt/linux-sh: sh: fix clock-sh7757 for the latest sh_mobile_sdhi driver serial: sh-sci: use serial_port_in/out vs sci_in/out. sh: vsyscall: Fix up .eh_frame generation. sh: dma: Fix up device attribute mismatch from sysdev fallout. sh: dwarf unwinder depends on SHcompact. sh: fix up fallout from system.h disintegration.	2012-04-07 09:52:46 -07:00
Linus Torvalds	58bca4a8fa	Merge branch 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping Pull DMA mapping branch from Marek Szyprowski: "Short summary for the whole series: A few limitations have been identified in the current dma-mapping design and its implementations for various architectures. There exist more than one function for allocating and freeing the buffers: currently these 3 are used dma_{alloc, free}_coherent, dma_{alloc,free}_writecombine, dma_{alloc,free}_noncoherent. For most of the systems these calls are almost equivalent and can be interchanged. For others, especially the truly non-coherent ones (like ARM), the difference can be easily noticed in overall driver performance. Sadly not all architectures provide implementations for all of them, so the drivers might need to be adapted and cannot be easily shared between different architectures. The provided patches unify all these functions and hide the differences under the already existing dma attributes concept. The thread with more references is available here: http://www.spinics.net/lists/linux-sh/msg09777.html These patches are also a prerequisite for unifying DMA-mapping implementation on ARM architecture with the common one provided by dma_map_ops structure and extending it with IOMMU support. More information is available in the following thread: http://thread.gmane.org/gmane.linux.kernel.cross-arch/12819 More works on dma-mapping framework are planned, especially in the area of buffer sharing and managing the shared mappings (together with the recently introduced dma_buf interface: commit `d15bd7ee44` "dma-buf: Introduce dma buffer sharing mechanism"). The patches in the current set introduce a new alloc/free methods (with support for memory attributes) in dma_map_ops structure, which will later replace dma_alloc_coherent and dma_alloc_writecombine functions." People finally started piping up with support for merging this, so I'm merging it as the last of the pending stuff from the merge window. Looks like pohmelfs is going to wait for 3.5 and more external support for merging. * 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping: common: DMA-mapping: add NON-CONSISTENT attribute common: DMA-mapping: add WRITE_COMBINE attribute common: dma-mapping: introduce mmap method common: dma-mapping: remove old alloc_coherent and free_coherent methods Hexagon: adapt for dma_map_ops changes Unicore32: adapt for dma_map_ops changes Microblaze: adapt for dma_map_ops changes SH: adapt for dma_map_ops changes Alpha: adapt for dma_map_ops changes SPARC: adapt for dma_map_ops changes PowerPC: adapt for dma_map_ops changes MIPS: adapt for dma_map_ops changes X86 & IA64: adapt for dma_map_ops changes common: dma-mapping: introduce generic alloc() and free() methods	2012-04-04 17:13:43 -07:00
Paul Mundt	f03c4866d3	sh: fix up fallout from system.h disintegration. Quite a bit of fallout all over the place, nothing terribly exciting. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-03-30 19:29:57 +09:00
David Howells	e839ca5287	Disintegrate asm/system.h for SH Disintegrate asm/system.h for SH. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-sh@vger.kernel.org	2012-03-28 18:30:03 +01:00
Andrzej Pietrasiewicz	552c0d3ea6	SH: adapt for dma_map_ops changes Adapt core SH architecture code for dma_map_ops changes: replace alloc/free_coherent with generic alloc/free methods. Signed-off-by: Andrzej Pietrasiewicz <andrzej.p@samsung.com> Acked-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Paul Mundt <lethal@linux-sh.org>	2012-03-28 16:36:37 +02:00
Cong Wang	bc3e11be88	sh: remove the second argument of k[un]map_atomic() Signed-off-by: Cong Wang <amwang@redhat.com>	2012-03-20 21:48:15 +08:00
Phil Edworthy	1ae911cba4	sh: Fix sh2a build error for CONFIG_CACHE_WRITETHROUGH Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-02-24 13:21:46 +09:00
Phil Edworthy	c1537b4863	sh: sh2a: Improve cache flush/invalidate functions The cache functions lock out interrupts for long periods; this patch reduces the impact when operating on large address ranges. In such cases it will: - Invalidate the entire cache rather than individual addresses. - Do nothing when flushing the operand cache in write-through mode. - When flushing the operand cache in write-back mdoe, index the search for matching addresses on the cache entires instead of the addresses to flush Note: sh2a__flush_purge_region was only invalidating the operand cache, this adds flush. Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2012-01-12 13:11:02 +09:00
Tejun Heo	0ee332c145	memblock: Kill early_node_map[] Now all ARCH_POPULATES_NODE_MAP archs select HAVE_MEBLOCK_NODE_MAP - there's no user of early_node_map[] left. Kill early_node_map[] and replace ARCH_POPULATES_NODE_MAP with HAVE_MEMBLOCK_NODE_MAP. Also, relocate for_each_mem_pfn_range() and helper from mm.h to memblock.h as page_alloc.c would no longer host an alternative implementation. This change is ultimately one to one mapping and shouldn't cause any observable difference; however, after the recent changes, there are some functions which now would fit memblock.c better than page_alloc.c and dependency on HAVE_MEMBLOCK_NODE_MAP instead of HAVE_MEMBLOCK doesn't make much sense on some of them. Further cleanups for functions inside HAVE_MEMBLOCK_NODE_MAP in mm.h would be nice. -v2: Fix compile bug introduced by mis-spelling CONFIG_HAVE_MEMBLOCK_NODE_MAP to CONFIG_MEMBLOCK_HAVE_NODE_MAP in mmzone.h. Reported by Stephen Rothwell. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: "H. Peter Anvin" <hpa@zytor.com>	2011-12-08 10:22:09 -08:00
Tejun Heo	1aadc0560f	memblock: s/memblock_analyze()/memblock_allow_resize()/ and update users The only function of memblock_analyze() is now allowing resize of memblock region arrays. Rename it to memblock_allow_resize() and update its users. * The following users remain the same other than renaming. arm/mm/init.c::arm_memblock_init() microblaze/kernel/prom.c::early_init_devtree() powerpc/kernel/prom.c::early_init_devtree() openrisc/kernel/prom.c::early_init_devtree() sh/mm/init.c::paging_init() sparc/mm/init_64.c::paging_init() unicore32/mm/init.c::uc32_memblock_init() * In the following users, analyze was used to update total size which is no longer necessary. powerpc/kernel/machine_kexec.c::reserve_crashkernel() powerpc/kernel/prom.c::early_init_devtree() powerpc/mm/init_32.c::MMU_init() powerpc/mm/tlb_nohash.c::__early_init_mmu() powerpc/platforms/ps3/mm.c::ps3_mm_add_memory() powerpc/platforms/embedded6xx/wii.c::wii_memory_fixups() sh/kernel/machine_kexec.c::reserve_crashkernel() * x86/kernel/e820.c::memblock_x86_fill() was directly setting memblock_can_resize before populating memblock and calling analyze afterwards. Call memblock_allow_resize() before start populating. memblock_can_resize is now static inside memblock.c. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Michal Simek <monstr@monstr.eu> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: "H. Peter Anvin" <hpa@zytor.com>	2011-12-08 10:22:08 -08:00
Tejun Heo	fe091c208a	memblock: Kill memblock_init() memblock_init() initializes arrays for regions and memblock itself; however, all these can be done with struct initializers and memblock_init() can be removed. This patch kills memblock_init() and initializes memblock with struct initializer. The only difference is that the first dummy entries don't have .nid set to MAX_NUMNODES initially. This doesn't cause any behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Michal Simek <monstr@monstr.eu> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: "H. Peter Anvin" <hpa@zytor.com>	2011-12-08 10:22:07 -08:00
Linus Torvalds	32aaeffbd4	Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h	2011-11-06 19:44:47 -08:00
Paul Gortmaker	f7be345515	sh: Add export.h to arch/sh specific files as required. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2011-10-31 19:31:05 -04:00
Simon Horman	e66ac3f26a	sh: kexec: Add PHYSICAL_START Add PHYSICAL_START kernel configuration parameter to set the address at which the kernel should be loaded. It has been observed on an sh7757lcr that simply modifying MEMORY_START does not achieve this goal for 32bit sh. This is due to MEMORY_OFFSET in arch/sh/kernel/vmlinux.lds.S bot being based on MEMORY_START on such systems. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2011-10-28 15:03:43 +09:00
Linus Torvalds	4d4abdcb1d	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (123 commits) perf: Remove the nmi parameter from the oprofile_perf backend x86, perf: Make copy_from_user_nmi() a library function perf: Remove perf_event_attr::type check x86, perf: P4 PMU - Fix typos in comments and style cleanup perf tools: Make test use the preset debugfs path perf tools: Add automated tests for events parsing perf tools: De-opt the parse_events function perf script: Fix display of IP address for non-callchain path perf tools: Fix endian conversion reading event attr from file header perf tools: Add missing 'node' alias to the hw_cache[] array perf probe: Support adding probes on offline kernel modules perf probe: Add probed module in front of function perf probe: Introduce debuginfo to encapsulate dwarf information perf-probe: Move dwarf library routines to dwarf-aux.{c, h} perf probe: Remove redundant dwarf functions perf probe: Move strtailcmp to string.c perf probe: Rename DIE_FIND_CB_FOUND to DIE_FIND_CB_END tracing/kprobe: Update symbol reference when loading module tracing/kprobes: Support module init function probing kprobes: Return -ENOENT if probe point doesn't exist ...	2011-07-22 16:44:39 -07:00
Peter Zijlstra	a8b0ca17b8	perf: Remove the nmi parameter from the swevent and overflow interface The nmi parameter indicated if we could do wakeups from the current context, if not, we would set some state and self-IPI and let the resulting interrupt do the wakeup. For the various event classes: - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from the PMI-tail (ARM etc.) - tracepoint: nmi=0; since tracepoint could be from NMI context. - software: nmi=[0,1]; some, like the schedule thing cannot perform wakeups, and hence need 0. As one can see, there is very little nmi=1 usage, and the down-side of not using it is that on some platforms some software events can have a jiffy delay in wakeup (when arch_irq_work_raise isn't implemented). The up-side however is that we can remove the nmi parameter and save a bunch of conditionals in fast paths. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Will Deacon <will.deacon@arm.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:35 +02:00
Paul Mundt	9ab3a15d95	sh: use printk_ratelimited instead of printk_ratelimit Follows the powerpc change, for much the same rationale. Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2011-06-30 15:10:06 +09:00

1 2 3 4 5 ...

564 Commits