Just make it depend on BROKEN for now, in case people scream really loud
about it (and because we might want to keep some of this logic for an
upcoming BIOS workaround, so I don't just want to rip it out entirely
just yet). But for graphics devices, it really ought to be unnecessary.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The following 64 bit promotions are necessary to handle memory above the
4GiB boundary correctly.
[dwmw2: Fix the second part not to need 64-bit arithmetic at all]
Signed-off-by: Benjamin LaHaise <ben.lahaise@neterion.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
If end_pfn is equal to (unsigned long)-1, then the loop will never end.
Seen on 32-bit kernel, but could have happened on 64-bit too once we get
hardware that supports 64-bit guest addresses.
Change both functions to a 'do {} while' loop with the test at the end,
and check for the PFN having wrapper round to zero.
Reported-by: Benjamin LaHaise <ben.lahaise@neterion.com>
Tested-by: Benjamin LaHaise <ben.lahaise@neterion.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This means we're limited to 44-bit addresses on 32-bit kernels, and
makes it sane for us to use 'unsigned long' for PFNs throughout.
Which is just as well, really, since we already do that.
Reported-by: Benjamin LaHaise <ben.lahaise@neterion.com>
Tested-by: Benjamin LaHaise <ben.lahaise@neterion.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
BIOS clear DMAR table INTR_REMAP flag to disable interrupt remapping. Current
kernel only check interrupt remapping(IR) flag in DRHD's extended capability
register to decide interrupt remapping support or not. But IR flag will not
change when BIOS disable/enable interrupt remapping.
When user disable interrupt remapping in BIOS or BIOS often defaultly disable
interrupt remapping feature when BIOS is not mature.Though BIOS disable
interrupt remapping but intr_remapping_supported function will always report
to OS support interrupt remapping if VT-d2 chipset populated. On this
cases, kernel will continue enable interrupt remapping and result kernel panic.
This bug exist on almost all platforms with interrupt remapping support.
This patch add DMAR table INTR_REMAP flag check before enable interrupt
remapping.
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Current kernel enable interrupt remapping only when all the vt-d unit support
interrupt remapping. So it is reasonable we should also disallow enabling
intr-remapping if there any io-apics that are not listed under vt-d units.
Otherwise we can run into issues.
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
This file needs to include linux/dmi.h directly rather than relying on
it being pulled in from elsewhere.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
DMAR faults are recorded into a ring of "fault recording registers".
fault_index is a 0-based index into the ring. The code allows the
0-based fault_index to be equal to the total number of fault registers
available from the cap_num_fault_regs() macro, which causes access
beyond the last available register.
Signed-off-by Troy Heber <troy.heber@hp.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The kcalloc() failure path in iommu_init_domains() calls
free_dmar_iommu(), which assumes that ->domains, ->domain_ids,
and ->lock have been properly initialized.
Add checks in free_[dmar]_iommu to not use ->domains,->domain_ids
if not alloced. Move the lock init to prior to the kcalloc()'s,
so it is valid in free_context_table() when free_dmar_iommu() invokes
it at the end.
Patch based on iommu-2.6,
commit 132032274a
Signed-off-by: Donald Dutile <ddutile@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Mark si_domain_init and iommu_prepare_static_identity_mapping with
__init, to eliminate the following warnings:
WARNING: drivers/pci/built-in.o(.text+0xf1f4): Section mismatch in reference from the function si_domain_init() to the function .init.text:si_domain_work_fn()
The function si_domain_init() references
the function __init si_domain_work_fn().
This is often because si_domain_init lacks a __init
annotation or the annotation of si_domain_work_fn is wrong.
WARNING: drivers/pci/built-in.o(.text+0xe340): Section mismatch in reference from the function iommu_prepare_static_identity_mapping() to the function .init.text:si_domain_init()
The function iommu_prepare_static_identity_mapping() references
the function __init si_domain_init().
This is often because iommu_prepare_static_identity_mapping lacks a __init
annotation or the annotation of si_domain_init is wrong.
Signed-off-by: Matt Kraai <kraai@ftbfs.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
We are seeing a number of crashes in SMM, when VT-d is enabled while
'Legacy USB support' is enabled in various BIOSes.
The BIOS is supposed to indicate which addresses it uses for DMA in a
special ACPI table ("RMRR"), so that we can punch a hole for it when we
set up the IOMMU.
The problem is, as usual, that BIOS engineers are totally incompetent.
They write code which will crash if the DMA goes AWOL, and then they
either neglect to provide an RMRR table at all, or they put the wrong
addresses in it. And of course they don't do _any_ QA, since that would
take too much time away from their crack-smoking habit.
The real fix, of course, is for consumers to refuse to buy motherboards
which only have closed-source firmware available. If we had _open_
firmware, bugs like this would be easy to fix.
Since that's something I can only dream about, this patch implements an
alternative -- ensuring that the USB controllers are handed off from the
BIOS and quiesced _before_ the IOMMU is initialised. That would have
been a much better design than this RMRR nonsense in the first place, of
course. The bootloader has no business doing DMA after the OS has booted
anyway.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Since commit 19943b0e30 ('intel-iommu:
Unify hardware and software passthrough support'), hardware passthrough
mode will do the same as software passthrough mode was doing -- it'll
still use the IOMMU normally for devices which can't address all of
memory. This means that we don't need to bother with swiotlb.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
All callers of the former were also calling the latter, in one order or
the other, and failing to correctly clean up if the second returned
failure.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf_counter: Fix double list iteration in per task precise stats
perf: Auto-detect libelf
perf symbol: Fix symbol parsing in certain cases: use the build-id as a symlink
perf_counter/powerpc: Check oprofile_cpu_type for NULL before using it
ftrace: Fix perf-tracepoint OOPS
perf report: Add missing command line options to man page
perf: Auto-detect libbfd
perf report: Make --sort comm,dso,symbol the default
* git://git.infradead.org/mtd-2.6:
jffs2: Fix return value from jffs2_do_readpage_nolock()
mtd: mtdblock: introduce mtdblks_lock
mtd: remove 'SBC8240 Wind River' Device Driver Code
mtd: OneNAND: OMAP2/3: free GPMC CS on module removal
mtd: OneNAND: fix incorrect bufferram offset
mtd: blkdevs: do not forget to get MTD devices
mtd: fix the conversion from dev to mtd_info
mtd: let include/linux/mtd/partitions.h stand on its own
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon/kms: setup MC/VRAM the same way for suspend/resume
drm/radeon/kms: Fix caching mode selection for GTT object
The new credentials code broke load_flat_shared_library() as it now uses
an uninitialized cred pointer.
Reported-by: Bernd Schmidt <bernds_cb1@t-online.de>
Tested-by: Bernd Schmidt <bernds_cb1@t-online.de>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These includes were added by 079effb693
("kmemtrace, kbuild: fix slab.h dependency problem in
lib/decompress_inflate.c") to fix the build when using kmemtrace. However
this is not necessary when used to create a compressed kernel, and
actually creates issues (brings a lot of things unavailable in the
decompression environment), so don't include it if STATIC is defined.
Signed-off-by: Albin Tonnerre <albin.tonnerre@free-electrons.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Phillip Lougher <phillip@lougher.demon.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
decompress_bunzip2 and decompress_unlzma have a nasty hack that subtracts
4 from the input length if being called in the pre-boot environment.
This is a nasty hack because it relies on the fact that flush = NULL only
when called from the pre-boot environment (i.e.
arch/x86/boot/compressed/misc.c). initramfs.c/do_mounts_rd.c pass in a
flush buffer (flush != NULL).
This hack prevents the decompressors from being used with flush = NULL by
other callers unless knowledge of the hack is propagated to them.
This patch removes the hack by making decompress (called only from the
pre-boot environment) a wrapper function that subtracts 4 from the input
length before calling the decompressor.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix and improve comments in decompress/generic.h that describe the
decompressor API. Also remove an unused definition, and rename INBUF_LEN
in lib/decompress_inflate.c to conform to bzip2/lzma naming.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While looking at Jens Rosenboom bug report
(http://lkml.org/lkml/2009/7/27/35) about strange sys_futex call done from
a dying "ps" program, we found following problem.
clone() syscall has special support for TID of created threads. This
support includes two features.
One (CLONE_CHILD_SETTID) is to set an integer into user memory with the
TID value.
One (CLONE_CHILD_CLEARTID) is to clear this same integer once the created
thread dies.
The integer location is a user provided pointer, provided at clone()
time.
kernel keeps this pointer value into current->clear_child_tid.
At execve() time, we should make sure kernel doesnt keep this user
provided pointer, as full user memory is replaced by a new one.
As glibc fork() actually uses clone() syscall with CLONE_CHILD_SETTID and
CLONE_CHILD_CLEARTID set, chances are high that we might corrupt user
memory in forked processes.
Following sequence could happen:
1) bash (or any program) starts a new process, by a fork() call that
glibc maps to a clone( ... CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID
...) syscall
2) When new process starts, its current->clear_child_tid is set to a
location that has a meaning only in bash (or initial program) context
(&THREAD_SELF->tid)
3) This new process does the execve() syscall to start a new program.
current->clear_child_tid is left unchanged (a non NULL value)
4) If this new program creates some threads, and initial thread exits,
kernel will attempt to clear the integer pointed by
current->clear_child_tid from mm_release() :
if (tsk->clear_child_tid
&& !(tsk->flags & PF_SIGNALED)
&& atomic_read(&mm->mm_users) > 1) {
u32 __user * tidptr = tsk->clear_child_tid;
tsk->clear_child_tid = NULL;
/*
* We don't check the error code - if userspace has
* not set up a proper pointer then tough luck.
*/
<< here >> put_user(0, tidptr);
sys_futex(tidptr, FUTEX_WAKE, 1, NULL, NULL, 0);
}
5) OR : if new program is not multi-threaded, but spied by /proc/pid
users (ps command for example), mm_users > 1, and the exiting program
could corrupt 4 bytes in a persistent memory area (shm or memory mapped
file)
If current->clear_child_tid points to a writeable portion of memory of the
new program, kernel happily and silently corrupts 4 bytes of memory, with
unexpected effects.
Fix is straightforward and should not break any sane program.
Reported-by: Jens Rosenboom <jens@mcbone.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@us.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sdhci_alloc_host returns an ERR_PTR value in an error case instead of NULL.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@match exists@
expression x, E;
statement S1, S2;
@@
x = sdhci_alloc_host(...)
... when != x = E
(
* if (x == NULL || ...) S1 else S2
|
* if (x == NULL && ...) S1 else S2
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: Matt Fleming <matt@console-pimps.org>
Cc: Ian Molton <ian@mnementh.co.uk>
Cc: "Roberto A. Foglietta" <roberto.foglietta@gmail.com>
Cc: Philip Langdale <philipl@overt.org>
Cc: Pierre Ossman <pierre@ossman.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Recent framebuffer locking patches first made affected systems unbootable,
then the dead-lock has been fixed but as of 2.6.31-rc4 the framebuffer on
mx3 machines doesn't work. Fix this.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I suspect that mnt_want_write_file() may have wrong assumption. I think
mnt_want_write_file() is assuming it increments ->mnt_writers if
(file->f_mode & FMODE_WRITE). But, if it's special_file(), it is false?
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The FIEMAP_IOC_FIEMAP mapping ioctl was missing a 32-bit compat handler,
which means that 32-bit suerspace on 64-bit kernels cannot use this ioctl
command.
The structure is nicely aligned, padded, and sized, so it is just this
simple.
Tested w/ 32-bit ioctl tester (from Josef) on a 64-bit kernel on ext4.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: Mark Lord <lkml@rtr.ca>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Josef Bacik <josef@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Catalin and kmemleak spotted a leak of a VC screen buffer in
vc_allocate() due to the following chain of events:
vc_allocate()
visual_init(init=1)
vc->vc_sw->con_init(init=1)
fbcon_init()
vc_resize()
vc->screen_buf = kmalloc()
vc->screen_buf = kmalloc()
The common way for the VC drivers is to set the screen dimension
parameters manually in the init case and only call vc_resize() for
!init - which allocates a screen buffer according to the new
dimensions.
fbcon instead would do vc_resize() unconditionally and afterwards set
the dimensions manually (again) for !init - i.e. completely upside
down. The vc_resize() allocated buffer would then get lost by
vc_allocate() allocating a fresh one.
Use vc_resize() only for actual resizing to close the leak.
Set the dimensions manually only in initialization mode to remove the
redundant setting in resize mode.
The kmemleak trace from Catalin:
unreferenced object 0xde158000 (size 12288):
comm "Xorg", pid 1439, jiffies 4294961016
hex dump (first 32 bytes):
20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 . . . . . . . .
20 00 20 00 20 00 20 00 20 00 20 00 20 00 20 00 . . . . . . . .
backtrace:
[<c006f74b>] __save_stack_trace+0x17/0x1c
[<c006f81d>] create_object+0xcd/0x188
[<c01f5457>] kmemleak_alloc+0x1b/0x3c
[<c006e303>] __kmalloc+0xdb/0xe8
[<c012cc4b>] vc_do_resize+0x73/0x1e0
[<c012cdf1>] vc_resize+0x15/0x18
[<c011afc1>] fbcon_init+0x1f9/0x2b8
[<c0129e87>] visual_init+0x9f/0xdc
[<c012aff3>] vc_allocate+0x7f/0xfc
[<c012b087>] con_open+0x17/0x80
[<c0120e43>] tty_open+0x1f7/0x2e4
[<c0072fa1>] chrdev_open+0x101/0x118
[<c006ffad>] __dentry_open+0x105/0x1cc
[<c00700fd>] nameidata_to_filp+0x2d/0x38
[<c00788cd>] do_filp_open+0x2c1/0x54c
[<c006fdff>] do_sys_open+0x3b/0xb4
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Tested-by: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes a bug caused by changing pointers (viafb_mode, viafb_mode1)
assigned by module_param. It reduces driver complexity by not needlessly
changing these vars as they are only read once and removing now
superfluous code.
On unpatched kernels loading viafb with viafb_mode or viafb_mode1 option
used and afterwards unloading it results in:
kernel BUG at mm/slub.c:2926!
invalid opcode: 0000 [#1] PREEMPT
last sysfs file: /sys/devices/virtual/block/loop0/removable
Modules linked in: snd_hda_codec_realtek snd_hda_intel snd_hda_codec
snd_hwdep snd_pcm rtl8187 snd_timer eeprom_93cx6 mmc_block snd soundcore
via_sdmmc fb snd_page_alloc i2c_algo_bit i2c_viapro ehci_hcd uhci_hcd
cfbcopyarea mmc_core cfbimgblt cfbfillrect video output [last unloaded:
viafb]
Pid: 3355, comm: rmmod Not tainted (2.6.31-rc1 #0)
EIP: 0060:[<c106a759>] EFLAGS: 00010246 CPU: 0
EIP is at kfree+0x80/0xda
EAX: c17c2da0 EBX: dc7edbdc ECX: 0000010f EDX: 00000000
ESI: c102c700 EDI: dc7ed8fa EBP: d703ff2c ESP: d703ff20
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process rmmod (pid: 3355, ti=d703e000 task=db1412c0 task.ti=d703e000)
Stack:
dc7edbdc 00000014 00000016 d703ff40 c102c700 dc7f45d4 dc7f45d4 00000880
d703ff4c c103e571 00000000 d703ffac c103e751 66616976 da140062 db89ba80
00000328 d702edf8 db89ba80 d703ff9c c105d0f0 00000200 da14f898 00000014
Call Trace:
[<c102c700>] ? destroy_params+0x1e/0x2b
[<c103e571>] ? free_module+0xa2/0xd7
[<c103e751>] ? sys_delete_module+0x1ab/0x1da
[<c105d0f0>] ? do_munmap+0x20a/0x225
[<c10029b4>] ? sysenter_do_call+0x12/0x26
Code: 10 76 7a 8d 87 00 00 00 40 c1 e8 0c c1 e0 05 03 05 1c 87 41 c1 66 83 38 00 79 03 8b 40 0c 8b 10 84 d2 78 12 66 f7 c2 00 c0 75 04 <0f> 0b eb fe e8 6f 5a fe ff eb 47 8b 55 04 8b 58 0c 9c 5e fa 3b
EIP: [<c106a759>] kfree+0x80/0xda SS:ESP 0068:d703ff20
This is caused by the current code changing the pointers assigned by
module_param. During unload it tries to free the memory the pointers
point at which is now part of an internal structure.
The patch simply avoids changing the pointers. This is okay as they are
read only once during the initialization process.
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Scott Fang <ScottFang@viatech.com.cn>
Cc: Joseph Chan <JosephChan@via.com.tw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
At first, init_task's mems_allowed is initialized as this.
init_task->mems_allowed == node_state[N_POSSIBLE]
And cpuset's top_cpuset mask is initialized as this
top_cpuset->mems_allowed = node_state[N_HIGH_MEMORY]
Before 2.6.29:
policy's mems_allowed is initialized as this.
1. update tasks->mems_allowed by its cpuset->mems_allowed.
2. policy->mems_allowed = nodes_and(tasks->mems_allowed, user's mask)
Updating task's mems_allowed in reference to top_cpuset's one.
cpuset's mems_allowed is aware of N_HIGH_MEMORY, always.
In 2.6.30: After commit 58568d2a82
("cpuset,mm: update tasks' mems_allowed in time"), policy's mems_allowed
is initialized as this.
1. policy->mems_allowd = nodes_and(task->mems_allowed, user's mask)
Here, if task is in top_cpuset, task->mems_allowed is not updated from
init's one. Assume user excutes command as #numactrl --interleave=all
,....
policy->mems_allowd = nodes_and(N_POSSIBLE, ALL_SET_MASK)
Then, policy's mems_allowd can includes a possible node, which has no pgdat.
MPOL's INTERLEAVE just scans nodemask of task->mems_allowd and access this
directly.
NODE_DATA(nid)->zonelist even if NODE_DATA(nid)==NULL
Then, what's we need is making policy->mems_allowed be aware of
N_HIGH_MEMORY. This patch does that. But to do so, extra nodemask will
be on statck. Because I know cpumask has a new interface of
CPUMASK_ALLOC(), I added it to node.
This patch stands on old behavior. But I feel this fix itself is just a
Band-Aid. But to do fundametal fix, we have to take care of memory
hotplug and it takes time. (task->mems_allowd should be N_HIGH_MEMORY, I
think.)
mpol_set_nodemask() should be aware of N_HIGH_MEMORY and policy's nodemask
should be includes only online nodes.
In old behavior, this is guaranteed by frequent reference to cpuset's
code. Now, most of them are removed and mempolicy has to check it by
itself.
To do check, a few nodemask_t will be used for calculating nodemask. But,
size of nodemask_t can be big and it's not good to allocate them on stack.
Now, cpumask_t has CPUMASK_ALLOC/FREE an easy code for get scratch area.
NODEMASK_ALLOC/FREE shoudl be there.
[akpm@linux-foundation.org: cleanups & tweaks]
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Paul Menage <menage@google.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: David Rientjes <rientjes@google.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the rotate_ud() function not to crash in case of a font which has not
a width of multiple by 8: The inner loop of the font pixel copy should not
access a bit outside the font memory area. Subtract the shift offset from
the font width will prevent this.
Signed-off-by: Stefani Seibold <stefani@seibold.net>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use CONFIG_HOTPLUG_CPU, not CONFIG_CPU_HOTPLUG
When hot-unpluging a cpu, it will leak memory allocated at cpu hotplug,
but only if CPUMASK_OFFSTACK=y, which is default to n.
The bug was introduced by 8969a5ede0
("generic-ipi: remove kmalloc()").
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This was found using a semantic patch, more info can be found at:
http://www.emn.fr/x-info/coccinelle/
Signed-off-by: Stoyan Gaydarov <sgayda2@uiuc.edu>
Acked-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
we should align the GTT after VRAM no matter what, as we can
come back from resume and put in a different place and bad things happen.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Brice Goglin reported this crash with per task precise stats:
> I finally managed to test the threaded perfcounter statistics (thanks a
> lot for implementing it). I am running 2.6.31-rc5 (with the AMD
> magny-cours patches but I don't think they matter here). I am trying to
> measure local/remote memory accesses per thread during the well-known
> stream benchmark. It's compiled with OpenMP using 16 threads on a
> quad-socket quad-core barcelona machine.
>
> Command line is:
> /mnt/scratch/bgoglin/cpunode/linux-2.6.31/tools/perf/perf record -f -s
> -e r1000001e0 -e r1000002e0 -e r1000004e0 -e r1000008e0 ./stream
>
> It seems to work fine with a single -e <counter> on the command line
> while it crashes when there are at least 2 of them.
> It seems to work fine without -s as well.
A silly copy-paste resulted in a messed up iteration which would
cause the OOPS.
Reported-by: Brice Goglin <Brice.Goglin@inria.fr>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Brice Goglin <Brice.Goglin@inria.fr>
LKML-Reference: <1249574786.32113.550.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Adds autodetection for libelf as well, and simplifies the
libbfd code. Furthermore, fail make with an error when libelf
is not found and warn about the lack of libbfd.
Also provide an option to build a 32bit version even though you
might be running a 64bit kernel.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In some cases distros have binaries and debuginfo in weird places:
[root@doppio tuna]# ls -la /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
-rwxr-xr-x 1 root root 90024 2009-08-03 19:45 /usr/lib64/firefox-3.5.2/firefox
-rwxr-xr-x 1 root root 90024 2009-08-03 18:23 /usr/lib64/xulrunner-1.9.1/xulrunner-stub
[root@doppio tuna]# sha1sum /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
19a858077d263d5de22c9c5da250d3e4396ae739 /usr/lib64/xulrunner-1.9.1/xulrunner-stub
19a858077d263d5de22c9c5da250d3e4396ae739 /usr/lib64/firefox-3.5.2/firefox
[root@doppio tuna]# rpm -qf /usr/lib64/{xulrunner-1.9.1/xulrunner-stub,firefox-3.5.2/firefox}
xulrunner-1.9.1.2-1.fc11.x86_64
firefox-3.5.2-2.fc11.x86_64
[root@doppio tuna]# ls -la /usr/lib/debug/{usr/lib64/xulrunner-1.9.1/xulrunner-stub,usr/lib64/firefox-3.5.2/firefox}.debug
ls: cannot access /usr/lib/debug/usr/lib64/firefox-3.5.2/firefox.debug: No such file or directory
-rwxr-xr-x 1 root root 403608 2009-08-03 18:22 /usr/lib/debug/usr/lib64/xulrunner-1.9.1/xulrunner-stub.debug
Seemingly we don't have a .symtab when we actually can find it
if we use the .note.gnu.build-id ELF section put in place by
some distros. Use it and find the symbols we need.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If the current CPU doesn't support performance counters,
cur_cpu_spec->oprofile_cpu_type can be NULL. The current
perf_counter modules don't test for that case and would thus
crash at boot time.
Bug reported by David Woodhouse.
Reported-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Paul Mackerras <paulus@samba.org>
LKML-Reference: <19066.48028.446975.501454@cargo.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Two defects work together result in KVM device passthrough randomly can't
work:
1. iommu_snooping is not initialized to zero when vm_iommu_init() called.
So it is possible to get a random value.
2. One line added by commit 2c2e2c38("IOMMU Identity Mapping Support")
change the code path, let it bypass domain_update_iommu_cap(), as well as
missing the increment of domain iommu reference count.
The latter is also likely to cause a leak of domains on repeated VMM
assignment and deassignment.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Remove assumption on the shift and size of rows/columns form
matrix_keypad driver.
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
The Prestigio 157, an old no-name clone laptop uses input keys very
similar to the Wistron 1557/MS2141 with the addition of BIOS-controlled
wireless radio frequency kill switch.
This patch adds support for the RF kill switch control and adds manual
identification of the model.
The Prestigio does not expose any recognisable identity via dmidecode
and so requires manual selection at module init using
force=1 keymap=prestigio
Signed-off-by: TJ <ubuntu@tjworld.net>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
GTT object can either be cached,uncached or wc just let core ttm
pick the best mode according to how the bo driver and GTT memory
type was initialized.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Not all tracepoints are created equal, in specific the ftrace
tracepoints are created with TRACE_EVENT_FORMAT() which does
not generate the needed bits to tie them into perf counters.
For those events, don't create the 'id' file and fail
->profile_enable when their ID is specified through other
means.
Reported-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <1249497664.5890.4.camel@laptop>
[ v2: fix build error in the !CONFIG_EVENT_PROFILE case ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since the C++ demangling isn't needed for everybody and
bfd/iberty aren't widely/easily available on all machines, make
it optional.
It also allows you to forcefully disable demangling by using
NO_DEMANGLE=1 and otherwise tries to detect libbfd/libiberty
combinations that result in a compiling demangler.
Reported-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
LKML-Reference: <20090801082048.GX12579@kernel.dk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If you're doing performance testing, you're interested in the
symbols anyway so lets make "--sort comm,dso,symbol" the
default sort option.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: acme@redhat.com
LKML-Reference: <1249467921-10450-1-git-send-email-penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The physical address passed to domain_pfn_mapping() should be rounded
down to the start of the MM page, not the VT-d page.
This issue causes kernel panic on PAGE_SIZE>VTD_PAGE_SIZE platforms e.g. ia64
platforms.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
In domain_sg_mapping(), use aligned_nrpages() instead of hand-coded
rounding code for calculating the size of each sg elem. This means that
on IA64 we correctly round up to the MM page size, not just to the VT-d
page size.
Also remove the incorrect mm_to_dma_pfn() when intel_map_sg() calls
domain_sg_mapping() -- the 'size' variable is in VT-d pages already.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>