* git://oss.sgi.com:8090/xfs/linux-2.6: (45 commits)
[XFS] Fix use after free in xfs_log_done().
[XFS] Make xfs_bmap_*_count_leaves void.
[XFS] Use KM_NOFS for debug trace buffers
[XFS] use KM_MAYFAIL in xfs_mountfs
[XFS] refactor xfs_mount_free
[XFS] don't call xfs_freesb from xfs_unmountfs
[XFS] xfs_unmountfs should return void
[XFS] cleanup xfs_mountfs
[XFS] move root inode IRELE into xfs_unmountfs
[XFS] stop using file_update_time
[XFS] optimize xfs_ichgtime
[XFS] update timestamp in xfs_ialloc manually
[XFS] remove the sema_t from XFS.
[XFS] replace dquot flush semaphore with a completion
[XFS] replace inode flush semaphore with a completion
[XFS] extend completions to provide XFS object flush requirements
[XFS] replace the XFS buf iodone semaphore with a completion
[XFS] clean up stale references to semaphores
[XFS] use get_unaligned_* helpers
[XFS] Fix compile failure in xfs_buf_trace()
...
Done as a script (well, a single "git mv" actually) on request from
Yoshinori Sato as a way to avoid a huge diff.
Requested-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wolfgang Walter reported this oops on his via C3 using padlock for
AES-encryption:
##################################################################
BUG: unable to handle kernel NULL pointer dereference at 000001f0
IP: [<c01028c5>] __switch_to+0x30/0x117
*pde = 00000000
Oops: 0002 [#1] PREEMPT
Modules linked in:
Pid: 2071, comm: sleep Not tainted (2.6.26 #11)
EIP: 0060:[<c01028c5>] EFLAGS: 00010002 CPU: 0
EIP is at __switch_to+0x30/0x117
EAX: 00000000 EBX: c0493300 ECX: dc48dd00 EDX: c0493300
ESI: dc48dd00 EDI: c0493530 EBP: c04cff8c ESP: c04cff7c
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process sleep (pid: 2071, ti=c04ce000 task=dc48dd00 task.ti=d2fe6000)
Stack: dc48df30 c0493300 00000000 00000000 d2fe7f44 c03b5b43 c04cffc8 00000046
c0131856 0000005a dc472d3c c0493300 c0493470 d983ae00 00002696 00000000
c0239f54 00000000 c04c4000 c04cffd8 c01025fe c04f3740 00049800 c04cffe0
Call Trace:
[<c03b5b43>] ? schedule+0x285/0x2ff
[<c0131856>] ? pm_qos_requirement+0x3c/0x53
[<c0239f54>] ? acpi_processor_idle+0x0/0x434
[<c01025fe>] ? cpu_idle+0x73/0x7f
[<c03a4dcd>] ? rest_init+0x61/0x63
=======================
Wolfgang also found out that adding kernel_fpu_begin() and kernel_fpu_end()
around the padlock instructions fix the oops.
Suresh wrote:
These padlock instructions though don't use/touch SSE registers, but it behaves
similar to other SSE instructions. For example, it might cause DNA faults
when cr0.ts is set. While this is a spurious DNA trap, it might cause
oops with the recent fpu code changes.
This is the code sequence that is probably causing this problem:
a) new app is getting exec'd and it is somewhere in between
start_thread() and flush_old_exec() in the load_xyz_binary()
b) At pont "a", task's fpu state (like TS_USEDFPU, used_math() etc) is
cleared.
c) Now we get an interrupt/softirq which starts using these encrypt/decrypt
routines in the network stack. This generates a math fault (as
cr0.ts is '1') which sets TS_USEDFPU and restores the math that is
in the task's xstate.
d) Return to exec code path, which does start_thread() which does
free_thread_xstate() and sets xstate pointer to NULL while
the TS_USEDFPU is still set.
e) At the next context switch from the new exec'd task to another task,
we have a scenarios where TS_USEDFPU is set but xstate pointer is null.
This can cause an oops during unlazy_fpu() in __switch_to()
Now:
1) This should happen with or with out pre-emption. Viro also encountered
similar problem with out CONFIG_PREEMPT.
2) kernel_fpu_begin() and kernel_fpu_end() will fix this problem, because
kernel_fpu_begin() will manually do a clts() and won't run in to the
situation of setting TS_USEDFPU in step "c" above.
3) This was working before the fpu changes, because its a spurious
math fault which doesn't corrupt any fpu/sse registers and the task's
math state was always in an allocated state.
With out the recent lazy fpu allocation changes, while we don't see oops,
there is a possible race still present in older kernels(for example,
while kernel is using kernel_fpu_begin() in some optimized clear/copy
page and an interrupt/softirq happens which uses these padlock
instructions generating DNA fault).
This is the failing scenario that existed even before the lazy fpu allocation
changes:
0. CPU's TS flag is set
1. kernel using FPU in some optimized copy routine and while doing
kernel_fpu_begin() takes an interrupt just before doing clts()
2. Takes an interrupt and ipsec uses padlock instruction. And we
take a DNA fault as TS flag is still set.
3. We handle the DNA fault and set TS_USEDFPU and clear cr0.ts
4. We complete the padlock routine
5. Go back to step-1, which resumes clts() in kernel_fpu_begin(), finishes
the optimized copy routine and does kernel_fpu_end(). At this point,
we have cr0.ts again set to '1' but the task's TS_USEFPU is stilll
set and not cleared.
6. Now kernel resumes its user operation. And at the next context
switch, kernel sees it has do a FP save as TS_USEDFPU is still set
and then will do a unlazy_fpu() in __switch_to(). unlazy_fpu()
will take a DNA fault, as cr0.ts is '1' and now, because we are
in __switch_to(), math_state_restore() will get confused and will
restore the next task's FP state and will save it in prev tasks's FP state.
Remember, in __switch_to() we are already on the stack of the next task
but take a DNA fault for the prev task.
This causes the fpu leakage.
Fix the padlock instruction usage by calling them inside the
context of new routines irq_ts_save/restore(), which clear/restore cr0.ts
manually in the interrupt context. This will not generate spurious DNA
in the context of the interrupt which will fix the oops encountered and
the possible FPU leakage issue.
Reported-and-bisected-by: Wolfgang Walter <wolfgang.walter@stwm.de>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Commit 74768ed833 "page allocator: use no-panic variant of
alloc_bootmem() in alloc_large_system_hash()" introduced two new
_nopanic macros which are undefined for CONFIG_HAVE_ARCH_BOOTMEM_NODE.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: "Jan Beulich" <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch removes ip6_prohibit_entry and ip6_blk_hole_entry
declarations from include/net/ip6_route.h as they are unused.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes rt6_lock declaration from include/net/ip6_route.h
as it is unused.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based upon a bug report by Andrew Gallatin on netdev
with subject "CPU utilization increased in 2.6.27rc"
In commit 37437bb2e1
("pkt_sched: Schedule qdiscs instead of netdev_queue.")
the test of the queue being stopped was erroneously
removed from qdisc_run().
When the TX queue of the device fills up, this omission
causes lots of extraneous useless work to be queued up
to softirq context, where we'll just return immediately
because the device is still stuffed up.
Signed-off-by: David S. Miller <davem@davemloft.net>
XFS object flushing doesn't quite match existing completion semantics. It
mixed exclusive access with completion. That is, we need to mark an object as
being flushed before flushing it to disk, and then block any other attempt to
flush it until the completion occurs. We do this but adding an extra count to
the completion before we start using them. However, we still need to
determine if there is a completion in progress, and allow no-blocking attempts
fo completions to decrement the count.
To do this we introduce:
int try_wait_for_completion(struct completion *x)
returns a failure status if done == 0, otherwise decrements done
to zero and returns a "started" status. This is provided
to allow counted completions to begin safely while holding
object locks in inverted order.
int completion_done(struct completion *x)
returns 1 if there is no waiter, 0 if there is a waiter
(i.e. a completion in progress).
This replaces the use of semaphores for providing this exclusion
and completion mechanism.
SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31816a
Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Various cleanup the drivers/firmware/memmap (after review by AKPM):
- fix kdoc to conform to the standard
- move kdoc from header to implementation files
- remove superfluous WARN_ON() after kmalloc()
- WARN_ON(x); if (!x) -> if(!WARN_ON(x))
- improve some comments
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The attached patch seems to already exist in a number of branches -- it
keeps popping up on Google for me, and is certainly already in Debian --
but is strangely absent from mainstream.
The problem appears to be that the patched file ends up as part of the
target toolchain, but unfortunately the gcc constant folding doesn't
appear to eliminate the __invalid_size_argument_for_IOC value early
enough. Certainly compiling C++ programs which use _IO... macros as
constants fails without this patch.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Collect the implementations from include/linux/byteorder/swab.h, swabb.h
in swab.h
The functionality provided covers:
u16 swab16(u16 val) - return a byteswapped 16 bit value
u32 swab32(u32 val) - return a byteswapped 32 bit value
u64 swab64(u64 val) - return a byteswapped 64 bit value
u32 swahw32(u32 val) - return a wordswapped 32 bit value
u32 swahb32(u32 val) - return a high/low byteswapped 32 bit value
Similar to above, but return swapped value from a naturally-aligned pointer
u16 swab16p(u16 *p)
u32 swab32p(u32 *p)
u64 swab64p(u64 *p)
u32 swahw32p(u32 *p)
u32 swahb32p(u32 *p)
Similar to above, but swap the value in-place (in-situ)
void swab16s(u16 *p)
void swab32s(u32 *p)
void swab64s(u64 *p)
void swahw32s(u32 *p)
void swahb32s(u32 *p)
Arches can override any of these with an optimized version by defining an
inline in their asm/byteorder.h (example given for swab16()):
u16 __arch_swab16() {}
#define __arch_swab16 __arch_swab16
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Short enough reads from /proc/irq/*/smp_affinity return -EINVAL for no
good reason.
This became noticed with NR_CPUS=4096 patches, when length of printed
representation of cpumask becase 1152, but cat(1) continued to read with
1024-byte chunks. bitmap_scnprintf() in good faith fills buffer, returns
1023, check returns -EINVAL.
Fix it by switching to seq_file, so handler will just fill buffer and
doesn't care about offsets, length, filling EOF and all this crap.
For that add seq_bitmap(), and wrappers around it -- seq_cpumask() and
seq_nodemask().
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Paul Jackson <pj@sgi.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Specify how much physically continuous, DMA capable memory will be
allocated at driver initialization time. This allow to create framebuffer
device with larger virtual resolution. Combine with y-panning this can be
used to implement double buffering acceleration method.
Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl>
Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The legacy i2c model is going away soon, so switch to the new model.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: Petr Vandrovec <VANDROVE@vc.cvut.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some chips appear to have the 2D engine hang during screen redraw,
typically in a sequence of copyarea operations. This appear to be
solved by adding a flush of the engine destination pixel cache
and waiting for the engine to be idle before issuing the accel
operation. The performance impact seems to be fairly small.
Here is a trace on an RV370 (PCI device ID 0x5b64), it records the
RBBM_STATUS register, then the source x/y, destination x/y, and
width/height used for the copy:
----------------------------------------
radeonfb_prim_copyarea: STATUS[00000140] src[210:70] dst[210:60] wh[a0:10]
radeonfb_prim_copyarea: STATUS[00000140] src[2b8:70] dst[2b8:60] wh[88:10]
radeonfb_prim_copyarea: STATUS[00000140] src[348:70] dst[348:60] wh[40:10]
radeonfb_prim_copyarea: STATUS[80020140] src[390:70] dst[390:60] wh[88:10]
radeonfb_prim_copyarea: STATUS[8002613f] src[40:80] dst[40:70] wh[28:10]
radeonfb_prim_copyarea: STATUS[80026139] src[a8:80] dst[a8:70] wh[38:10]
radeonfb_prim_copyarea: STATUS[80026133] src[e8:80] dst[e8:70] wh[80:10]
radeonfb_prim_copyarea: STATUS[8002612d] src[170:80] dst[170:70] wh[30:10]
radeonfb_prim_copyarea: STATUS[80026127] src[1a8:80] dst[1a8:70] wh[8:10]
radeonfb_prim_copyarea: STATUS[80026121] src[1b8:80] dst[1b8:70] wh[88:10]
radeonfb_prim_copyarea: STATUS[8002611b] src[248:80] dst[248:70] wh[68:10]
----------------------------------------
When things are going fine the copies complete before the next ROP is
even issued, but all of a sudden the 2D unit becomes active (bit 17 in
RBBM_STATUS) and the FIFO retry (bit 13) and FIFO pipeline busy (bit
14) are set as well. The FIFO begins to backup until it becomes full.
What happens next is the radeon_fifo_wait() times out, and we access
the chip illegally leading to a bus error which usually wedges the
box. None of this makes it to the console screen, of course :-)
radeon_fifo_wait() should be modified to reset the accelerator when
this timeout happens instead of programming the chip anyways.
----------------------------------------
radeonfb: FIFO Timeout !
ERROR(0): Cheetah error trap taken afsr[0010080005000000] afar[000007f900800e40] TL1(0)
ERROR(0): TPC[595114] TNPC[595118] O7[459788] TSTATE[11009601]
ERROR(0): TPC<radeonfb_copyarea+0xfc/0x248>
ERROR(0): M_SYND(0), E_SYND(0), Privileged
ERROR(0): Highest priority error (0000080000000000) "Bus error response from system bus"
ERROR(0): D-cache idx[0] tag[0000000000000000] utag[0000000000000000] stag[0000000000000000]
ERROR(0): D-cache data0[0000000000000000] data1[0000000000000000] data2[0000000000000000] data3[0000000000000000]
ERROR(0): I-cache idx[0] tag[0000000000000000] utag[0000000000000000] stag[0000000000000000] u[0000000000000000] l[00\
ERROR(0): I-cache INSN0[0000000000000000] INSN1[0000000000000000] INSN2[0000000000000000] INSN3[0000000000000000]
ERROR(0): I-cache INSN4[0000000000000000] INSN5[0000000000000000] INSN6[0000000000000000] INSN7[0000000000000000]
ERROR(0): E-cache idx[800e40] tag[000000000e049f4c]
ERROR(0): E-cache data0[fffff8127d300180] data1[00000000004b5384] data2[0000000000000000] data3[0000000000000000]
Ker:xnel panic - not syncing: Irrecoverable deferred error trap.
----------------------------------------
Another quirk is that these copyarea calls will not happen until the
first drivers/char/vt.c:redraw_screen() occurs. This will only happen
if you 1) VC switch or 2) run "consolechars" or 3) unblank the screen.
This seems to happen because until a redraw_screen() the screen scrolling
method used by fbcon is not finalized yet. I've seen this with other fb
drivers too.
So if all you do is boot straight into X you will never see this bug on
the relevant chips.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: <stable@kernel.org> [2.6.25.x, 2.6.26.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
.. since a failed allocation is being (initially) handled gracefully, and
panic()-ed upon failure explicitly in the function if retries with smaller
sizes failed.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
fix spinlock recursion in hvc_console
stop_machine: remove unused variable
modules: extend initcall_debug functionality to the module loader
export virtio_rng.h
lguest: use get_user_pages_fast() instead of get_user_pages()
mm: Make generic weak get_user_pages_fast and EXPORT_GPL it
lguest: don't set MAC address for guest unless specified
* 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
agp: fix SIS 5591/5592 wrong PCI id
intel/agp: rewrite GTT on resume
agp: use dev_printk when possible
amd64-agp: run fallback when no bridges found, not when driver registration fails
intel_agp: official name for GM45 chipset
The kernel has this really nice facility where if you put "initcall_debug"
on the kernel commandline, it'll print which function it's going to
execute just before calling an initcall, and then after the call completes
it will
1) print if it had an error code
2) checks for a few simple bugs (like leaving irqs off)
and
3) print how long the init call took in milliseconds.
While trying to optimize the boot speed of my laptop, I have been loving
number 3 to figure out what to optimize... ... and then I wished that
the same thing was done for module loading.
This patch makes the module loader use this exact same functionality; it's
a logical extension in my view (since modules are just sort of late
binding initcalls anyway) and so far I've found it quite useful in finding
where things are too slow in my boot.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Hello Rusty,
The entropy device was added after we exported all virtio headers. This
patch adds virtio_rng.h to the exportable userspace headers.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Out of line get_user_pages_fast fallback implementation, make it a weak
symbol, get rid of CONFIG_HAVE_GET_USER_PAGES_FAST.
Export the symbol to modules so lguest can use it.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Inserting a space between the `-' improved the C readability (some languages
allow hyphens within functions and variable names, which is confusing).
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
On my Intel chipset (965GM), the GTT is entirely erased across
suspend/resume. This patch simply re-plays the current mapping at resume
time to restore the table.=20
I noticed this once I started relying on persistent GTT mappings across VT
switch in our GEM work -- the old X server and DRM code carefully unbind
all memory from the GTT on VT switch, but GEM does not bother.
I placed the list management and rewrite code in the generic layer on the
assumption that it will be needed on other hardware, but I did not add the
rewrite call to anything other than the Intel resume function.
Keep a list of current GATT mappings. At resume time, rewrite them into
the GATT. This is needed on Intel (at least) as the entire GATT is
cleared across suspend/resume.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Keith Packard <keithp@keithp.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched, cpu hotplug: fix set_cpus_allowed() use in hotplug callbacks
sched: fix mysql+oltp regression
sched_clock: delay using sched_clock()
sched clock: couple local and remote clocks
sched clock: simplify __update_sched_clock()
sched: eliminate scd->prev_raw
sched clock: clean up sched_clock_cpu()
sched clock: revert various sched_clock() changes
sched: move sched_clock before first use
sched: test runtime rather than period in global_rt_runtime()
sched: fix SCHED_HRTICK dependency
sched: fix warning in hrtick_start_fair()
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix 2.6.27rc1 cannot boot more than 8CPUs
x86: make "apic" an early_param() on 32-bit, NULL check
EFI, x86: fix function prototype
x86, pci-calgary: fix function declaration
x86: work around gcc 3.4.x bug
x86: make "apic" an early_param() on 32-bit
x86, debug: tone down arch/x86/kernel/mpparse.c debugging printk
x86_64: restore the proper NR_IRQS define so larger systems work.
x86: Restore proper vector locking during cpu hotplug
x86: Fix broken VMI in 2.6.27-rc..
x86: fdiv bug detection fix
* 'for-linus' of git://git.o-hand.com/linux-mfd:
mfd: tc6393 cleanup and update
mfd: have TMIO drivers and subdevices depend on ARM
mfd: TMIO MMC driver
mfd: driver for the TMIO NAND controller
mfd: t7l66 MMC platform data
mfd: tc6387 MMC platform data
mfd: Fix 7l66 and 6387 according to the new mfd-core API
mfd: Fix tc6393 according to the new tmio.h
mfd: driver for the TC6387XB TMIO controller.
mfd: driver for the T7L66XB TMIO SoC
mfd: TMIO MMC structures and accessors.
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc: Remove include/linux/harrier_defs.h
powerpc: Do not ignore arch/powerpc/include
powerpc: Delete completed "ppc removal" task from feature removal file
powerpc/mm: Fix attribute confusion with htab_bolt_mapping()
powerpc/pci: Don't keep ISA memory hole resources in the tree
powerpc: Zero fill the return values of rtas argument buffer
powerpc/4xx: Update defconfig files for 2.6.27-rc1
powerpc/44x: Incorrect NOR offset in Warp DTS
powerpc/44x: Warp DTS changes for board updates
powerpc/4xx: Cleanup Warp for i2c driver changes.
powerpc/44x: Adjust warp-nand resource end address
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: Limit VPD length for Broadcom 5708S
PCI PM: Export pci_pme_active to drivers
PCI: remove duplicate symbol from pci_ids.h
PCI: check the return value of device_create_bin_file() in pci_create_bus()
PCI: fully restore MSI state at resume time
DMA: make dma-coherent.c documentation kdoc-friendly
PCI: make pci_register_driver() a macro
PCI: add Broadcom 5708S to VPD length quirk
Fix function prototype in header file to match source code:
linux-next-20080807/arch/x86/kernel/efi_64.c💯14: error: symbol 'efi_ioremap' redeclared with different type (originally declared at include2/asm/efi.h:89) - different address spaces
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
certain configs produce:
[ 70.076229] BUG: MAX_LOCKDEP_KEYS too low!
[ 70.080230] turning off the locking correctness validator.
tune them up.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There's no reason for dynamically allocating an estimator object for every
stats object. Directly embed an estimator object into every stats object and
switch to using the kernel-provided list implementation. This makes the code
much simpler and faster, as we do not need to traverse the list of all
estimators to find the one belonging to a stats object. There's no need to use
an rwlock, as we only have one reader. Also reorder the members of the
estimator structure slightly to avoid padding overhead. This can't be done
with the stats object as the members are currently copied to our user space
object via memcpy() and changing it would break ABI.
Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Acked-by: Simon Horman <horms@verge.net.au>
It was only used by code in arch/ppc, and arch/ppc is gone, so remove
the unused harrier_defs.h as well.
Signed-off-by: Paul Mackerras <paulus@samba.org>
There is a overflow by 1 case in the new shrunken hlock code.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
As pointed out and tracked by Yinghai Lu <yhlu.kernel@gmail.com>:
Dhaval Giani got:
kernel BUG at arch/x86/kernel/io_apic_64.c:357!
invalid opcode: 0000 [1] SMP
CPU 24
...
his system (x3950) has 8 ioapic, irq > 256
This was caused by:
commit 9b7dc567d0
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Fri May 2 20:10:09 2008 +0200
x86: unify interrupt vector defines
The interrupt vector defines are copied 4 times around with minimal
differences. Move them all into asm-x86/irq_vectors.h
It appears that Thomas did not notice that x86_64 does something
completely different when he merge irq_vectors.h
We can solve this for 2.6.27 by simply reintroducing the old heuristic
for setting NR_IRQS on x86_64 to a usable value, which trivially removes
the regression.
Long term it would be nice to harmonize the handling of ioapic interrupts
of x86_32 and x86_64 so we don't have this kind of confusion.
Dhaval Giani <dhaval@linux.vnet.ibm.com> tested an earlier version of
this patch by YH which confirms simply increasing NR_IRQS fixes the
problem.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Having cpu_online_map change during assign_irq_vector can result
in some really nasty and weird things happening. The one that
bit me last time was accessing non existent per cpu memory for non
existent cpus.
This locking was removed in a sloppy x86_64 and x86_32 merge patch.
Guys can we please try and avoid subtly breaking x86 when we are
merging files together?
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
the names were too generic:
drivers/uio/uio.c:87: error: expected identifier or '(' before 'do'
drivers/uio/uio.c:87: error: expected identifier or '(' before 'while'
drivers/uio/uio.c:113: error: 'map_release' undeclared here (not in a function)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Expose the new lock protection lock.
This can be used to annotate places where we take multiple locks of the
same class and avoid deadlocks by always taking another (top-level) lock
first.
NOTE: we're still bound to the MAX_LOCK_DEPTH (48) limit.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On Fri, 2008-08-01 at 16:26 -0700, Linus Torvalds wrote:
> On Fri, 1 Aug 2008, David Miller wrote:
> >
> > Taking more than a few locks of the same class at once is bad
> > news and it's better to find an alternative method.
>
> It's not always wrong.
>
> If you can guarantee that anybody that takes more than one lock of a
> particular class will always take a single top-level lock _first_, then
> that's all good. You can obviously screw up and take the same lock _twice_
> (which will deadlock), but at least you cannot get into ABBA situations.
>
> So maybe the right thing to do is to just teach lockdep about "lock
> protection locks". That would have solved the multi-queue issues for
> networking too - all the actual network drivers would still have taken
> just their single queue lock, but the one case that needs to take all of
> them would have taken a separate top-level lock first.
>
> Never mind that the multi-queue locks were always taken in the same order:
> it's never wrong to just have some top-level serialization, and anybody
> who needs to take <n> locks might as well do <n+1>, because they sure as
> hell aren't going to be on _any_ fastpaths.
>
> So the simplest solution really sounds like just teaching lockdep about
> that one special case. It's not "nesting" exactly, although it's obviously
> related to it.
Do as Linus suggested. The lock protection lock is called nest_lock.
Note that we still have the MAX_LOCK_DEPTH (48) limit to consider, so anything
that spills that it still up shit creek.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Most the free-standing lock_acquire() usages look remarkably similar, sweep
them into a new helper.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
this can be used to reset a held lock's subclass, for arbitrary-depth
iterated data structures such as trees or lists which have per-node
locks.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Some arch's can't handle sched_clock() being called too early - delay
this until sched_clock_init() has been called.
Reported-by: Bill Gatliff <bgat@billgatliff.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Nishanth Aravamudan <nacc@us.ibm.com>
CC: Russell King - ARM Linux <linux@arm.linux.org.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patchset cleans up the TC6393XB support.
* Add provision for the MMC subdevice
* Disable / enable clocks on suspend / resume
* Remove fragments of badly merged code (eg. linux/fb include etc.)
* Use a device specific clock name to break dependancy on ARM/PXA2XX
* Drop unnecessary resource names
* Switch to tmio_io* accessors
Signed-off-by: Ian Molton <spyro@f2s.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This patch adds support for the TC6387XB. Unlike other TMIO devices this one
has only one subdevice and no interrupt mux, however using the MFD framework
allows it to share the TMIO MMC driver.
Signed-off-by: Ian Molton <spyro@f2s.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This patchset provides support for the core functinality of the T7L66XB
SoC from Toshiba. Supported in this patchset is the IRQ MUX, MMC controller
and NAND flash controller.
Signed-off-by: Ian Molton <spyro@f2s.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
This patch fixes compilation error if no s3c2410 processor is selected
but the s3c244x is selected. The function s3c2410_baseclk_add() is now
available for all Samsung cpus.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
[ben-linux@fluff.org: Whitespace and description fixups]
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
This patch performs the equivalent include directory shuffle for
plat-orion, and fixes up all users.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Fix fatal multi-line kernel-doc error in list.h:
function short description must be on one line.
Error(linux-2.6.27-rc2-git3//include/linux/list.h:318): duplicate section name 'Description'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus-merged' of master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 5177/1: arm/mach-sa1100/Makefile: remove CONFIG_SA1100_USB
[ARM] 5166/1: magician: add MAINTAINERS entry
[ARM] fix pnx4008 build errors
[ARM] Fix SMP booting with non-zero PHYS_OFFSET
[ARM] 5185/1: Fix spi num_chipselect for lubbock
[ARM] Move include/asm-arm/arch-* to arch/arm/*/include/mach
[ARM] Add support for arch/arm/mach-*/include and arch/arm/plat-*/include
[ARM] Remove asm/hardware.h, use asm/arch/hardware.h instead
[ARM] Eliminate useless includes of asm/mach-types.h
[ARM] Fix circular include dependency with IRQ headers
avr32: Use <mach/foo.h> instead of <asm/arch/foo.h>
avr32: Introduce arch/avr32/mach-*/include/mach
avr32: Move include/asm-avr32 to arch/avr32/include/asm
[ARM] sa1100_wdt: use reset_status to remember watchdog reset status
[ARM] pxa: introduce reset_status and clear_reset_status for driver's usage
[ARM] pxa: introduce reset.h for reset specific header information
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (99 commits)
pkt_sched: Fix actions referencing
bnx2x: fix logical op
tcp: (whitespace only) fix confusing indentation
pkt_sched: Fix qdisc config when link is down.
[Bluetooth] Add full quirk implementation for btusb driver
[Bluetooth] Removal of unnecessary ignore module parameter
[Bluetooth] Add parameters to control BNEP header compression
ath9k: Revamp wireless mode usage
ath9k: More unused macros
ath9k: Remove a few unused macros and fix indentation
ath9k: Use mac80211's band macros and remove enum hal_freq_band
ath9k: Remove redundant data structure ath9k_txq_info
ath9k: Cleanup data structures related to HW capabilities
ath9k: work around gcc ICEs
ath9k: Add new Atheros IEEE 802.11n driver
ath5k: remove Atheros 11n devices from supported list
list.h: add list_cut_position()
list.h: Add list_splice_tail() and list_splice_tail_init()
p54: swap short slot time dcf values
rt2x00: Block all unsupported modes
...
include/linux/i2c-pnx.h was missed when moving the include files.
Fix it now; it doesn't really need to include mach/i2c.h at all.
Successfully build tested with pnx4008_defconfig, which had
failed in linux-next.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Since all users have been converted over to use <mach/foo.h>, there's no
need for the arch-at32ap directory and associated symlink anymore.
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mad: Test ib_create_send_mad() return with IS_ERR(), not == NULL
IB/mlx4: Allow 4K messages for UD QPs
mlx4_core: Add ethernet fields to CQE struct
IB/ipath: Fix printk format warnings
RDMA/cxgb3: Fix deadlock initializing iw_cxgb3 device
RDMA/cxgb3: Fix up MW access rights
RDMA/cxgb3: Fix QP capabilities
RDMA/cma: Remove padding arrays by using struct sockaddr_storage
IB/ipath: Use unsigned long for irq flags
IPoIB/cm: Set correct SG list in ipoib_cm_init_rx_wr()
In the change in commit 09a05394fe, I
overlooked two nits in the logic and this broke using CLONE_PTRACE
when PTRACE_O_TRACE* are not being used.
A parent that is itself traced at all but not using PTRACE_O_TRACE*,
using CLONE_PTRACE would have its new child fail to be traced.
A parent that is not itself traced at all that uses CLONE_PTRACE
(which should be a no-op in this case) would confuse the bookkeeping
and lead to a crash at exit time.
This restores the missing checks and fixes both failure modes.
Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
Export pci_pme_active() to drivers, so that they can clear the
PME_status bit and disable PME# for their devices without involving
ACPI.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
alpha:
CC [M] drivers/usb/gadget/u_ether.o
In file included from include/asm/dma-mapping.h:7,
from include/linux/dma-mapping.h:52,
from include/linux/dmaengine.h:29,
from include/linux/skbuff.h:29,
from include/linux/if_ether.h:114,
from include/linux/etherdevice.h:27,
from drivers/usb/gadget/u_ether.c:29:
include/linux/pci.h: In function 'pci_register_driver':
include/linux/pci.h:673: error: 'KBUILD_MODNAME' undeclared (first use in this function)
include/linux/pci.h:673: error: (Each undeclared identifier is reported only once
include/linux/pci.h:673: error: for each function it appears in.)
Sam says:
The problem is that u_ether.o is used by two modules so when we build it
KBUILD_MODNAME is not defined because kbuild does not know what value to
use.
And in pci.h we have the following inline:
static inline int __must_check pci_register_driver(struct pci_driver *driver)
{
return __pci_register_driver(driver, THIS_MODULE, KBUILD_MODNAME);
}
And alpha uses dma-mapping.h to nullify a number of functions that seem to
require something from pci.h.
Making it a macro fixes this particular problem. However, the underlying issue
of a file using KBUILD_MODNAME and being shared between multiple modules is
*not* addressed. I guess the answer there is "don't do that".
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This adds list_cut_position() which lets you cut a list into
two lists given a pivot in the list.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
If you are using linked lists for queues list_splice() will not do what
you would expect even if you use the elements passed reversed. We need
to handle these differently. We add list_splice_tail() and
list_splice_tail_init().
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Remove includes of asm/hardware.h in addition to asm/arch/hardware.h.
Then, since asm/hardware.h only exists to include asm/arch/hardware.h,
update everything to directly include asm/arch/hardware.h and remove
asm/hardware.h.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
There are 43 includes of asm/mach-types.h by files that don't
reference anything from that file. Remove these unnecessary
includes.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The dm9000 driver reads the chip's MAC address from the attached EEPROM. When
no EEPROM is present, or when the MAC address is invalid, it falls back to
reading the address from the chip.
This patch lets platform code set the desired MAC address through platform
data.
Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Introduce the speed_hi field to ethtool_cmd, using the reserved space,
to expand the speed field to 2^32 Megabits/second.
Making this field expansion now gives us plenty of time to fix up the
user-space pieces that use SIOCETHTOOL before hardware faster than 64
Gb/s is available.
Signed-off-by: Brandon Philips <bphilips@suse.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Add ethernet-related fields to struct mlx4_cqe so that the mlx4_en
ethernet NIC driver can share the same definition.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Some USB devices set the protect bit in the INQUIRY data which
currently causes the DIF code in sd to assume (incorrectly) that they
support READ_CAPACITY(16). Fix this (only for the time being) by
making sure we only believe the protect bit in the inquiry data if the
device claims conformance to SCSI-3 or above.
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Add suspend/resume hooks to call soc operation specific
suspend and resume functions. This ensures the camera
chip has been previously resumed, as well as the camera
bus.
These hooks in camera chip drivers should save/restore
chip context between suspend and resume time.
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
- Disable all bits in SIC_IWR unless we are going into a real (DPMC)
power saving mode. Any Interrupt can wake the core form it's idle state.
- Remove deep sleep mode as it is not going to be used anywhere:
We support sleep, sleep deeper and hibernate.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
This re-introduces commit 2b14290078,
which was reverted due to the regression it caused by commit
fca082c9f1.
That regression was not root-caused by the original commit, it was just
uncovered by it, and the real fix was done by Alan Stern in commit
580da34847 ("Fix USB storage hang on
command abort").
We can thus re-introduce the change that was confirmed by Alan Jenkins
to be still required by his odd card reader.
Cc: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (78 commits)
AX.25: Fix sysctl registration if !CONFIG_AX25_DAMA_SLAVE
pktgen: mac count
pktgen: random flow
bridge: Eliminate unnecessary forward delay
bridge: fix compile warning in net/bridge/br_netfilter.c
ipv4: remove unused field in struct flowi (include/net/flow.h).
tg3: Fix 'scheduling while atomic' errors
net: Kill plain NET_XMIT_BYPASS.
net_sched: Add qdisc __NET_XMIT_BYPASS flag
net_sched: Add qdisc __NET_XMIT_STOLEN flag
iwl3945: fix merge mistake for packet injection
iwlwifi: grap nic access before accessing periphery registers
iwlwifi: decrement rx skb counter in scan abort handler
iwlwifi: fix unhandled interrupt when HW rfkill is on
iwlwifi: implement iwl5000_calc_rssi
iwlwifi: memory allocation optimization
iwlwifi: HW bug fixes
p54: Fix potential concurrent access to private data
rt2x00: Disable link tuning in rt2500usb
iwlwifi: Don't use buffer allocated on the stack for led names
...
A documentation cleanup patch. With a minor tweak to clarify units for
kbs.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: mark gross <mgross@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I have a new PCI-E radeon RV380 series card (PCI device ID 5b64) that
hangs in my sparc64 boxes when the init scripts set the font. The problem
goes away if I disable acceleration.
I haven't figured out that bug yet, but along the way I found some
corrections to make based upon some auditing.
1) The RB2D_DC_FLUSH_ALL value used by the kernel fb driver
and the XORG video driver differ. I've made the kernel
match what XORG is using.
2) In radeonfb_engine_reset() we have top-level code structure
that roughly looks like:
if (family is 300, 350, or V350)
do this;
else
do that;
...
if (family is NOT 300, OR
family is NOT 350, OR
family is NOT V350)
do another thing;
this last conditional makes no sense, is always true,
and obviously was likely meant to be "family is NOT
300, 350, or V350". So I've made the code match the
intent.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These functions have been deprecated for some time now but remained until
all legacy callers could be removed. With a few commits in 2.6.26 this
has happened so now we can remove these deprecated functions.
Signed-off-by: Mark Asselstine <mark.asselstine@windriver.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds an SPI driver for the SPI controller found in various Marvell
Orion ARM SoCs. It currently supports only one slave, which must use SPI
mode 0.
[dbrownell@users.sourceforge.net: cleanups, meet specs, pass "sparse"]
Signed-off-by: Shadi Ammouri <shadi@marvell.com>
Signed-off-by: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The current implementation reports the structure name as
VMCOREINFO_OSRELEASE in VMCOREINFO, e.g.
VMCOREINFO_OSRELEASE=init_uts_ns.name.release
That doesn't make sense because it's always the same. Instead, use the
value, e.g.
VMCOREINFO_OSRELEASE=2.6.26-rc3
That's also what the 'makedumpfile -g' does.
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Cc: "Ken'ichi Ohmichi" <oomichi@mxs.nes.nec.co.jp>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The benefits of a user settable CONFIG_IDE_MAX_HWIFS have become pretty
tiny and are no longer considered worth the trouble of an own option.
Simply always #define MAX_HWIFS to 10.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
dont bother protecting the mtd defines as anything that incorrectly
uses it will get an error during link time anyways ... this prevents
large pointless rebuilds of most files whenever the uclinux mtd map changes state
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
This patch removes an unused field (flags) from struct flowi; it seems
that this "flags" field was used once in the past for multipath
routing with FLOWI_FLAG_MULTIPATHOLDROUTE flag (which does no longer
exist); however, the "flags" field of struct flowi is not used
anymore.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch changes the static MIN_PARTIAL to a dynamic per-cache ->min_partial
value that is calculated from object size. The bigger the object size, the more
pages we keep on the partial list.
I tested SLAB, SLUB, and SLUB with this patch on Jens Axboe's 'netio' example
script of the fio benchmarking tool. The script stresses the networking
subsystem which should also give a fairly good beating of kmalloc() et al.
To run the test yourself, first clone the fio repository:
git clone git://git.kernel.dk/fio.git
and then run the following command n times on your machine:
time ./fio examples/netio
The results on my 2-way 64-bit x86 machine are as follows:
[ the minimum, maximum, and average are captured from 50 individual runs ]
real time (seconds)
min max avg sd
SLAB 22.76 23.38 22.98 0.17
SLUB 22.80 25.78 23.46 0.72
SLUB (dynamic) 22.74 23.54 23.00 0.20
sys time (seconds)
min max avg sd
SLAB 6.90 8.28 7.70 0.28
SLUB 7.42 16.95 8.89 2.28
SLUB (dynamic) 7.17 8.64 7.73 0.29
user time (seconds)
min max avg sd
SLAB 36.89 38.11 37.50 0.29
SLUB 30.85 37.99 37.06 1.67
SLUB (dynamic) 36.75 38.07 37.59 0.32
As you can see from the above numbers, this patch brings SLUB to the same level
as SLAB for this particular workload fixing a ~2% regression. I'd expect this
change to help similar workloads that allocate a lot of objects that are close
to the size of a page.
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
dst_input() was doing something completely absurd, looping
on skb->dst->input() if NET_XMIT_BYPASS was seen, but these
functions never return such an error.
And as a result plain ole' NET_XMIT_BYPASS has no more
references and can be completely killed off.
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy <kaber@trash.net> noticed that it would be nice to
handle NET_XMIT_BYPASS by NET_XMIT_SUCCESS with an internal qdisc flag
__NET_XMIT_BYPASS and to remove the mapping from dev_queue_xmit().
David Miller <davem@davemloft.net> spotted a serious bug in the first
version of this patch.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy <kaber@trash.net> noticed:
"The other problem that affects all qdiscs supporting actions is
TC_ACT_QUEUED/TC_ACT_STOLEN getting mapped to NET_XMIT_SUCCESS
even though the packet is not queued, corrupting upper qdiscs'
qlen counters."
and later explained:
"The reason why it translates it at all seems to be to not increase
the drops counter. Within a single qdisc this could be avoided by
other means easily, upper qdiscs would still increase the counter
when we return anything besides NET_XMIT_SUCCESS though.
This means we need a new NET_XMIT return value to indicate this to
the upper qdiscs. So I'd suggest to introduce NET_XMIT_STOLEN,
return that to upper qdiscs and translate it to NET_XMIT_SUCCESS
in dev_queue_xmit, similar to NET_XMIT_BYPASS."
David Miller <davem@davemloft.net> noticed:
"Maybe these NET_XMIT_* values being passed around should be a set of
bits. They could be composed of base meanings, combined with specific
attributes.
So you could say "NET_XMIT_DROP | __NET_XMIT_NO_DROP_COUNT"
The attributes get masked out by the top-level ->enqueue() caller,
such that the base meanings are the only thing that make their
way up into the stack. If it's only about communication within the
qdisc tree, let's simply code it that way."
This patch is trying to realize these ideas.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Like the page lock change, this also requires name change, so convert the
raw test_and_set bitop to a trylock.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Converting page lock to new locking bitops requires a change of page flag
operation naming, so we might as well convert it to something nicer
(!TestSetPageLocked_Lock => trylock_page, SetPageLocked => set_page_locked).
This also facilitates lockdeping of page lock.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (29 commits)
sh: enable maple_keyb in dreamcast_defconfig.
SH2(A) cache update
nommu: Provide vmalloc_exec().
add addrespace definition for sh2a.
sh: Kill off ARCH_SUPPORTS_AOUT and remnants of a.out support.
sh: define GENERIC_HARDIRQS_NO__DO_IRQ.
sh: define GENERIC_LOCKBREAK.
sh: Save NUMA node data in vmcore for crash dumps.
sh: module_alloc() should be using vmalloc_exec().
sh: Fix up __bug_table handling in module loader.
sh: Add documentation and integrate into docbook build.
sh: Fix up broken kerneldoc comments.
maple: Kill useless private_data pointer.
maple: Clean up maple_driver_register/unregister routines.
input: Clean up maple keyboard driver
maple: allow removal and reinsertion of keyboard driver module
sh: /proc/asids depends on MMU.
arch/sh/boards/mach-se/7343/irq.c: removed duplicated #include
arch/sh/boards/board-ap325rxa.c: removed duplicated #include
sh/boards/Makefile typo fix
...
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
powerpc: Remove use of CONFIG_PPC_MERGE
powerpc: Force printing of 'total_memory' to unsigned long long
powerpc: Fix compiler warning in arch/powerpc/mm/mem.c
powerpc: Move include files to arch/powerpc/include/asm
My last change to tracehook.h made it confuse the kerneldoc parser.
Move the #define's before the comment so it's happy again.
Signed-off-by: Roland McGrath <roland@redhat.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
So copy their contents into the asm-m68k files.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6:
Add DIP switch readout for HFC-4S IOB4ST
Fix remaining big endian issue of hfcmulti
mISDN cleanup user interface
mISDN fix main ISDN Makefile
This reverts commit f9247273cb (and
fb2e405fc1 - "fix fs/nfs/nfsroot.c
compilation" - that fixed a missed conversion).
The changes cause problems for at least the sparc build. Let's re-do
them when the exact issues are resolved.
Requested-by: Andrew Morton <akpm@linux-foundation.org>
Requested-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit 2b14290078, since it
seems to break some other USB storage devices (at least a JMicron USB to
ATA bridge). As such, while it apparently fixes some cardreaders, it
would need to be made conditional on the exact reader it fixes in order
to avoid causing regressions.
Cc: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch makes possible for a driver to specify maximal listen interval
The possibility for user to configure listen interval is not implemented
yet, currently the maximum provided by the driver or 1 is used.
Mac80211 uses config handler to set listen interval for to the driver.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch adds the dtim_period in ieee80211_bss_conf, this allows the low
level driver to know the dtim_period, and to plan power save accordingly.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There are a few places where the RDMA CM code handles IPv6 by doing
struct sockaddr addr;
u8 pad[sizeof(struct sockaddr_in6) -
sizeof(struct sockaddr)];
This is fragile and ugly; handle this in a better way with just
struct sockaddr_storage addr;
[ Also roll in patch from Aleksey Senin <alekseys@voltaire.com> to
switch to struct sockaddr_storage and get rid of padding arrays in
struct rdma_addr. ]
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The ipfragok flag controls whether the packet may be fragmented
either on the local host on beyond. The latter is only valid on
IPv4.
In fact, we never want to do the latter even on IPv4 when PMTU is
enabled. This is because even though we can't fragment packets
within SCTP due to the prtocol's inherent faults, we can still
fragment it at IP layer. By setting the DF bit we will improve
the PMTU process.
RFC 2960 only says that we SHOULD clear the DF bit in this case,
so we're compliant even if we set the DF bit. In fact RFC 4960
no longer has this statement.
Once we make this change, we only need to control the local
fragmentation. There is already a bit in the skb which controls
that, local_df. So this patch sets that instead of using the
ipfragok argument.
The only complication is that there isn't a struct sock object
per transport, so for IPv4 we have to resort to changing the
pmtudisc field for every packet. This should be safe though
as the protocol is single-threaded.
Note that after this patch we can remove ipfragok from the rest
of the stack too.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
from include/asm-powerpc. This is the result of a
mkdir arch/powerpc/include/asm
git mv include/asm-powerpc/* arch/powerpc/include/asm
Followed by a few documentation/comment fixups and a couple of places
where <asm-powepc/...> was being used explicitly. Of the latter only
one was outside the arch code and it is a driver only built for powerpc.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
We can simply wrap in to the dev_set/get_drvdata(), there's no reason
to track an extra level of private data on top of the struct device.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
These were completely inconsistent. Clean these up to take a maple_driver
pointer directly for consistency.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
drivers/video/console/promcon.c:158: error: implicit declaration of
function 'con_protect_unimap'
Introduced by commit a29ccf6f82
("embedded: fix vc_translate operator precedence").
Signed-off-by: Alexander Beregalov <a.beregalov@gmail.com>
Cc: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
It is the only legal environment in which this can be
used.
Add some commentary explaining the situation.
Signed-off-by: David S. Miller <davem@davemloft.net>
No file should be explicitly referencing its own platform headers
by specifying an absolute include path. Fix these paths to use
standard <asm/arch/...> includes.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move platform independent header files to arch/arm/include/asm, leaving
those in asm/arch* and asm/plat* alone.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Fix both the IHEX firmware generation (len field always null, and EOF
marker a byte too short) and loading (struct ihex_binrec needs to be
packed to reflect the on-disk structure).
Signed-off-by: Marc Zyngier <maz@misterjones.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
The channelmap should have the same size on 32 and 64 bit systems
and should not depend on endianess.
Thanks to David Woodhouse for spotting this.
Signed-off-by: Karsten Keil <kkeil@suse.de>
This fixes a bug in operator precedence in the newly introduced vc_translate
macro. Without this fix, the translation of some characters on the
kernel console is garbled.
This patch was copied to the e-mail list previously for testing. Now,
all reports confirm that it works, so this is an official post for
application.
Signed-off-by: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Wire up for FRV the system calls that were added in the last merge window.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Wire up system calls added in the last merge window for the MN10300 arch.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
KVM: s390: Fix kvm on IBM System z10
KVM: Advertise synchronized mmu support to userspace
KVM: Synchronize guest physical memory map to host virtual memory map
KVM: Allow browsing memslots with mmu_lock
KVM: Allow reading aliases with mmu_lock
ARCH=h8300:
init/main.c:781: undefined reference to `___early_initcall_end'
Same problem have
__start___bug_table
__stop___bug_table
__tracedata_start
__tracedata_end
__per_cpu_start
__per_cpu_end
When defining a symbol in vmlinux.lds, use the VMLINUX_SYMBOL macro.
VMLINUX_SYMBOL adds a prefix charactor.
You can't just use straight symbol names in common header files as they
dont take into consideration weird arch-specific ABI conventions. in the
case of Blackfin/h8300, the ABI dictates that any C-visible symbols have
an underscore prefixed to them. Thus all symbols in vmlinux.lds.h need to
be wrapped in VMLINUX_SYMBOL() so that each arch can put hide this magic
in their own files.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: "Mike Frysinger" <vapier.adi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
pata_it821x: Driver updates and reworking
libata.h: replace __FUNCTION__ with __func__
ata_piix: subsys 106b:00a3 is apple ich8m too
libata-core: make sure that ata_force_tbl is freed in case of an error
libata: update atapi disable handling
pata_via: add VX800 flag; add function for fixing h/w bugs
pata_ali: misplaced pci_dev_put()
* 'for-linus' of git://oss.sgi.com:8090/xfs/xfs-pull: (64 commits)
[XFS] Remove vn_revalidate calls in xfs.
[XFS] Now that xfs_setattr is only used for attributes set from ->setattr
[XFS] xfs_setattr currently doesn't just handle the attributes set through
[XFS] fix use after free with external logs or real-time devices
[XFS] A bug was found in xfs_bmap_add_extent_unwritten_real(). In a
[XFS] fix compilation without CONFIG_PROC_FS
[XFS] s/XFS_PURGE_INODE/IRELE/g s/VN_HOLD(XFS_ITOV())/IHOLD()/
[XFS] fix mount option parsing in remount
[XFS] Disable queue flag test in barrier check.
[XFS] streamline init/exit path
[XFS] Fix up problem when CONFIG_XFS_POSIX_ACL is not set and yet we still
[XFS] Don't assert if trying to mount with blocksize > pagesize
[XFS] Don't update mtime on rename source
[XFS] Allow xfs_bmbt_split() to fallback to the lowspace allocator
[XFS] Restore the lowspace extent allocator algorithm
[XFS] use minleft when allocating in xfs_bmbt_split()
[XFS] attrmulti cleanup
[XFS] Check for invalid flags in xfs_attrlist_by_handle.
[XFS] Fix CI lookup in leaf-form directories
[XFS] Use the generic xattr methods.
...
My commit 2b2a1ff64a introduced a regression
(sorry about that) for the odd case of exit_signal=0 (e.g. clone_flags=0).
This is not a normal use, but it's used by a case in the glibc test suite.
Dying with exit_signal=0 sends no signal, but it's supposed to wake up a
parent's blocked wait*() calls (unlike the delayed_group_leader case).
This fixes tracehook_notify_death() and its caller to distinguish a
"signal 0" wakeup from the delayed_group_leader case (with no wakeup).
Signed-off-by: Roland McGrath <roland@redhat.com>
Tested-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://neil.brown.name/md:
md: raid10: wake up frozen array
md: do not count blocked devices as spares
md: do not progress the resync process if the stripe was blocked
md: delay notification of 'active_idle' to the recovery thread
md: fix merge error
md: move async_tx_issue_pending_all outside spin_lock_irq