Commit Graph

160439 Commits

Author SHA1 Message Date
Hendrik Brueckner
684d2fd48e [S390] kernel: Append scpdata to kernel boot command line
Append scpdata to the kernel boot command line. If scpdata starts
with the equal sign (=), the kernel boot command line is replaced.
(For consistency with zIPL and IPL PARM parameters.)

To use scpdata for the kernel boot command line, scpdata must consist
of ascii characters only. If scpdata contains other characters,
scpdata is not appended to the kernel boot command line.
In addition, re-IPL is extended for setting scpdata for the next
Linux reboot.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:46 +02:00
Frank Munzert
6292b9ef5a [S390] tape: use init_timer_on_stack() rather than init_timer()
With CONFIG_DEBUG_OBJECTS_TIMERS=y "chccwdev --online" for a tape device
will fail with message "ODEBUG: object is on stack, but not annotated".
We now use init_timer_on_stack.

Signed-off-by: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:46 +02:00
Sebastian Ott
c630493327 [S390] proper use of device register
Don't use kfree directly after device registration started.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:45 +02:00
Heiko Carstens
c48ff644f2 [S390] hibernation: merge files and move to kernel/
Merge the nearly empty C files and move everything from power/ to
kernel/. That way the files are easier to handle.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:45 +02:00
Heiko Carstens
f3d1263e81 [S390] hibernation: remove dead file
There is no caller of do_after_copyback() anywhere. Remove it.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:44 +02:00
Heiko Carstens
bfe3349b51 [S390] atomic ops: small cleanups
Couple of coding style fixes, replace __inline__ with inline and
remove #ifdef __KERNEL_- since the header file isn't exported.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:44 +02:00
Heiko Carstens
1275105851 [S390] atomic ops: add effecient atomic64 support for 31 bit
Use compare double and swap to implement efficient atomic64 ops for 31 bit.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:43 +02:00
Martin Schwidefsky
6ac2a4ddd1 [S390] improve mcount code
Move the 64 bit mount code from mcount.S into mcount64.S and avoid
code duplication.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:43 +02:00
Heiko Carstens
04efc3be76 [S390] convert/optimize csum_fold() to C
In the meantime gcc generates better code than the old inline
assemblies do. Original inline assembly results in:

lr	%r1,%r2
sr	%r3,%r3
lr	%r2,%r1
srdl	%r2,16
alr	%r2,%r3
alr	%r1,%r2
srl	%r1,16
xilf	%r1,65535
llghr	%r2,%r1
br	%r14

Out of the C code gcc generates this:

rll	%r1,%r2,16
ar	%r1,%r2
srl	%r1,16
xilf	%r1,65535
llghr	%r2,%r1
br	%r14

In addition we don't have any static register allocations anymore and
gcc is free to shuffle instructions around for better pipeline usage.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:43 +02:00
Heiko Carstens
05e7ff7da7 [S390] introduce get_clock_monotonic
Introduce get_clock_monotonic() function which can be used to get a
(fast) timestamp. Resolution is the same as for get_clock(). The
only difference is that the timestamps are monotonic and don't jump
backward or forward.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:42 +02:00
Stefan Haberland
ca99dab01d [S390] dasd: fix message naming
This patch fixes message naming so that generic dasd messages do not
contain the device discipline. For this purpose the dev_ makros are
replaced by pr_ makros for generic dasd messages.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:42 +02:00
Stefan Haberland
68b781fe1b [S390] dasd: optimize cpu usage in goodcase
remove unnecessary dbf call, remove string operations for magic

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:41 +02:00
Stefan Weinhuber
97f604b074 [S390] dasd: fail requests when device state is less then ready
A DASD device that is not ready or online has no defined disk layout,
so all requests that arrive in such a state need to be returned as
failed.

Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:41 +02:00
Sebastian Ott
3ac276f8cb [S390] cio: remove ccw_device init_name
We used the init_name to set the console ccw_device's name early
at the boot stage. This patch moves the name setting (for all ccw
devices) to the point where we actually register the device. At this
time we can do dynamic allocations and therefore use dev_set_name.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:41 +02:00
Sebastian Ott
3b554a14f4 [S390] cio: move final put_device to ccw_device_unregister
We use a test_and_clear_bit to prevent a device from being
unregistered twice. Unfortunately in this cases the "final"
put_device (from device_initialize) was issued more than once,
resulting in an use after free error. Fix this by moving this
put_device to ccw_device_unregister.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:40 +02:00
Sebastian Ott
6ee4fec6be [S390] cio: remove subchannel init_name
We used the init_name to set the console subchannels name early
at the boot stage. With the patch cio: fix memleak in subchannel validation
we moved the name setting to the point where we actually register the
console subchannel. At this time we can do dynamic allocations and therefore
use dev_set_name.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:40 +02:00
Sebastian Ott
ab6aae0902 [S390] cio: fix memleak in subchannel validation
When scanning for new subchannels we have a code path where we allocate
memory for a struct subchannel, set the device name (which is dynamically
allocated now) and do a check if the underlying device is blacklisted - if
so we free the subchannel structure.
Since we have not set up refcounting at this stage, the device name's memory
is lost. Fix this by moving the dev_set_name after the blacklist test.

Note: With this patch the init_name for the console subchannel becomes
virtually obsolete.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:39 +02:00
Sebastian Ott
f014824ee7 [S390] cio: fix use after free in s390 debug feature
When using s390dbf with "%s" in sprintf format strings the string itself
is not copied to the dbf buffer.
Since in this case only pointers are stored in the s390dbf, we should
not use dev_name - which is bound to the lifetime of the device.
Reading this entry from s390dbf after the device was released will cause
an use after free error.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:39 +02:00
Jan Glauber
3f09bb8965 [S390] qdio: remove limited number of debugfs entries
The number of qdio debugfs entries was limited. Remove this limit
and group the queue files in a per device directory.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:39 +02:00
Michael Ernst
217ee6c64a [S390] cio: failing set online/offline processing.
When unit checks trigger sensing the device state is set to W4SENSE
until sense completion; then the device state is set back to
ONLINE. If a unit check occurs while set online or set offline
requests are processed then it might happen that the device's
temporary W4SENSE state causes these functions to terminate,
leaving the device in an inconsistent state when the state is set
back to ONLINE later on so that the device cannot be set online or
offline any longer.
To solve this, set online/offline and related rollback or error
routines are processed only if the device is in a final or
DISCONNECTED state.

Signed-off-by: Michael Ernst <mernst@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:38 +02:00
Sebastian Ott
be7a2ddce6 [S390] cio: ensure to hold a reference for deferred deregistration
Ensure to always hold an extra device reference for scheduling a
subchannel deregistration, by moving the get_device to
ccw_device_schedule_sch_unregister. This fixes an use after free
error in ccw_device_call_sch_unregister where put_device was called
on an already freed device structure.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:38 +02:00
Jan Glauber
e2910bcf8c [S390] qdio: continue polling if the queue is not finished
With commit c38f960809 polling was
stopped for the queue even if new data is available.

Return immediately after scheduling the queue tasklet if the queue
is not done.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:37 +02:00
Sebastian Ott
efd986db2d [S390] cio: increase trace level
Move debug traces for start I/O and interrupt events to exclusive
trace levels. Also change tracing in hot-path from sprintf (costly)
to hex.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:37 +02:00
Sebastian Ott
626e476ae0 [S390] cio: fix not oper handling after failed [on|off]line processing
If online/offline processing of a ccw device fails, resulting in not
operational state, notify the driver and unregister the device in case
the driver dosn't want to keep it.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:37 +02:00
Peter Oberparleiter
1da73bc80b [S390] cio: consolidate subchannel intparm reset
Ensure that the hardware interruption parameter for a subchannel is
reset when the associated subchannel data structure is freed.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:36 +02:00
Heiko Carstens
62733e5a5a [S390] cio: move scsw helper functions to header file
All scsw helper functions are very short and usage of them shouldn't
result in function calls. Therefore we move them to a separate header
file.
Also saves a lot of EXPORT_SYMBOLs.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:36 +02:00
Peter Oberparleiter
1f1148c88a [S390] cio: fix ineffective verify event
Path verification events occurring for offline devices are currently
ignored. As a result, offline devices are not removed, even though
they might no longer be accessible (for example because the last path
to the device was varied offline). Fix this by scheduling a status
evaluation for the affected subchannel when a path verification event
occurs.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-09-11 10:29:36 +02:00
Jens Axboe
500b067c5e writeback: check for registered bdi in flusher add and inode dirty
Also a debugging aid. We want to catch dirty inodes being added to
backing devices that don't do writeback.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-11 09:20:26 +02:00
Jens Axboe
d993831fa7 writeback: add name to backing_dev_info
This enables us to track who does what and print info. Its main use
is catching dirty inodes on the default_backing_dev_info, so we can
fix that up.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-11 09:20:26 +02:00
Jens Axboe
f09b00d3e7 writeback: add some debug inode list counters to bdi stats
Add some debug entries to be able to inspect the internal state of
the writeback details.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-11 09:20:26 +02:00
Jens Axboe
d0bceac747 writeback: get rid of pdflush completely
It is now unused, so kill it off.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-11 09:20:25 +02:00
Jens Axboe
03ba3782e8 writeback: switch to per-bdi threads for flushing data
This gets rid of pdflush for bdi writeout and kupdated style cleaning.
pdflush writeout suffers from lack of locality and also requires more
threads to handle the same workload, since it has to work in a
non-blocking fashion against each queue. This also introduces lumpy
behaviour and potential request starvation, since pdflush can be starved
for queue access if others are accessing it. A sample ffsb workload that
does random writes to files is about 8% faster here on a simple SATA drive
during the benchmark phase. File layout also seems a LOT more smooth in
vmstat:

 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  1      0 608848   2652 375372    0    0     0 71024  604    24  1 10 48 42
 0  1      0 549644   2712 433736    0    0     0 60692  505    27  1  8 48 44
 1  0      0 476928   2784 505192    0    0     4 29540  553    24  0  9 53 37
 0  1      0 457972   2808 524008    0    0     0 54876  331    16  0  4 38 58
 0  1      0 366128   2928 614284    0    0     4 92168  710    58  0 13 53 34
 0  1      0 295092   3000 684140    0    0     0 62924  572    23  0  9 53 37
 0  1      0 236592   3064 741704    0    0     4 58256  523    17  0  8 48 44
 0  1      0 165608   3132 811464    0    0     0 57460  560    21  0  8 54 38
 0  1      0 102952   3200 873164    0    0     4 74748  540    29  1 10 48 41
 0  1      0  48604   3252 926472    0    0     0 53248  469    29  0  7 47 45

where vanilla tends to fluctuate a lot in the creation phase:

 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 1  1      0 678716   5792 303380    0    0     0 74064  565    50  1 11 52 36
 1  0      0 662488   5864 319396    0    0     4   352  302   329  0  2 47 51
 0  1      0 599312   5924 381468    0    0     0 78164  516    55  0  9 51 40
 0  1      0 519952   6008 459516    0    0     4 78156  622    56  1 11 52 37
 1  1      0 436640   6092 541632    0    0     0 82244  622    54  0 11 48 41
 0  1      0 436640   6092 541660    0    0     0     8  152    39  0  0 51 49
 0  1      0 332224   6200 644252    0    0     4 102800  728    46  1 13 49 36
 1  0      0 274492   6260 701056    0    0     4 12328  459    49  0  7 50 43
 0  1      0 211220   6324 763356    0    0     0 106940  515    37  1 10 51 39
 1  0      0 160412   6376 813468    0    0     0  8224  415    43  0  6 49 45
 1  1      0  85980   6452 886556    0    0     4 113516  575    39  1 11 54 34
 0  2      0  85968   6452 886620    0    0     0  1640  158   211  0  0 46 54

A 10 disk test with btrfs performs 26% faster with per-bdi flushing. A
SSD based writeback test on XFS performs over 20% better as well, with
the throughput being very stable around 1GB/sec, where pdflush only
manages 750MB/sec and fluctuates wildly while doing so. Random buffered
writes to many files behave a lot better as well, as does random mmap'ed
writes.

A separate thread is added to sync the super blocks. In the long term,
adding sync_supers_bdi() functionality could get rid of this thread again.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-11 09:20:25 +02:00
Jens Axboe
66f3b8e2e1 writeback: move dirty inodes from super_block to backing_dev_info
This is a first step at introducing per-bdi flusher threads. We should
have no change in behaviour, although sb_has_dirty_inodes() is now
ridiculously expensive, as there's no easy way to answer that question.
Not a huge problem, since it'll be deleted in subsequent patches.

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-11 09:20:25 +02:00
Jens Axboe
d8a8559cd7 writeback: get rid of generic_sync_sb_inodes() export
This adds two new exported functions:

- writeback_inodes_sb(), which only attempts to writeback dirty inodes on
  this super_block, for WB_SYNC_NONE writeout.
- sync_inodes_sb(), which writes out all dirty inodes on this super_block
  and also waits for the IO to complete.

Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-11 09:20:25 +02:00
Marcin Slusarz
c984123c7a pata_rz1000: use printk_once
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-09-11 02:33:59 -04:00
Shane Huang
78d5ae39af ahci: kill @force_restart and refine CLO for ahci_kick_engine()
This patch refines ahci_kick_engine() after discussion with Tejun about
FBS(FIS-based switching) support preparation:
a. Kill @force_restart and always kick the engine. The only case where
   @force_restart is zero is when it's called from ahci_p5wdh_hardreset()
   Actually at that point, BSY is pretty much guaranteed to be set.
b. If PMP is attached, ignore busy and always do CLO. (AHCI-1.3 9.2)

Signed-off-by: Shane Huang <shane.huang@amd.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-09-11 02:31:52 -04:00
Otavio Salvador
02cb009bb9 pata_cs5535: add pci id for AMD based CS5535 controllers
Signed-off-by: Otavio Salvador <otavio@ossystems.com.br>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-09-11 02:31:31 -04:00
Shane Huang
e2dd90b1ad ahci: Add AMD SB900 SATA/IDE controller device IDs
Add AMD SB900 SATA/IDE controller device IDs.

Signed-off-by: Shane Huang <shane.huang@amd.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-09-11 02:31:27 -04:00
Julia Lawall
041b5eac25 drivers/ata: use resource_size
Use the function resource_size, which reduces the chance of introducing
off-by-one errors in calculating the resource size.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
struct resource *res;
@@

- (res->end - res->start) + 1
+ resource_size(res)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-09-11 02:25:58 -04:00
Roland Dreier
73f526da02 Merge branch 'mad' into for-linus
Conflicts:
	drivers/infiniband/core/mad.c
2009-09-10 21:19:45 -07:00
Roland Dreier
45c448a1c0 Merge branches 'cxgb3', 'ehca', 'ipath', 'ipoib', 'misc', 'mlx4', 'mthca' and 'nes' into for-linus 2009-09-10 21:18:07 -07:00
David S. Miller
9a0da0d19c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2009-09-10 18:17:09 -07:00
Ben Hutchings
5367b6887e x86: Fix code patching for paravirt-alternatives on 486
As reported in <http://bugs.debian.org/511703> and
<http://bugs.debian.org/515982>, kernels with paravirt-alternatives
enabled crash in text_poke_early() on at least some 486-class
processors.

The problem is that text_poke_early() itself uses inline functions
affected by paravirt-alternatives and so will modify instructions that
have already been prefetched.  Pentium and later processors will
invalidate the prefetched instructions in this case, but 486-class
processors do not.

Change sync_core() to limit prefetching on 486-class (and 386-class)
processors, and move the call to sync_core() above the call to the
modifiable local_irq_restore().

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
LKML-Reference: <1252547631.3423.134.camel@localhost>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-09-10 16:50:19 -07:00
James Morris
a3c8b97396 Merge branch 'next' into for-linus 2009-09-11 08:04:49 +10:00
Geert Uytterhoeven
0d03d59d9b md: Fix "strchr" [drivers/md/dm-log-userspace.ko] undefined!
Commit b8313b6da7 ("dm log: remove incorrect
field from userspace table output") added a call to strstr() with a
single-character "needle" string parameter.

Unfortunately some versions of gcc replace such calls to strstr() by calls
to strchr() behind our back.  This causes linking errors if strchr() is
defined as an inline function in <asm/string.h> (e.g. on m68k):

| WARNING: "strchr" [drivers/md/dm-log-userspace.ko] undefined!

Avoid this by explicitly calling strchr() instead.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-09-10 14:55:01 -07:00
Ingo Molnar
e1f8450854 sched: Fix sched::sched_stat_wait tracepoint field
This weird perf trace output:

  cc1-9943  [001]  2802.059479616: sched_stat_wait: task: as:9944 wait: 2801938766276 [ns]

Is caused by setting one component field of the delta to zero
a bit too early. Move it to later.

( Note, this does not affect the NEW_FAIR_SLEEPERS interactivity bug,
  it's just a reporting bug in essence. )

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nikos Chantziaras <realnc@arcor.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <4AA93D34.8040500@arcor.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-10 20:52:54 +02:00
Jeremy Fitzhardinge
78c86e5e56 x86: split __phys_addr out into separate file
Split __phys_addr out into its own file so we can disable
-fstack-protector in a fine-grained fashion.  Also it doesn't
have terribly much to do with the rest of ioremap.c.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
2009-09-10 11:48:55 -07:00
Ingo Molnar
3f2aa307c4 sched: Disable NEW_FAIR_SLEEPERS for now
Nikos Chantziaras and Jens Axboe reported that turning off
NEW_FAIR_SLEEPERS improves desktop interactivity visibly.

Nikos described his experiences the following way:

  " With this setting, I can do "nice -n 19 make -j20" and
    still have a very smooth desktop and watch a movie at
    the same time.  Various other annoyances (like the
    "logout/shutdown/restart" dialog of KDE not appearing
    at all until the background fade-out effect has finished)
    are also gone.  So this seems to be the single most
    important setting that vastly improves desktop behavior,
    at least here. "

Jens described it the following way, referring to a 10-seconds
xmodmap scheduling delay he was trying to debug:

  " Then I tried switching NO_NEW_FAIR_SLEEPERS on, and then
    I get:

    Performance counter stats for 'xmodmap .xmodmap-carl':

         9.009137  task-clock-msecs         #      0.447 CPUs
               18  context-switches         #      0.002 M/sec
                1  CPU-migrations           #      0.000 M/sec
              315  page-faults              #      0.035 M/sec

    0.020167093  seconds time elapsed

    Woot! "

So disable it for now. In perf trace output i can see weird
delta timestamps:

  cc1-9943  [001]  2802.059479616: sched_stat_wait: task: as:9944 wait: 2801938766276 [ns]

That nsec field is not supposed to be that large. More digging
is needed - but lets turn it off while the real bug is found.

Reported-by: Nikos Chantziaras <realnc@arcor.de>
Tested-by: Nikos Chantziaras <realnc@arcor.de>
Reported-by: Jens Axboe <jens.axboe@oracle.com>
Tested-by: Jens Axboe <jens.axboe@oracle.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <4AA93D34.8040500@arcor.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-10 20:34:48 +02:00
David S. Miller
b73d884756 sparc64: Initial niagara2 perf counter support.
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-10 07:42:02 -07:00
David S. Miller
660d13765f sparc64: Perf counter 'nop' event is not constant.
On Niagara-2, for example, it's going to be different.  So make
it something specified in sparc_pmu.

Signed-off-by: David S. Miller <davem@davemloft.net>
2009-09-10 07:13:26 -07:00