On Thu, Jul 24, 2008 at 03:43:44PM -0700, Linus Torvalds wrote:
> So how about this patch as a starting point? This is the RightThing(tm) to
> do regardless, and if it then makes it easier to do some other cleanups,
> we should do it first. What do you think?
restore_fpu_checking() calls init_fpu() in error conditions.
While this is wrong(as our main intention is to clear the fpu state of
the thread), this was benign before commit 92d140e21f ("x86: fix taking
DNA during 64bit sigreturn").
Post commit 92d140e21f, live FPU registers may not belong to this
process at this error scenario.
In the error condition for restore_fpu_checking() (especially during the
64bit signal return), we are doing init_fpu(), which saves the live FPU
register state (possibly belonging to some other process context) into
the thread struct (through unlazy_fpu() in init_fpu()). This is wrong
and can leak the FPU data.
For the signal handler restore error condition in restore_i387(), clear
the fpu state present in the thread struct(before ultimately sending a
SIGSEGV for badframe).
For the paranoid error condition check in math_state_restore(), send a
SIGSEGV, if we fail to restore the state.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: <stable@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This fixes footbridge_defconfig:
drivers/pnp/resource.c: In function 'pci_dev_uses_irq':
drivers/pnp/resource.c:317: error: implicit declaration of function 'pci_get_legacy_ide_irq'
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
arm's fls() is implemented as a macro, causing it to misbehave when passed
64-bit arguments. Fix.
Cc: Nickolay Vinogradov <nickolay@protei.ru>
Tested-by: Krzysztof Halasa <khc@pm.waw.pl>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Calculating the number of pages from given address and length numbers is a task
required in multiple IOMMU implementations. So implement this as a generic
function into the IOMMU helper code.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: iommu@lists.linux-foundation.org
Cc: bhavna.sarathy@amd.com
Cc: robert.richter@amd.com
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86 kernels on IBM Summit based systems will only online 1 CPU because the
phys_cpu_present_map is not set up correctly. Patch below applied to
2.6.26-git10.
Signed-off-by: Chris McDermott <lcm@linux.vnet.ibm.com>
Tested-by: Tim Pepper <lnxninga@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix build bug introduced by 95b68dec0d "calgary iommu: use the first
kernels TCE tables in kdump":
arch/x86/kernel/built-in.o: In function `calgary_iommu_init':
(.init.text+0x8399): undefined reference to `elfcorehdr_addr'
arch/x86/kernel/built-in.o: In function `calgary_iommu_init':
(.init.text+0x856c): undefined reference to `elfcorehdr_addr'
arch/x86/kernel/built-in.o: In function `detect_calgary':
(.init.text+0x8c68): undefined reference to `elfcorehdr_addr'
arch/x86/kernel/built-in.o: In function `detect_calgary':
(.init.text+0x8d0c): undefined reference to `elfcorehdr_addr'
make elfcorehdr_addr a generally available symbol.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Removes legacy reinvent-the-wheel type thing. The generic
machinery integrates much better to automated debugging aids
such as kerneloops.org (and others), and is unambiguous due to
better naming. Non-intuively BUG_TRAP() is actually equal to
WARN_ON() rather than BUG_ON() though some might actually be
promoted to BUG_ON() but I left that to future.
I could make at least one BUILD_BUG_ON conversion.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for populating an SPI bus based on data in the
OF device tree. This is useful for powerpc platforms which use the
device tree instead of discrete code for describing platform layout.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
spi_new_device() allocates and registers an spi device all in one swoop.
If the driver needs to add extra data to the spi_device before it is
registered, then this causes problems. This is needed for OF device
tree support so that the SPI device tree helper can add a pointer to
the device node after the device is allocated, but before the device
is registered. OF aware SPI devices can then retrieve data out of the
device node to populate a platform data structure.
This patch splits the allocation and registration portions of code out
of spi_new_device() and creates two new functions; spi_alloc_device()
and spi_register_device(). spi_new_device() is modified to use the new
functions for allocation and registration. None of the existing users
of spi_new_device() should be affected by this change.
Drivers using the new API can forego the use of spi_board_info
structure to describe the device layout and populate data into the
spi_device structure directly.
This change is in preparation for adding an OF device tree parser to
generate spi_devices based on data in the device tree.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
SPI has a similar problem as I2C in that it needs to determine an
appropriate modalias value for each device node. This patch adapts
the of_i2c of_find_i2c_driver() function to be usable by of_spi also.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Introduced by commit aaca0bdca5 ("flag
parameters: paccept"):
net/socket.c:1515:17: error: symbol 'sys_paccept' redeclared with different type (originally declared at include/linux/syscalls.h:413) - incompatible argument 4 (different address spaces)
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This wires up the recently added Wire up signalfd4, eventfd2,
epoll_create1, dup3, pipe2, and inotify_init1 system calls.
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to the addition of __attribute__((__cold__)) to a few symbols
without adjusting the linker scripts, those symbols currently may end
up outside the [_stext,_etext) range, as they get placed in
.text.unlikely by (at least) gcc 4.3.0. This may confuse code not only
outside of the kernel, symbol_put_addr()'s BUG() could also trigger.
Hence we need to add .text.unlikely (and for future uses of
__attribute__((__hot__)) also .text.hot) to the TEXT_TEXT() macro.
Issue observed by Lukas Lipavsky.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Tested-by: Lukas Lipavsky <llipavsky@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Move it to the top-level file to decide if we install/check
the generic headers or the arch specific headers.
This revealed a long standing bug where "make headers_check_all"
relied on the files in asm/ for the current architecture.
So make headers_check_all is now broken by this commit.
In addition:
o add a simpler way to detect if an arch support
exporting header files.
o add 'set -e;' so we error out early if
make headers_check_all fails.
o add sparc64 and cris to arch we do not process
in make headers_*_all because:
sparc64 - use sparc to export headers
cris - is know seriously broken
Includes suggestions from: David Woodhouse
<dwmw2@infradead.org>.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: David Woodhouse <dwmw2@infradead.org>
These got unintentionally moved, put them back as x86 provides its own
versions.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes the dummy asm/kvm.h files on architectures not (yet)
supporting KVM and uses the same conditional headers installation as
already used for a.out.h .
Also removed are superfluous install rules in the s390 and x86 Kbuild
files (they are already in Kbuild.asm).
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (34 commits)
powerpc: Wireup new syscalls
Move update_mmu_cache() declaration from tlbflush.h to pgtable.h
powerpc/pseries: Remove kmalloc call in handling writes to lparcfg
powerpc/pseries: Update arch vector to indicate support for CMO
ibmvfc: Add support for collaborative memory overcommit
ibmvscsi: driver enablement for CMO
ibmveth: enable driver for CMO
ibmveth: Automatically enable larger rx buffer pools for larger mtu
powerpc/pseries: Verify CMO memory entitlement updates with virtual I/O
powerpc/pseries: vio bus support for CMO
powerpc/pseries: iommu enablement for CMO
powerpc/pseries: Add CMO paging statistics
powerpc/pseries: Add collaborative memory manager
powerpc/pseries: Utilities to set firmware page state
powerpc/pseries: Enable CMO feature during platform setup
powerpc/pseries: Split retrieval of processor entitlement data into a helper routine
powerpc/pseries: Add memory entitlement capabilities to /proc/ppc64/lparcfg
powerpc/pseries: Split processor entitlement retrieval and gathering to helper routines
powerpc/pseries: Remove extraneous error reporting for hcall failures in lparcfg
powerpc: Fix compile error with binutils 2.15
...
Fixed up conflict in arch/powerpc/platforms/52xx/Kconfig manually.
Preliminary support for the Intel 5100 MCH. CE and UE errors are reported
along with the current DIMM label information and other memory parameters.
Reasons why this is preliminary:
1) This chip has 2 independent memory controllers which, for best
perforance, use interleaved accesses to the DDR2 memory. This
architecture does not map very well to the current edac data structures
which depend on symmetric channel access to the interleaved data.
Without core changes, the best I could do for now is to map both memory
controllers to different csrows (first all ranks of controller 0, then
all ranks of controller 1). Someone much more familiar with the edac
core than I will probably need to come up with a more general data
structure to handle the interleaving and de-interleaving of the two
memory controllers.
2) I have not yet tackled the de-interleaving of the rank/controller
address space into the physical address space of the CPU. There is
nothing fundamentally missing, it is just ending up to be a lot of
code, and I'd rather keep it separate for now, esp since it doesn't
work yet...
3) The code depends on a particular i5100 chip select to DIMM mainboard
chip select mapping. This mapping seems obvious to me in order to
support dual and single ranked memory, but it is not unique and DIMM
labels could be wrong on other mainboards. There is no way to query
this mapping that I know of.
4) The code requires that the i5100 is in 32GB mode. Only 4 ranks per
controller, 2 ranks per DIMM are supported. I do not have hardware
(nor do I expect to have hardware anytime soon) for the 48GB (6 ranks
per controller) mode.
5) The serial presence detect code should be broken out into a "real"
i2c driver so that decode-dimms.pl can work.
Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Implement the get_parent export operation by sending a LOOKUP request with
".." as the name.
Implement looking up an inode by node ID after it has been evicted from
the cache. This is done by seding a LOOKUP request with "." as the name
(for all file types, not just directories).
The filesystem can set the FUSE_EXPORT_SUPPORT flag in the INIT reply, to
indicate that it supports these special lookups.
Thanks to John Muir for the original implementation of this feature.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: David Teigland <teigland@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use a special error value FILE_LOCK_DEFERRED to mean that a locking
operation returned asynchronously. This is returned by
posix_lock_file() for sleeping locks to mean that the lock has been
queued on the block list, and will be woken up when it might become
available and needs to be retried (either fl_lmops->fl_notify() is
called or fl_wait is woken up).
f_op->lock() to mean either the above, or that the filesystem will
call back with fl_lmops->fl_grant() when the result of the locking
operation is known. The filesystem can do this for sleeping as well
as non-sleeping locks.
This is to make sure, that return values of -EAGAIN and -EINPROGRESS by
filesystems are not mistaken to mean an asynchronous locking.
This also makes error handling in fs/locks.c and lockd/svclock.c slightly
cleaner.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: David Teigland <teigland@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add members for memory reclaim delay to taskstats, and accumulate them in
__delayacct_add_tsk() .
Signed-off-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp>
Cc: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sometimes, application responses become bad under heavy memory load.
Applications take a bit time to reclaim memory. The statistics, how long
memory reclaim takes, will be useful to measure memory usage.
This patch adds accounting memory reclaim to per-task-delay-accounting for
accounting the time of do_try_to_free_pages().
<i.e>
- When System is under low memory load,
memory reclaim may not occur.
$ free
total used free shared buffers cached
Mem: 8197800 1577300 6620500 0 4808 1516724
-/+ buffers/cache: 55768 8142032
Swap: 16386292 0 16386292
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 5069748 10612 3014060 0 0 0 0 3 26 0 0 100 0
0 0 0 5069748 10612 3014060 0 0 0 0 4 22 0 0 100 0
0 0 0 5069748 10612 3014060 0 0 0 0 3 18 0 0 100 0
Measure the time of tar command.
$ ls -s test.dat
1501472 test.dat
$ time tar cvf test.tar test.dat
real 0m13.388s
user 0m0.116s
sys 0m5.304s
$ ./delayget -d -p <pid>
CPU count real total virtual total delay total
428 5528345500 5477116080 62749891
IO count delay total
338 8078977189
SWAP count delay total
0 0
RECLAIM count delay total
0 0
- When system is under heavy memory load
memory reclaim may occur.
$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 7159032 49724 1812 3012 0 0 0 0 3 24 0 0 100 0
0 0 7159032 49724 1812 3012 0 0 0 0 4 24 0 0 100 0
0 0 7159032 49848 1812 3012 0 0 0 0 3 22 0 0 100 0
In this case, one process uses more 8G memory
by execution of malloc() and memset().
$ time tar cvf test.tar test.dat
real 1m38.563s <- increased by 85 sec
user 0m0.140s
sys 0m7.060s
$ ./delayget -d -p <pid>
CPU count real total virtual total delay total
9021 7140446250 7315277975 923201824
IO count delay total
8965 90466349669
SWAP count delay total
3 21036367
RECLAIM count delay total
740 61011951153
In the later case, the value of RECLAIM is increasing.
So, taskstats can show how much memory reclaim influences TAT.
Signed-off-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujistu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Report per-thread I/O statistics in /proc/pid/task/tid/io and aggregate
parent I/O statistics in /proc/pid/io. This approach follows the same
model used to account per-process and per-thread CPU times.
As a practial application, this allows for example to quickly find the top
I/O consumer when a process spawns many child threads that perform the
actual I/O work, because the aggregated I/O statistics can always be found
in /proc/pid/io.
[ Oleg Nesterov points out that we should check that the task is still
alive before we iterate over the threads, but also says that we can do
that fixup on top of this later. - Linus ]
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: Matt Heaton <matt@hostmonster.com>
Cc: Shailabh Nagar <nagar@watson.ibm.com>
Acked-by-with-comments: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allocate the structure on the first call to sys_acct(). After this each
namespace, that ordered the accounting, will live with this structure till
its own death.
Two notes
- routines, that close the accounting on fs umount time use
the init_pid_ns's acct by now;
- accounting routine accounts to dying task's namespace
(also by now).
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All the bsdacct-related info will be stored in the area, pointer by this
one.
It will be NULL automatically for all new namespaces.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adapt acct_update_integrals() to include user time when calculating the time
difference. The units of acct_rss_mem1 and acct_vm_mem1 are also changed from
pages-jiffies to pages-usecs to avoid calling jiffies_to_usecs() in
xacct_add_tsk() which might overflow.
Signed-off-by: Jonathan Lim <jlim@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It seems to me that it was a mistake marking this function as deprecated
and scheduling it for removal, rather than resolutely removing it after
the last caller's death.
Anyway - better late, then never.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This one had the only users so far - the kill_proc, which is removed, so
drop this (invalid in namespaced world) call too.
And of course - erase all references on it from comments.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This function operated on a pid_t to kill a task, which is no longer valid
in a containerized system.
It has finally lost all its users and we can safely remove it from the
tree.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When struct pid is built on a 64 bit platform gcc has to insert padding to
maintain the correct alignment, by simply reordering its members the
memory usage shrinks from 88 bytes to 80.
I've successfully run with this patch on my desktop AMD64 machine.
There are no significant kernel size changes to a default config.X86_64
on the latest git v2.6.26-rc1
text data bss dec hex filename
5404828 976760 734280 7115868 6c945c vmlinux
5404811 976760 734280 7115851 6c944b vmlinux.pid-patch
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds proper prototypes for pid{hash,map}_init() in
include/linux/pid_namespace.h
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Current two-stage scheme of removing PDE emphasizes one bug in proc:
open
rmmod
remove_proc_entry
close
->release won't be called because ->proc_fops were cleared. In simple
cases it's small memory leak.
For every ->open, ->release has to be done. List of openers is introduced
which is traversed at remove_proc_entry() if neeeded.
Discussions with Al long ago (sigh).
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch moves the extern of struct proc_kmsg_operations to
fs/proc/internal.h and adds an #include "internal.h" to fs/proc/kmsg.c
so that the latter sees the former.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ELF_CORE_EFLAGS is already used by the binfmt_elf coredumper to set correct
arch specific ELF header flags on coredumps. Use it for kcore dumps as well.
At the moment, this affects the CRIS and the H8300 arch.
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch proposes an alternative to the "magical
positive-versus-negative number trick" Andrew complained about last week
in http://lkml.org/lkml/2008/6/24/418.
This had been introduced with the patches that scale msgmni to the amount
of lowmem. With these patches, msgmni has a registered notification
routine that recomputes msgmni value upon memory add/remove or ipc
namespace creation/ removal.
When msgmni is changed from user space (i.e. value written to the proc
file), that notification routine is unregistered, and the way to make it
registered back is to write a negative value into the proc file. This is
the "magical positive-versus-negative number trick".
To fix this, a new proc file is introduced: /proc/sys/kernel/auto_msgmni.
This file acts as ON/OFF for msgmni automatic recomputing.
With this patch, the process is the following:
1) kernel boots in "automatic recomputing mode"
/proc/sys/kernel/msgmni contains the value that has been computed (depends
on lowmem)
/proc/sys/kernel/automatic_msgmni contains "1"
2) echo <val> > /proc/sys/kernel/msgmni
. sets msg_ctlmni to <val>
. de-activates automatic recomputing (i.e. if, say, some memory is added
msgmni won't be recomputed anymore)
. /proc/sys/kernel/automatic_msgmni now contains "0"
3) echo "0" > /proc/sys/kernel/automatic_msgmni
. de-activates msgmni automatic recomputing
this has the same effect as 2) except that msg_ctlmni's value stays
blocked at its current value)
3) echo "1" > /proc/sys/kernel/automatic_msgmni
. recomputes msgmni's value based on the current available memory size
and number of ipc namespaces
. re-activates automatic recomputing for msgmni.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The attached patch:
- reverses the locking order of ulp->lock and sem_lock:
Previously, it was first ulp->lock, then inside sem_lock.
Now it's the other way around.
- converts the undo structure to rcu.
Benefits:
- With the old locking order, IPC_RMID could not kfree the undo structures.
The stale entries remained in the linked lists and were released later.
- The patch fixes a a race in semtimedop(): if both IPC_RMID and a semget() that
recreates exactly the same id happen between find_alloc_undo() and sem_lock,
then semtimedop() would access already kfree'd memory.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reviewed-by: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make idr_find rcu-safe: it can now be called inside an rcu_read critical
section.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
Reviewed-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Jim Houston <jim.houston@comcast.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Do some code factorization in the return code analysis.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Jim Houston <jim.houston@comcast.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After scalability problems have been detected when using the sysV ipcs, I have
proposed to use an RCU based implementation of the IDR api instead (see
threads http://lkml.org/lkml/2008/4/11/212 and
http://lkml.org/lkml/2008/4/29/295).
This resulted in many people asking to convert the idr API and make it rcu
safe (because most of the code was duplicated and thus unmaintanable and
unreviewable).
So here is a first attempt.
The important change wrt to the idr API itself is during idr removes: idr
layers are freed after a grace period, instead of being moved to the free
list.
The important change wrt to ipcs, is that idr_find() can now be called
locklessly inside a rcu read critical section.
Here are the results I've got for the pmsg test sent by Manfred:
2.6.25-rc3-mm1 2.6.25-rc3-mm1+ 2.6.25-mm1 Patched 2.6.25-mm1
1 1168441 1064021 876000 947488
2 1094264 921059 1549592 1730685
3 2082520 1738165 1694370 2324880
4 2079929 1695521 404553 2400408
5 2898758 406566 391283 3246580
6 2921417 261275 263249 3752148
7 3308761 126056 191742 4243142
8 3329456 100129 141722 4275780
1st column: stock 2.6.25-rc3-mm1
2nd column: 2.6.25-rc3-mm1 + ipc patches (store ipcs into idrs)
3nd column: stock 2.6.25-mm1
4th column: 2.6.25-mm1 + this pacth series.
This patch:
Add an rcu_head to the idr_layer structure in order to free it after a grace
period.
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
Reviewed-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Jim Houston <jim.houston@comcast.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kdump kernel fails to boot with calgary iommu and aacraid driver on a x366
box. The ongoing dma's of aacraid from the first kernel continue to exist
until the driver is loaded in the kdump kernel. Calgary is initialized
prior to aacraid and creation of new tce tables causes wrong dma's to
occur. Here we try to get the tce tables of the first kernel in kdump
kernel and use them. While in the kdump kernel we do not allocate new tce
tables but instead read the base address register contents of calgary
iommu and use the tables that the registers point to. With these changes
the kdump kernel and hence aacraid now boots normally.
Signed-off-by: Chandru Siddalingappa <chandru@in.ibm.com>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
workqueue_cpu_callback(CPU_DEAD) flushes cwq->thread under
cpu_maps_update_begin(). This means that the multithreaded workqueues
can't use get_online_cpus() due to the possible deadlock, very bad and
very old problem.
Introduce the new state, CPU_POST_DEAD, which is called after
cpu_hotplug_done() but before cpu_maps_update_done().
Change workqueue_cpu_callback() to use CPU_POST_DEAD instead of CPU_DEAD.
This means that create/destroy functions can't rely on get_online_cpus()
any longer and should take cpu_add_remove_lock instead.
[akpm@linux-foundation.org: fix CONFIG_SMP=n]
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Paul Menage <menage@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Most of users of flush_workqueue() can be changed to use cancel_work_sync(),
but sometimes we really need to wait for the completion and cancelling is not
an option. schedule_on_each_cpu() is good example.
Add the new helper, flush_work(work), which waits for the completion of the
specific work_struct. More precisely, it "flushes" the result of of the last
queue_work() which is visible to the caller.
For example, this code
queue_work(wq, work);
/* WINDOW */
queue_work(wq, work);
flush_work(work);
doesn't necessary work "as expected". What can happen in the WINDOW above is
- wq starts the execution of work->func()
- the caller migrates to another CPU
now, after the 2nd queue_work() this work is active on the previous CPU, and
at the same time it is queued on another. In this case flush_work(work) may
return before the first work->func() completes.
It is trivial to add another helper
int flush_work_sync(struct work_struct *work)
{
return flush_work(work) || wait_on_work(work);
}
which works "more correctly", but it has to iterate over all CPUs and thus
it much slower than flush_work().
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Max Krasnyansky <maxk@qualcomm.com>
Acked-by: Jarek Poplawski <jarkao2@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that we have core_state->dumper list we can use it to wake up the
sub-threads waiting for the coredump completion.
This uglifies the code and .text grows by 47 bytes, but otoh mm_struct
lessens by sizeof(struct completion). Also, with this change we can
decouple exit_mm() from the coredumping code.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
binfmt->core_dump() has to iterate over the all threads in system in order
to find the coredumping threads and construct the list using the
GFP_ATOMIC allocations.
With this patch each thread allocates the list node on exit_mm()'s stack and
adds itself to the list.
This allows us to do further changes:
- simplify ->core_dump()
- change exit_mm() to clear ->mm first, then wait for ->core_done.
this makes the coredumping process visible to oom_kill
- kill mm->core_done
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Turn core_state->nr_threads into atomic_t and kill now unneeded
down_write(&mm->mmap_sem) in exit_mm().
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move mm->core_waiters into "struct core_state" allocated on stack. This
shrinks mm_struct a little bit and allows further changes.
This patch mostly does s/core_waiters/core_state. The only essential
change is that coredump_wait() must clear mm->core_state before return.
The coredump_wait()'s path is uglified and .text grows by 30 bytes, this
is fixed by the next patch.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm->core_startup_done points to "struct completion startup_done" allocated
on the coredump_wait()'s stack. Introduce the new structure, core_state,
which holds this "struct completion". This way we can add more info
visible to the threads participating in coredump without enlarging
mm_struct.
No changes in affected .o files.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kill PF_BORROWED_MM. Change use_mm/unuse_mm to not play with ->flags, and
do s/PF_BORROWED_MM/PF_KTHREAD/ for a couple of other users.
No functional changes yet. But this allows us to do further
fixes/cleanups.
oom_kill/ptrace/etc often check "p->mm != NULL" to filter out the
kthreads, this is wrong because of use_mm(). The problem with
PF_BORROWED_MM is that we need task_lock() to avoid races. With this
patch we can check PF_KTHREAD directly, or use a simple lockless helper:
/* The result must not be dereferenced !!! */
struct mm_struct *__get_task_mm(struct task_struct *tsk)
{
if (tsk->flags & PF_KTHREAD)
return NULL;
return tsk->mm;
}
Note also ecard_task(). It runs with ->mm != NULL, but it's the kernel
thread without PF_BORROWED_MM.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce the new PF_KTHREAD flag to mark the kernel threads. It is set
by INIT_TASK() and copied to the forked childs (we could set it in
kthreadd() along with PF_NOFREEZE instead).
daemonize() was changed as well. In that case testing of PF_KTHREAD is
racy, but daemonize() is hopeless anyway.
This flag is cleared in do_execve(), before search_binary_handler().
Probably not the best place, we can do this in exec_mmap() or in
start_thread(), or clear it along with PF_FORKNOEXEC. But I think this
doesn't matter in practice, and if do_execve() fails kthread should die
soon.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ptrace_stop() has some complicated checks to prevent the scheduling in the
TASK_TRACED state with the pending SIGKILL, but these checks are racy, and
they depend on arch_ptrace_stop_needed().
This patch assumes that the traced task should die asap if it was killed by
SIGKILL, in that case schedule()->signal_pending_state() has no reason to
ignore the TASK_WAKEKILL part of TASK_TRACED, and we can kill this nasty
special case.
Note: do_exit()->ptrace_notify() is special, the killed task can already
dequeue SIGKILL at this point. Another indication that fatal_signal_pending()
is not exactly right.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch contains the following cleanups for the asm/ptrace.h
userspace headers:
- include/asm-generic/Kbuild.asm already lists ptrace.h, remove
the superfluous listings in the Kbuild files of the following
architectures:
- cris
- frv
- powerpc
- x86
- don't expose function prototypes and macros to userspace:
- arm
- blackfin
- cris
- mn10300
- parisc
- remove #ifdef CONFIG_'s around #define's:
- blackfin
- m68knommu
- sh: AFAIK __SH5__ should work in both kernel and userspace,
no need to leak CONFIG_SUPERH64 to userspace
- xtensa: cosmetical change to remove empty
#ifndef __ASSEMBLY__ #else #endif
from the userspace headers
Not changed by this patch is the fact that the following architectures
have a different struct pt_regs depending on CONFIG_ variables:
- h8300
- m68knommu
- mips
This does not work in userspace.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: <linux-arch@vger.kernel.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Chris Zankel <chris@zankel.net>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add an interface to set limit. This is necessary to memory resource
controller because it shrinks usage at set limit.
Other controllers may not need this interface to shrink usage because
shrinking is not necessary or impossible.
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A new call, mem_cgroup_shrink_usage() is added for shmem handling and
relacing non-standard usage of mem_cgroup_charge/uncharge.
Now, shmem calls mem_cgroup_charge() just for reclaim some pages from
mem_cgroup. In general, shmem is used by some process group and not for
global resource (like file caches). So, it's reasonable to reclaim pages
from mem_cgroup where shmem is mainly used.
[hugh@veritas.com: shmem_getpage release page sooner]
[hugh@veritas.com: mem_cgroup_shrink_usage css_put]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: Paul Menage <menage@google.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch changes page migration under memory controller to use a
different algorithm. (thanks to Christoph for new idea.)
Before:
- page_cgroup is migrated from an old page to a new page.
After:
- a new page is accounted , no reuse of page_cgroup.
Pros:
- We can avoid compliated lock depndencies and races in migration.
Cons:
- new param to mem_cgroup_charge_common().
- mem_cgroup_getref() is added for handling ref_cnt ping-pong.
This version simplifies complicated lock dependency in page migraiton
under memory resource controller.
new refcnt sequence is following.
a mapped page:
prepage_migration() ..... +1 to NEW page
try_to_unmap() ..... all refs to OLD page is gone.
move_pages() ..... +1 to NEW page if page cache.
remap... ..... all refs from *map* is added to NEW one.
end_migration() ..... -1 to New page.
page's mapcount + (page_is_cache) refs are added to NEW one.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
cgroup_clone creates a new cgroup with the pid of the task. This works
correctly for unshare, but for clone cgroup_clone is called from
copy_namespaces inside copy_process, which happens before the new pid is
created. As a result, the new cgroup was created with current's pid.
This patch:
1. Moves the call inside copy_process to after the new pid
is created
2. Passes the struct pid into ns_cgroup_clone (as it is not
yet attached to the task)
3. Passes a name from ns_cgroup_clone() into cgroup_clone()
so as to keep cgroup_clone() itself simpler
4. Uses pid_vnr() to get the process id value, so that the
pid used to name the new cgroup is always the pid as it
would be known to the task which did the cloning or
unsharing. I think that is the most intuitive thing to
do. This way, task t1 does clone(CLONE_NEWPID) to get
t2, which does clone(CLONE_NEWPID) to get t3, then the
cgroup for t3 will be named for the pid by which t2 knows
t3.
(Thanks to Dan Smith for finding the main bug)
Changelog:
June 11: Incorporate Paul Menage's feedback: don't pass
NULL to ns_cgroup_clone from unshare, and reduce
patch size by using 'nodename' in cgroup_clone.
June 10: Original version
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge Hallyn <serge@us.ibm.com>
Acked-by: Paul Menage <menage@google.com>
Tested-by: Dan Smith <danms@us.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently res_counter_write() is a raw file handler even though it's
ultimately taking a number, since in some cases it wants to
pre-process the string when converting it to a number.
This patch converts res_counter_write() from a raw file handler to a
write_string() handler; this allows some of the boilerplate
copying/locking/checking to be removed, and simplies the cleanup path,
since these functions are now performed by the cgroups framework.
[lizf@cn.fujitsu.com: build fix]
Signed-off-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch contains cleanups suggested by reviewers for the recent
write_string() patchset:
- pair cgroup_lock_live_group() with cgroup_unlock() in cgroup.c for
clarity, rather than directly unlocking cgroup_mutex.
- make the return type of cgroup_lock_live_group() a bool
- use a #define'd constant for the local buffer size in read/write functions
Signed-off-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adds cgroup_release_agent_write() and cgroup_release_agent_show()
methods to handle writing/reading the path to a cgroup hierarchy's
release agent. As a result, cgroup_common_file_read() is now unnecessary.
As part of the change, a previously-tolerated race in
cgroup_release_agent() is avoided by copying the current
release_agent_path prior to calling call_usermode_helper().
Signed-off-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds a write_string() method for cgroups control files. The
semantics are that a buffer is copied from userspace to kernelspace
and the handler function invoked on that buffer. The buffer is
guaranteed to be nul-terminated, and no longer than max_write_len
(defaulting to 64 bytes if unspecified). Later patches will convert
existing raw file write handlers in control group subsystems to use
this method.
Signed-off-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Balbir Singh <balbir@in.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes some extraneous spaces from method declarations in
struct cftype, to fit in with conventional kernel style.
Signed-off-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ignoring their return values may result in counter underflow in the future -
when the value charged will be uncharged (or in "leaks" - when the value is
not uncharged).
This also prevents from using charging routines to decrement the
counter value (i.e. uncharge it) ;)
(Current code works OK with res_counter, however :) )
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sometimes it may be useful for userspace to know (e.g. for some hosting
guys) that some user stopped exceeding his hardlimit or softlimit in
quotas. Implement sending of such events to userspace via quota netlink
protocol so that they don't have to poll for such events. Based on idea
and initial implementation by Vladislav Bogdanov.
Cc: Vladislav Bogdanov <slava@nsys.by>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move declarations of some macros, which should be in fact functions to
quotaops.h. This way they can be later converted to inline functions
because we can now use declarations from quota.h. Also add necessary
includes of quotaops.h to a few files.
[akpm@linux-foundation.org: fix JFS build]
[akpm@linux-foundation.org: fix UFS build]
[vegard.nossum@gmail.com: fix QUOTA=n build]
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Arjen Pool <arjenpool@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make loop in sync_dquots() checking whether there's something to write
more readable, remove useless variable and macro info_any_dirty() which
is used only in this place.
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: "Vegard Nossum" <vegard.nossum@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cleanup quotaops.h: Rename functions from uppercase to lowercase (and
define backward compatibility macros), move larger functions to dquot.c
and make them non-inline.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Provide a new mount option ("tz=UTC") for DOS (vfat/msdos) filesystems,
allowing timestamps to be in coordinated universal time (UTC) rather than
local time in applications where doing this is advantageous.
In particular, portable devices that use fat/vfat (such as digital
cameras) can benefit from using UTC in their internal clocks, thus
avoiding daylight saving time errors and general time ambiguity issues.
The user of the device does not have to worry about changing the time when
moving from place or when daylight saving changes.
The new mount option, when set, disables the counter-adjustment that Linux
currently makes to FAT timestamp info in anticipation of the normal
userspace time zone correction. When used in this new mode, all daylight
saving time and time zone handling is done in userspace as is normal for
many other filesystems (like ext3). The default mode, which remains
unchanged, is still appropriate when mounting volumes written in Windows
(because of its use of local time).
I originally based this patch on one submitted last year by Paul Collins,
but I updated it to work with current source and changed variable/option
naming. Ogawa Hirofumi (who maintains these filesystems) and I discussed
this patch at length on lkml, and he suggested using the option name in
the attached version of the patch. Barry Bouwsma pointed out a good
addition to the patch as well.
Signed-off-by: Joe Peterson <joe@skyrush.com>
Signed-off-by: Paul Collins <paul@ondioline.org>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Barry Bouwsma <free_beer_for_all@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The kernel struct dirent{,64} were different from the ones in
userspace.
Even worse, we exported the kernel ones to userspace.
But after the fat usages are fixed we can remove the conflicting
kernel versions.
Reviewed-by: H. Peter Anvin <hpa@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It has been impossible to set the option 'atari' of the MSDOS filesystem
for several years. Since nobody seems to have missed it, let's remove its
remains.
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
"struct dirent" is a kernel type here, but is a **different type** in
userspace! This means both the structure and the IOCTL number is wrong!
So, this adds new "struct __fat_dirent" to generate correct IOCTL number.
And kernel stuff moves to under __KERNEL__.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
j_commit_lock is a semaphore but uses it as if it were a mutex. This patch
converts it to a mutex.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Edward Shishkin <edward.shishkin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
j_flush_sem is a semaphore but uses it as if it were a mutex. This patch
converts it to a mutex.
[akpm@linux-foundation.org: fix mutex_trylock retval treatment]
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Edward Shishkin <edward.shishkin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
j_lock is a semaphore but uses it as if it were a mutex. This patch converts
it to a mutex.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Edward Shishkin <edward.shishkin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While fixing CONFIG_ leakages to the userspace kernel headers I ran into
CODA_FS_OLD_API.
After five years, are there still people using the old API left?
Especially considering that you have to choose at compile time which API
to support in the kernel (and distributions tend to offer the new API for
some time).
Jan: "The old API can definitely go. Around the time the new
interface went in there were some non-Coda userspace file system
implementations that took a while longer to convert to the new API,
but by now they all switched to the new interface or in some cases
to a FUSE-based solution."
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the orphan node list includes valid, untruncatable nodes with nlink > 0
the ext3_orphan_cleanup loop which attempts to delete them will not do so,
causing it to loop forever. Fix by checking for such nodes in the
ext3_orphan_get function.
This patch fixes the second case (image hdb.20000009.softlockup.gz)
reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: printk warning fix]
Signed-off-by: Duane Griffin <duaneg@dghda.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix typo in Hurd part of include/linux/ext2_fs.h
The ';' here is redundant or can even pose problem. This is actually not
used by the Linux kernel, but it is exposed in GNU/Hurd.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds a driver supporting a family of I2C port expanders from Maxim,
which includes the MAX7319 and MAX7320-7327 chips.
[dbrownell@users.sourceforge.net: minor fixes]
Signed-off-by: Jack Ren <jack.ren@marvell.com>
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds functionality to the gpio-lib subsystem to make it
possible to enable the gpio-lib code even if the architecture code didn't
request to get it built in.
The archtitecture code does still need to implement the gpiolib accessor
functions in its asm/gpio.h file. This patch adds the implementations for
x86 and PPC.
With these changes it is possible to run generic GPIO expansion cards on
every architecture that implements the trivial wrapper functions. Support
for more architectures can easily be added.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: David Brownell <david-b@pacbell.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Samuel Ortiz <sameo@openedhand.com>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Teach the mcp23s08 driver about a curious feature of these chips: up to
four of them can share the same chipselect, with the SPI signals wired in
parallel, by matching two bits in the first protocol byte against two
address lines on the chip.
This is handled by three software changes:
* Platform data now holds an array of per-chip structs, not
just one chip's address and pullup configuration.
* Probe() and remove() now use another level of structure,
wrapping an instance of the original structure for each
mcp23s08 chip sharing that chipselect.
* The HAEN bit is set, so that the hardware address bits can no
longer be ignored (boot firmware may not have enabled them).
The "one struct per chip" preserves the guts of the current code,
but platform_data will need minor changes.
OLD:
/* incorrect "slave" ID may not have mattered */
.slave = 3,
.pullups = BIT(3) | BIT(1) | BIT(0),
NEW:
/* slave address _must_ match chip's wiring */
.chip[3] = {
.is_present = true,
.pullups = BIT(3) | BIT(1) | BIT(0),
},
There's no change in how things _behave_ for spi_device nodes with a
single mcp23s08 chip. New multi-chip configurations assign GPIOs in
sequence, without holes. The spi_device just resembles a bigger
controller, but internally it has multiple gpio_chip instances.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds a simple sysfs interface for GPIOs.
/sys/class/gpio
/export ... asks the kernel to export a GPIO to userspace
/unexport ... to return a GPIO to the kernel
/gpioN ... for each exported GPIO #N
/value ... always readable, writes fail for input GPIOs
/direction ... r/w as: in, out (default low); write high, low
/gpiochipN ... for each gpiochip; #N is its first GPIO
/base ... (r/o) same as N
/label ... (r/o) descriptive, not necessarily unique
/ngpio ... (r/o) number of GPIOs; numbered N .. N+(ngpio - 1)
GPIOs claimed by kernel code may be exported by its owner using a new
gpio_export() call, which should be most useful for driver debugging.
Such exports may optionally be done without a "direction" attribute.
Userspace may ask to take over a GPIO by writing to a sysfs control file,
helping to cope with incomplete board support or other "one-off"
requirements that don't merit full kernel support:
echo 23 > /sys/class/gpio/export
... will gpio_request(23, "sysfs") and gpio_export(23);
use /sys/class/gpio/gpio-23/direction to (re)configure it,
when that GPIO can be used as both input and output.
echo 23 > /sys/class/gpio/unexport
... will gpio_free(23), when it was exported as above
The extra D-space footprint is a few hundred bytes, except for the sysfs
resources associated with each exported GPIO. The additional I-space
footprint is about two thirds of the current size of gpiolib (!). Since
no /dev node creation is involved, no "udev" support is needed.
Related changes:
* This adds a device pointer to "struct gpio_chip". When GPIO
providers initialize that, sysfs gpio class devices become children of
that device instead of being "virtual" devices.
* The (few) gpio_chip providers which have such a device node have
been updated.
* Some gpio_chip drivers also needed to update their module "owner"
field ... for which missing kerneldoc was added.
* Some gpio_chips don't support input GPIOs. Those GPIOs are now
flagged appropriately when the chip is registered.
Based on previous patches, and discussion both on and off LKML.
A Documentation/ABI/testing/sysfs-gpio update is ready to submit once this
merges to mainline.
[akpm@linux-foundation.org: a few maintenance build fixes]
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Cc: Greg KH <greg@kroah.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently list of kretprobe instances are stored in kretprobe object (as
used_instances,free_instances) and in kretprobe hash table. We have one
global kretprobe lock to serialise the access to these lists. This causes
only one kretprobe handler to execute at a time. Hence affects system
performance, particularly on SMP systems and when return probe is set on
lot of functions (like on all systemcalls).
Solution proposed here gives fine-grain locks that performs better on SMP
system compared to present kretprobe implementation.
Solution:
1) Instead of having one global lock to protect kretprobe instances
present in kretprobe object and kretprobe hash table. We will have
two locks, one lock for protecting kretprobe hash table and another
lock for kretporbe object.
2) We hold lock present in kretprobe object while we modify kretprobe
instance in kretprobe object and we hold per-hash-list lock while
modifying kretprobe instances present in that hash list. To prevent
deadlock, we never grab a per-hash-list lock while holding a kretprobe
lock.
3) We can remove used_instances from struct kretprobe, as we can
track used instances of kretprobe instances using kretprobe hash
table.
Time duration for kernel compilation ("make -j 8") on a 8-way ppc64 system
with return probes set on all systemcalls looks like this.
cacheline non-cacheline Un-patched kernel
aligned patch aligned patch
===============================================================================
real 9m46.784s 9m54.412s 10m2.450s
user 40m5.715s 40m7.142s 40m4.273s
sys 2m57.754s 2m58.583s 3m17.430s
===========================================================
Time duration for kernel compilation ("make -j 8) on the same system, when
kernel is not probed.
=========================
real 9m26.389s
user 40m8.775s
sys 2m7.283s
=========================
Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for adding the GPIO based I2C resources.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <apatard@mandriva.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The SM501 PCI card requires a dyanmic gpio allocation as the number of
cards is not known at compile time. Fixup the platform data and
registration to deal with this.
Acked-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Arnaud Patard <apatard@mandriva.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for exporting the GPIOs on the SM501 via gpiolib.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <apatard@mandriva.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add callback to get or set the power control if the device has the sleep
connected to some form of GPIO.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <apatard@mandriva.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All ratelimit user use same jiffies and burst params, so some messages
(callbacks) will be lost.
For example:
a call printk_ratelimit(5 * HZ, 1)
b call printk_ratelimit(5 * HZ, 1) before the 5*HZ timeout of a, then b will
will be supressed.
- rewrite __ratelimit, and use a ratelimit_state as parameter. Thanks for
hints from andrew.
- Add WARN_ON_RATELIMIT, update rcupreempt.h
- remove __printk_ratelimit
- use __ratelimit in net_ratelimit
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the %p format string which already accounts for the padding you need
with a pointer type on a particular architecture.
Also replace the macro with a static inline function to match the rest of
the file.
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a WARN() macro that acts like WARN_ON(), with the added feature that it
takes a printk like argument that is printed as part of the warning message.
[akpm@linux-foundation.org: fix printk arguments]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We want to use WARN() as a variant of WARN_ON(), however a few drivers are
using WARN() internally. This patch renames these to WARNING() to avoid the
namespace clash. A few cases were defining but not using the thing, for those
cases I just deleted the definition.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Greg KH <greg@kroah.com>
Cc: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove apparently obsolete content from init.h referring to gcc 2.9x
and to "no_module_init".
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We duplicate alloc/free_thread_info defines on many platforms (the
majority uses __get_free_pages/free_pages). This patch defines common
defines and removes these duplicated defines.
__HAVE_ARCH_THREAD_INFO_ALLOCATOR is introduced for platforms that do
something different.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Presently call_usermodehelper_setup() uses GFP_ATOMIC. but it can return
NULL _very_ easily.
GFP_ATOMIC is needed only when we can't sleep. and, GFP_KERNEL is robust
and better.
thus, I add gfp_mask argument to call_usermodehelper_setup().
So, its callers pass the gfp_t as below:
call_usermodehelper() and call_usermodehelper_keys():
depend on 'wait' argument.
call_usermodehelper_pipe():
always GFP_KERNEL because always run under process context.
orderly_poweroff():
pass to GFP_ATOMIC because may run under interrupt context.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: "Paul Menage" <menage@google.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Several compilers offer "long long" without claiming to support C99.
Considering how frequent __s64/__u64 are used our userspace headers are
anyway unusable without __s64/__u64 available.
Always offer __s64/__u64 to non-gcc non-C99 compilers - if they provide
"long long" that makes the headers compiling and if they don't they are
anyway screwed.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Build kernel/profile.o only if CONFIG_PROFILING is enabled.
This makes CONFIG_PROFILING=n kernels smaller.
As a bonus, some profile_tick() calls and one branch from schedule() are
now eliminated with CONFIG_PROFILING=n (but I doubt these are
measurable effects).
This patch changes the effects of CONFIG_PROFILING=n, but I don't think
having more than two choices would be the better choice.
This patch also adds the name of the first parameter to the prototypes
of profile_{hits,tick}() since I anyway had to add them for the dummy
functions.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the conditional surrounding the definition of list_add() from list.h
since, if you define CONFIG_DEBUG_LIST, the definition you will subsequently
pick up from lib/list_debug.c will be absolutely identical, at which point you
can remove that redundant definition from list_debug.c as well.
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This header file has been unused for quite some time, and the
corresponding source files appear to have been removed back in commit
99eb8a550d ("Remove the arm26 port")
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: Ian Molton <spyro@f2s.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There haave been several areas in the kernel where an int has been used for
flags in local_irq_save() and friends instead of a long. This can cause some
hard to debug problems on some architectures.
This patch adds a typecheck inside the irqsave and restore functions to flag
these cases.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Needed to fix up a recursive include snafu in
locking-add-typecheck-on-irqsave-and-friends-for-correct-flags.patch
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add fields for VLAN tag and insert VLAN tag flag to the control
section struct. These fields will be used for sending ethernet
packets.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Changeset 7fa897b91a ("ide: trivial sparse
annotations") created an IDE bootup regression on big-endian systems.
In drivers/ide/ide-iops.c, function ide_fixstring() we now have the
loop:
for (p = end ; p != s;)
be16_to_cpus((u16 *)(p -= 2));
which will never terminate on big-endian because in such
a configuration be16_to_cpus() evaluates to "do { } while (0)"
Therefore, always evaluate the arguments to nop endian transformation
operations.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch enables NAND subpage read functionality.
If upper layer drivers are requesting to read non page aligned data NAND
subpage-read functionality reads the only whose ECC regions which include
requested data when original code reads whole page.
This significantly improves performance in many cases.
Here are some digits :
UBI volume mount time
No subpage reads: 5.75 seconds
Subpage read patch: 2.42 seconds
Open/stat time for files on JFFS2 volume:
No subpage read 0m 5.36s
Subpage read 0m 2.88s
Signed-off-by Alexey Korolev <akorolev@infradead.org>
Acked-by: Artem Bityutskiy <dedekind@infradead.org>
Acked-by: Jörn Engel <joern@logfs.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Isochronous reception in dualbuffer mode is reportedly broken with
TI TSB43AB22A on x86-64. Descriptor addresses above 2G have been
determined as the trigger:
https://bugzilla.redhat.com/show_bug.cgi?id=435550
Two fixes are possible:
- pci_set_consistent_dma_mask(pdev, DMA_31BIT_MASK);
at least when IR descriptors are allocated, or
- simply don't use dualbuffer.
This fix implements the latter workaround.
But we keep using dualbuffer on x86-32 which won't give us highmen (and
thus physical addresses outside the 31bit range) in coherent DMA memory
allocations. Right now we could for example also whitelist PPC32, but
DMA mapping implementation details are expected to change there.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
This patch merges the IPv4/IPv6 IPComp implementations since most
of the code is identical. As a result future enhancements will no
longer need to be duplicated.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
signalfd4, eventfd2, epoll_create1, dup3, pipe2 and inotify_init1
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is a large patch but the normal code path is not affected. For
non-pSeries platforms the code is ifdef'ed out and for non-CMO enabled
pSeries systems this does not affect the normal code path. Devices that
do not perform DMA operations do not need modification with this patch.
The function get_desired_dma was renamed from get_io_entitlement for
clarity.
Overview
Cooperative Memory Overcommitment (CMO) allows for a set of OS partitions
to be run with less RAM than the aggregate needs of the group of
partitions. The firmware will balance memory between the partitions
and page in/out memory as needed. Based on the number and type of IO
adpaters preset each partition is allocated an amount of memory for
DMA operations and this allocation will be guaranteed to the partition;
this is referred to as the partition's 'entitlement'.
Partitions running in a CMO environment can only have virtual IO devices
present. The VIO bus layer will manage the IO entitlement for the system.
Accounting, at a system and per-device level, is tracked in the VIO bus
code and exposed via sysfs. A set of dma_ops functions are added to
the bus to allow for this accounting.
Bus initialization
At initialization, the bus will calculate the minimum needs of the system
based on providing each device present with a standard minimum entitlement
along with a spare allocation for the bus to handle hotplug events.
If the minimum needs can not be met the system boot will be halted.
Device changes
The significant changes for devices while running under CMO are that the
devices must specify how much dedicated IO entitlement they desire and
must also handle DMA mapping errors that can occur due to constrained
IO memory. The virtual IO drivers are modified to silence errors when
DMA mappings fail for CMO and handle these failures gracefully.
Each devices will be guaranteed a minimum entitlement that can always
be mapped. Devices will specify how much entitlement they desire and
the VIO bus will attempt to provide for this. Devices can change their
desired entitlement level at any point in time to address particular needs
(via vio_cmo_set_dev_desired()), not just at device probe time.
VIO bus changes
The system will have a particular entitlement level available from which
it can provide memory to the devices. The bus defines two pools of memory
within this entitlement, the reserved and excess pools. Each device is
provided with it's own entitlement no less than a system defined minimum
entitlement and no greater than what the device has specified as it's
desired entitlement. The entitlement provided to devices comes from the
reserve pool. The reserve pool can also contain a spare allocation as
large as the system defined minimum entitlement which is used for device
hotplug events. Any entitlement not needed to fulfill the needs of a
reserve pool is placed in the excess pool. Each device is guaranteed
that it can map up to it's entitled level; additional mapping are possible
as long as there is unmapped memory in the excess pool.
Bus probe
As the system starts, each device is given an entitlement equal only
to the system defined minimum entitlement. The reserve pool is equal
to the sum of these entitlements, plus a spare allocation. The VIO bus
also tracks the aggregate desired entitlement of all the devices. If the
system desired entitlement is greater than the size of the reserve pool,
when devices unmap IO memory it will be reserved and a balance operation
will be scheduled for some time in the future.
Entitlement balancing
The balance function tries to fairly distribute entitlement between the
devices in the system with the goal of providing each device with it's
desired amount of entitlement. Devices using more than what would be
ideal will have their entitled set-point adjusted; this will effectively
set a goal for lower IO memory usage as future mappings can fail and
deallocations will trigger a balance operation to distribute the newly
unmapped memory. A fair distribution of entitlement can take several
balance operations to achieve. Entitlement changes and device DLPAR
events will alter the state of CMO and will trigger balance operations.
Hotplug events
The VIO bus allows for changes in system entitlement at run-time via
'vio_cmo_entitlement_update()'. When devices are added the hotplug
device event will be preceded by a system entitlement increase and this
is reversed when devices are removed.
The following changes are made that the VIO bus layer for CMO:
* add IO memory accounting per device structure.
* add IO memory entitlement query function to driver structure.
* during vio bus probe, if CMO is enabled, check that driver has
memory entitlement query function defined. Fail if function not defined.
* fail to register driver if io entitlement function not defined.
* create set of dma_ops at vio level for CMO that will track allocations
and return DMA failures once entitlement is reached. Entitlement will
limited by overall system entitlement. Devices will have a reserved
quantity of memory that is guaranteed, the rest can be used as available.
* expose entitlement, current allocation, desired allocation, and the
allocation error counter for devices to the user through sysfs
* provide mechanism for changing a device's desired entitlement at run time
for devices as an exported function and sysfs tunable
* track any DMA failures for entitled IO memory for each vio device.
* check entitlement against available system entitlement on device add
* track entitlement metrics (high water mark, current usage)
* provide function to reset high water mark
* provide minimum and desired entitlement numbers at a bus level
* provide drivers with a minimum guaranteed entitlement
* balance available entitlement between devices to satisfy their needs
* handle system entitlement changes and device hotplug
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To support Cooperative Memory Overcommitment (CMO), we need to check
for failure from some of the tce hcalls.
These changes for the pseries platform affect the powerpc architecture;
patches for the other affected platforms are included in this patch.
pSeries platform IOMMU code changes:
* platform TCE functions must handle H_NOT_ENOUGH_RESOURCES errors and
return an error.
Architecture IOMMU code changes:
* Calls to ppc_md.tce_build need to check return values and return
DMA_MAPPING_ERROR for transient errors.
Architecture changes:
* struct machdep_calls for tce_build*_pSeriesLP functions need to change
to indicate failure.
* all other platforms will need updates to iommu functions to match the new
calling semantics; they will return 0 on success. The other platforms
default configs have been built, but no further testing was performed.
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With the addition of Cooperative Memory Overcommitment (CMO) support
for IBM Power Systems, two fields have been added to the VPA to report
paging statistics. Add support in lparcfg to report them to userspace.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Newer versions of firmware support page states, which are used by the
collaborative memory manager (future patch) to "loan" pages to the
hypervisor for use by other partitions.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For Cooperative Memory Overcommitment (CMO), set the FW_FEATURE_CMO
flag in powerpc_firmware_features from the rtas ibm,get-system-parameters
table prior to calling iommu_init_early_pSeries.
With this, any CMO specific functionality can be controlled by checking:
firmware_has_feature(FW_FEATURE_CMO)
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Update /proc/ppc64/lparcfg to display Cooperative Memory
Overcommitment statistics as reported by the H_GET_MPP hcall. This
also updates the lparcfg interface to allow setting memory entitlement
and weight.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch implements support for HW based watchpoint via the
DBSR_DAC (Data Address Compare) facility of the BookE processors.
It does so by interfacing with the existing DABR breakpoint code
and adding the necessary bits and pieces for the new bits to
be properly set or cleared
Signed-off-by: Luis Machado <luisgpm@br.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Stash the first platform string matched by identify_cpu() in
powerpc_base_platform, and supply that to the ELF loader for the value
of AT_BASE_PLATFORM.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Some IBM POWER-based platforms have the ability to run in a
mode which mostly appears to the OS as a different processor from the
actual hardware. For example, a Power6 system may appear to be a
Power5+, which makes the AT_PLATFORM value "power5+". This means that
programs are restricted to the ISA supported by Power5+;
Power6-specific instructions are treated as illegal.
However, some applications (virtual machines, optimized libraries) can
benefit from knowledge of the underlying CPU model. A new aux vector
entry, AT_BASE_PLATFORM, will denote the actual hardware. For
example, on a Power6 system in Power5+ compatibility mode, AT_PLATFORM
will be "power5+" and AT_BASE_PLATFORM will be "power6". The idea is
that AT_PLATFORM indicates the instruction set supported, while
AT_BASE_PLATFORM indicates the underlying microarchitecture.
If the architecture has defined ELF_BASE_PLATFORM, copy that value to
the user stack in the same manner as ELF_PLATFORM.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
virtio: Add transport feature handling stub for virtio_ring.
virtio: Rename set_features to finalize_features
virtio: Formally reserve bits 28-31 to be 'transport' features.
s390: use virtio_console for KVM on s390
virtio: console as a config option
virtio_console: use virtqueue notification for hvc_console
hvc_console: rework setup to replace irq functions with callbacks
virtio_blk: check for hardsector size from host
virtio: Use bus_type probe and remove methods
virtio: don't always force a notification when ring is full
virtio: clarify that ABI is usable by any implementations
virtio: Recycle unused recv buffer pages for large skbs in net driver
virtio net: Allow receiving SG packets
virtio net: Add ethtool ops for SG/GSO
virtio: fix virtio_net xmit of freed skb bug
Obvious misc patch been in my queue (& linux-next) for over a cycle.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To prepare for virtio_ring transport feature bits, hook in a call in
all the users to manipulate them. This currently just clears all the
bits, since it doesn't understand any features.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Rather than explicitly handing the features to the lower-level, we just
hand the virtio_device and have it set the features. This make it clear
that it has the chance to manipulate the features of the device at this
point (and that all feature negotiation is already done).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We assign feature bits as required, but it makes sense to reserve some
for the particular transport, rather than the particular device.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This patch enables virtio_console as the default console on kvm for
s390. We currently use the same notify hack as lguest for early
console output. I will try to address this for lguest and s390 later.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Currently virtio_blk assumes a 512 byte hard sector size. This can cause
trouble / performance issues if the backing has a different block size
(like a file on an ext3 file system formatted with 4k block size or a dasd).
Lets add a feature flag that tells the guest to use a different hard sector
size than 512 byte.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We want others to implement and use virtio, so it makes sense to BSD
license the non-__KERNEL__ parts of the headers to make this crystal
clear.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Mark McLoughlin <markmc@redhat.com>
Acked-by: Ryan Harper <ryanh@us.ibm.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
Suresh Siddha wants to fix a possible FPU leakage in error conditions,
but the fact that save/restore_i387() are inlines in a header file makes
that harder to do than necessary. So start off with an obvious cleanup.
This just moves the x86-64 version of save/restore_i387() out of the
header file, and moves it to the only file that it is actually used in:
arch/x86/kernel/signal_64.c. So exposing it in a header file was wrong
to begin with.
[ Side note: I'd like to fix up some of the games we play with the
32-bit version of these functions too, but that's a separate
matter. The 32-bit versions are shared - under different names
at that! - by both the native x86-32 code and the x86-64 32-bit
compatibility code ]
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits)
ide: use proper printk() KERN_* levels in ide-probe.c
ide: fix for EATA SCSI HBA in ATA emulating mode
ide: remove stale comments from drivers/ide/Makefile
ide: enable local IRQs in all handlers for TASKFILE_NO_DATA data phase
ide-scsi: remove kmalloced struct request
ht6560b: remove old history
ht6560b: update email address
ide-cd: fix oops when using growisofs
gayle: release resources on ide_host_add() failure
palm_bk3710: add UltraDMA/100 support
ide: trivial sparse annotations
ide: ide-tape.c sparse annotations and unaligned access removal
ide: drop 'name' parameter from ->init_chipset method
ide: prefix messages from IDE PCI host drivers by driver name
it821x: remove DECLARE_ITE_DEV() macro
it8213: remove DECLARE_ITE_DEV() macro
ide: include PCI device name in messages from IDE PCI host drivers
ide: remove <asm/ide.h> for some archs
ide-generic: remove ide_default_{io_base,irq}() inlines (take 3)
ide-generic: is no longer needed on ppc32
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: fixup sparse endianness warnings in proc.c
PCI PM: make more PCI PM core functionality available to drivers
PCI/DMAR: don't assume presence of RMRRs
PCI hotplug: fix error path in pci_slot's register_slot
* Remove <linux/irq.h> include from <asm-ia64.h> (<linux/ide.h> includes
<linux/interrupt.h> which is enough).
* Remove <asm/ide.h> for alpha/blackfin/h8300/ia64/m32r/sh/x86/xtensa
(this leaves us with arm/frv/m68k/mips/mn10300/parisc/powerpc/sparc[64]).
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Replace ide_default_{io_base,irq}() inlines by legacy_{bases,irqs}[].
v2:
Add missing zero-ing of hws[] (caught during testing by Borislav Petkov).
v3:
Fix zero-oing of hws[] for _real_ this time.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
PPC_PREP has been depending on BROKEN for some time now.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Now that ide_hwif_t instances are allocated dynamically
the difference between MAX_HWIFS == 2 and MAX_HWIFS == 10
is ~100 bytes (x86-32) so use MAX_HWIFS == 10 on all archs
except these ones that use MAX_HWIFS == 1.
* Define MAX_HWIFS in <linux/ide.h> instead of <asm/ide.h>.
[ Please note that avr32/cris/v850 have no <asm/ide.h>
and alpha/ia64/sh always define CONFIG_IDE_MAX_HWIFS. ]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Remove <asm-cris/arch-v{10,32}/ide.h> and <asm-cris/ide.h>.
This has been a broken code for some time now and needs rewrite
to match IDE core code / host driver model anyway.
Cc: Jesper Nilsson <Jesper.Nilsson@axis.com>
Cc: Mikael Starvik <mikael.starvik@axis.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Since the decision to probe for ISA ide2-6 is now left to the user
"no_pci_devices()" quirk is no longer needed and may be removed.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Move ide_probe_legacy() call to ide_generic_init() so it fails
early if necessary and returns the proper error value (nowadays
ide_default_io_base() is used only by ide-generic).
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add 'unsigned long host_flags' field to struct ide_host.
* Set ->host_flags in ide_host_alloc_all().
* Always set PCI dev's ->driver_data in ide_pci_init_{one,two}().
* Add ide_pci_remove() helper (the default implementation for
struct pci_driver's ->remove method).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add 'struct ide_host *host' field to ide_hwif_t and set it
in ide_host_alloc_all().
* Add ide_device_{get,put}() helpers loosely based on SCSI's
scsi_device_{get,put}() ones.
* Convert IDE device drivers to use ide_device_{get,put}().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add 'struct device *dev[2]' and 'void *host_priv' fields
to struct ide_host.
* Set ->dev[] in ide_host_alloc_all()/ide_setup_pci_device[s]().
* Pass 'void *priv' argument to ide_setup_pci_device[s]()
and use it to set ->host_priv.
* Set PCI dev's ->driver_data to point to the struct ide_host
instance if PCI host driver wants to use ->host_priv.
* Rename ide_setup_pci_device[s]() to ide_pci_init_{one,two}().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
MAINTAINERS: Remove Glenn Streiff from NetEffect entry
mlx4_core: Improve error message when not enough UAR pages are available
IB/mlx4: Add support for memory management extensions and local DMA L_Key
IB/mthca: Keep free count for MTT buddy allocator
mlx4_core: Keep free count for MTT buddy allocator
mlx4_code: Add missing FW status return code
IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_seg
mlx4_core: Add module parameter to enable QoS support
RDMA/iwcm: Remove IB_ACCESS_LOCAL_WRITE from remote QP attributes
IPoIB: Include err code in trace message for ib_sa_path_rec_get() failures
IB/sa_query: Check if sm_ah is NULL in ib_sa_remove_one()
IB/ehca: Release mutex in error path of alloc_small_queue_page()
IB/ehca: Use default value for Local CA ACK Delay if FW returns 0
IB/ehca: Filter PATH_MIG events if QP was never armed
IB/iser: Add support for RDMA_CM_EVENT_ADDR_CHANGE event
RDMA/cma: Add RDMA_CM_EVENT_TIMEWAIT_EXIT event
RDMA/cma: Add RDMA_CM_EVENT_ADDR_CHANGE event
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
nohz: adjust tick_nohz_stop_sched_tick() call of s390 as well
nohz: prevent tick stop outside of the idle loop
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix header export, asm-x86/processor-flags.h, CONFIG_* leaks
x86: BUILD_IRQ say .text to avoid .data.percpu
xen: don't use sysret for sysexit32
x86: call early_cpu_init at the same point
* 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc:
Remove __DECLARE_SEMAPHORE_GENERIC
Remove asm/semaphore.h
Remove use of asm/semaphore.h
Add missing semaphore.h includes
Remove mention of semaphores from kernel-locking
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68knommu: put ColdFire head code into .text.head section
m68knommu: remove last use of CONFIG_FADS and CONFIG_RPXCLASSIC
m68knommu: remove RPXCLASSIC from the m68k tree
m68knommu: fec: remove FADS
m68knommu: MCF5307 PIT GENERIC_CLOCKEVENTS support
m68knommu: add read_barrier_depends() and irqs_disabled_flags()
m68knommu: add byteswap assembly opcode for ISA A+
m68knommu: add ffs and __ffs plattform which support ISA A+ or ISA C
m68knommu: add sched_clock() for the DMA timer
m68knommu: complete generic time
m68knommu: move code within time.c
m68knommu: m68knommu: add old stack trace method
m68knommu: Add Coldfire DMA Timer support
m68knommu: defconfig for M5407C3 board
m68knommu: defconfig for M5307C3 board
m68knommu: defconfig for M5275EVB board
m68knommu: defconfig for M5249EVB board
m68knommu: change to a configs directory for board configurations
* 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds:
leds: Ensure led->trigger is set earlier
leds: Add support for Philips PCA955x I2C LED drivers
leds: Fix sparse warnings in leds-h1940 driver
leds: mark led_classdev.default_trigger as const
leds: fix unsigned value overflow in atmel pwm driver
leds: Add pca9532 platform data for Thecus N2100
leds: Add pca9532 led driver
This patch adds a "const" to the parser token table. I've done an
allmodconfig build to see if this produces any warnings/failures and the
patch includes a fix for the only warning that was produced.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Alexander Viro <aviro@redhat.com>
Acked-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, linux/major.h defines a GRAPHDEV_MAJOR (29) that nobody uses,
and linux/fb.h defines the real FB_MAJOR (also 29), that only fbmem.c
needs. Drop GRAPHDEV_MAJOR from major.h, move FB_MAJOR definition from
fb.h to major.h, and fix fbmem.c to use major.h's definition.
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds a platform driver using the ATMEL PWM driver to control a
backlight which requires a PWM signal and optional GPIO signal for discrete
on/off signal. It has been tested on Favr-32 board from EarthLCD.
The driver is configurable by supplying a struct with the platform data. See
the include/linux/atmel-pwm-bl.h for details.
The board code for Favr-32 will be submitted to the AVR32 kernel list.
Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the xtimings structure which only stored some values to be used
later (mostly once). Calculate and use these values in places they are
needed.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a platform_lcd driver to allow boards with simple lcd power controls
to register themselves easily.
[akpm@linux-foundation.org: build fix]
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add the lcd_device being checked to the check_fb entry of lcd_ops. This
ensures that any driver using this to check against it's own state can do
so, and also makes all the calls in lcd_ops more orthogonal in their
arguments.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Provide support for the ILI9320 display controller chip which is found in
many LCD displays. Included with this is support for an example LCD using
this chip, the VGG2432A4.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add flags to allow the driver to invert the sense of both VBIASEN and FPEN
signals comming from the SM501.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is the SuperH Mobile LCDC frame buffer driver V2, adding support for
the LCDC block found in SuperH Mobile processors. The hardware supports
up to two LCD panels per LCDC block, and both RGB and SYS interfaces can
be used to hook up LCD panels/modules.
The device driver is a regular platform driver, so LCD configuration and
board specific hooks are passed to the driver using platform data. LCD
modules using SYS interface often require special configuration using the
SYS bus, and to solve this cleanly the driver provides SYS interface
operations to the board code.
Tested on sh7723 and sh7722 processors with a SYS16A QVGA panel and WVGA
panels using RGB16 and RGB18 interfaces.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Reviewed-by: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Manage atmel_lcdfb FIFO underflow
Resetting the LCD and DMA allows to fix screen shifting after a FIFO
underflow. It follows reset sequence from errata "LCD Screen Shifting
After a Reset".
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add imageblit acceleration for the Blade3D family of cores. The code is
based on code from the cyblafb driver.
It is a step toward assimilating back the cyblafb driver into the
tridentfb driver. The cyblafb driver handles a subfamily of the Trident
Blade3d cores.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch contains general source code improvments:
- more simple functions are inline
- removes some meaningless output and the VERSION
string as it is no use
- eng_par is moved into the tridentfb_par
- removed small section of code for CyberBladeXPAi1
which is maybe right for only one resolution
and refresh rate and is probably redundant now
- other minor improvements
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch replaces deprecated constant FB_ACCELF_TEXT with
FBINFO_HWACCEL_DISABLED and adds constants for Trident families of
accelerators.
The FBINFO_HWACCEL_DISABLED is correctly used so noaccel parameter works
now.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch brings various acceleration improvements:
- set copyarea/fillrect for non-accelerated framebuffer (fix)
- remove 15 bpp depth handling to simplify code as it hardly
works (15 bpp handling was obviously missing in some switches)
- add fb_sync call and move waiting before accelerated function
to make acceleration more asynchronous to cpu (few % of speed
improvement)
- add cpu_relax() call in waiting loops
- make longer register names and name more registers
- move registers' definition to header
- general code improvements (shortening, simplifying)
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add support for TGUI 9440 chip.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Improved values for some registers after Xorg Trident driver. The main
problem was that values set by BIOS have been ignored.
This patch completely remove random pixels ("snow") on the TGUI 9680 and
9440 (not supported yet by the driver). It does not help with the "snow"
on 3DImage and Blade3D cards.
There is also small improvement in timing calculations (hblank start and
vblank start)
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Make use of functions and constants from the vga.h header to compact the code
and make it more readable.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch converts the is_blade() and is_xp() macros into local functions.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch moves flat panel indicator into tridentfb_par structure and removes
related global variables and macros.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This updates <linux/bcd.h> to define the key routines as constant
functions, which the macros will then call. Newer code can now call
bcd2bin() instead of SCREAMING BCD2BIN() TO THE FOUR WINDS.
This lets each driver shrink their codespace by using N function calls to
a single (global) copy of those routines, instead of N inlined copies of
these functions per driver.
These routines aren't used in speed-critical code. Almost all callers are
in the RTC framework. Typical per-driver savings is near 300 bytes.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Adrian Bunk <bunk@kernel.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Support the Dallas/Maxim DS1305 and DS1306 RTC chips. These use SPI, and
support alarms, NVRAM, and a trickle charger for use when their backup
power supply is a supercap or rechargeable cell.
This basic driver doesn't yet support suspend/resume or wakealarms.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove implicit use of BKL in ioctl() from the RTC framework.
Instead, the rtc->ops_lock is used. That's the same lock that already
protects the RTC operations when they're issued through the exported
rtc_*() calls in drivers/rtc/interface.c ... making this a bugfix, not
just a cleanup, since both ioctl calls and set_alarm() need to update IRQ
enable flags and that implies a common lock (which RTC drivers as a rule
do not provide on their own).
A new comment at the declaration of "struct rtc_class_ops" summarizes
current locking rules. It's not clear to me that the exceptions listed
there should exist ... if not, those are pre-existing problems which can
be fixed in a patch that doesn't relate to BKL removal.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Jonathan Corbet <corbet@lwn.net>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ioctls AUTOFS_IOC_TOGGLEREGHOST and AUTOFS_IOC_ASKREGHOST were added
several years ago but what they were intended for has never been
implemented (as far as I'm aware noone uses them) so remove them.
Signed-off-by: Ian Kent <raven@themaw.net>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the Au1550 resource table and instead extract MMIO/IRQ/DMA
resources from platform resource information like any well-behaved
platform driver.
Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Jan Nikitenko <jan.nikitenko@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently, 'modalias' in the spi_device structure is a 'const char *'.
The spi_new_device() function fills in the modalias value from a passed in
spi_board_info data block. Since it is a pointer copy, the new spi_device
remains dependent on the spi_board_info structure after the new spi_device
is registered (no other fields in spi_device directly depend on the
spi_board_info structure; all of the other data is copied).
This causes a problem when dynamically propulating the list of attached
SPI devices. For example, in arch/powerpc, the list of SPI devices can be
populated from data in the device tree. With the current code, the device
tree adapter must kmalloc() a new spi_board_info structure for each new
SPI device it finds in the device tree, and there is no simple mechanism
in place for keeping track of these allocations.
This patch changes modalias from a 'const char *' to a fixed char array.
By copying the modalias string instead of referencing it, the dependency
on the spi_board_info structure is eliminated and an outside caller does
not need to maintain a separate spi_board_info allocation for each device.
If searched through the code to the best of my ability for any references
to modalias which may be affected by this change and haven't found
anything. It has been tested with the lite5200b platform in arch/powerpc.
[dbrownell@users.sourceforge.net: cope with linux-next changes: KOBJ_NAME_LEN obliterated, etc]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds O_NONBLOCK support to pipe2. It is minimally more involved
than the patches for eventfd et.al but still trivial. The interfaces of the
create_write_pipe and create_read_pipe helper functions were changed and the
one other caller as well.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_pipe2
# ifdef __x86_64__
# define __NR_pipe2 293
# elif defined __i386__
# define __NR_pipe2 331
# else
# error "need __NR_pipe2"
# endif
#endif
int
main (void)
{
int fds[2];
if (syscall (__NR_pipe2, fds, 0) == -1)
{
puts ("pipe2(0) failed");
return 1;
}
for (int i = 0; i < 2; ++i)
{
int fl = fcntl (fds[i], F_GETFL);
if (fl == -1)
{
puts ("fcntl failed");
return 1;
}
if (fl & O_NONBLOCK)
{
printf ("pipe2(0) set non-blocking mode for fds[%d]\n", i);
return 1;
}
close (fds[i]);
}
if (syscall (__NR_pipe2, fds, O_NONBLOCK) == -1)
{
puts ("pipe2(O_NONBLOCK) failed");
return 1;
}
for (int i = 0; i < 2; ++i)
{
int fl = fcntl (fds[i], F_GETFL);
if (fl == -1)
{
puts ("fcntl failed");
return 1;
}
if ((fl & O_NONBLOCK) == 0)
{
printf ("pipe2(O_NONBLOCK) does not set non-blocking mode for fds[%d]\n", i);
return 1;
}
close (fds[i]);
}
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch introduces the new syscall inotify_init1 (note: the 1 stands for
the one parameter the syscall takes, as opposed to no parameter before). The
values accepted for this parameter are function-specific and defined in the
inotify.h header. Here the values must match the O_* flags, though. In this
patch CLOEXEC support is introduced.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_inotify_init1
# ifdef __x86_64__
# define __NR_inotify_init1 294
# elif defined __i386__
# define __NR_inotify_init1 332
# else
# error "need __NR_inotify_init1"
# endif
#endif
#define IN_CLOEXEC O_CLOEXEC
int
main (void)
{
int fd;
fd = syscall (__NR_inotify_init1, 0);
if (fd == -1)
{
puts ("inotify_init1(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("inotify_init1(0) set close-on-exit");
return 1;
}
close (fd);
fd = syscall (__NR_inotify_init1, IN_CLOEXEC);
if (fd == -1)
{
puts ("inotify_init1(IN_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("inotify_init1(O_CLOEXEC) does not set close-on-exit");
return 1;
}
close (fd);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch introduces the new syscall pipe2 which is like pipe but it also
takes an additional parameter which takes a flag value. This patch implements
the handling of O_CLOEXEC for the flag. I did not add support for the new
syscall for the architectures which have a special sys_pipe implementation. I
think the maintainers of those archs have the chance to go with the unified
implementation but that's up to them.
The implementation introduces do_pipe_flags. I did that instead of changing
all callers of do_pipe because some of the callers are written in assembler.
I would probably screw up changing the assembly code. To avoid breaking code
do_pipe is now a small wrapper around do_pipe_flags. Once all callers are
changed over to do_pipe_flags the old do_pipe function can be removed.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_pipe2
# ifdef __x86_64__
# define __NR_pipe2 293
# elif defined __i386__
# define __NR_pipe2 331
# else
# error "need __NR_pipe2"
# endif
#endif
int
main (void)
{
int fd[2];
if (syscall (__NR_pipe2, fd, 0) != 0)
{
puts ("pipe2(0) failed");
return 1;
}
for (int i = 0; i < 2; ++i)
{
int coe = fcntl (fd[i], F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
printf ("pipe2(0) set close-on-exit for fd[%d]\n", i);
return 1;
}
}
close (fd[0]);
close (fd[1]);
if (syscall (__NR_pipe2, fd, O_CLOEXEC) != 0)
{
puts ("pipe2(O_CLOEXEC) failed");
return 1;
}
for (int i = 0; i < 2; ++i)
{
int coe = fcntl (fd[i], F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
printf ("pipe2(O_CLOEXEC) does not set close-on-exit for fd[%d]\n", i);
return 1;
}
}
close (fd[0]);
close (fd[1]);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds the new dup3 syscall. It extends the old dup2 syscall by one
parameter which is meant to hold a flag value. Support for the O_CLOEXEC flag
is added in this patch.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_dup3
# ifdef __x86_64__
# define __NR_dup3 292
# elif defined __i386__
# define __NR_dup3 330
# else
# error "need __NR_dup3"
# endif
#endif
int
main (void)
{
int fd = syscall (__NR_dup3, 1, 4, 0);
if (fd == -1)
{
puts ("dup3(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("dup3(0) set close-on-exec flag");
return 1;
}
close (fd);
fd = syscall (__NR_dup3, 1, 4, O_CLOEXEC);
if (fd == -1)
{
puts ("dup3(O_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("dup3(O_CLOEXEC) set close-on-exec flag");
return 1;
}
close (fd);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds the new epoll_create2 syscall. It extends the old epoll_create
syscall by one parameter which is meant to hold a flag value. In this
patch the only flag support is EPOLL_CLOEXEC which causes the close-on-exec
flag for the returned file descriptor to be set.
A new name EPOLL_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_epoll_create2
# ifdef __x86_64__
# define __NR_epoll_create2 291
# elif defined __i386__
# define __NR_epoll_create2 329
# else
# error "need __NR_epoll_create2"
# endif
#endif
#define EPOLL_CLOEXEC O_CLOEXEC
int
main (void)
{
int fd = syscall (__NR_epoll_create2, 1, 0);
if (fd == -1)
{
puts ("epoll_create2(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("epoll_create2(0) set close-on-exec flag");
return 1;
}
close (fd);
fd = syscall (__NR_epoll_create2, 1, EPOLL_CLOEXEC);
if (fd == -1)
{
puts ("epoll_create2(EPOLL_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag");
return 1;
}
close (fd);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The timerfd_create syscall already has a flags parameter. It just is
unused so far. This patch changes this by introducing the TFD_CLOEXEC
flag to set the close-on-exec flag for the returned file descriptor.
A new name TFD_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_timerfd_create
# ifdef __x86_64__
# define __NR_timerfd_create 283
# elif defined __i386__
# define __NR_timerfd_create 322
# else
# error "need __NR_timerfd_create"
# endif
#endif
#define TFD_CLOEXEC O_CLOEXEC
int
main (void)
{
int fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, 0);
if (fd == -1)
{
puts ("timerfd_create(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("timerfd_create(0) set close-on-exec flag");
return 1;
}
close (fd);
fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, TFD_CLOEXEC);
if (fd == -1)
{
puts ("timerfd_create(TFD_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("timerfd_create(TFD_CLOEXEC) set close-on-exec flag");
return 1;
}
close (fd);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds the new eventfd2 syscall. It extends the old eventfd
syscall by one parameter which is meant to hold a flag value. In this
patch the only flag support is EFD_CLOEXEC which causes the close-on-exec
flag for the returned file descriptor to be set.
A new name EFD_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_eventfd2
# ifdef __x86_64__
# define __NR_eventfd2 290
# elif defined __i386__
# define __NR_eventfd2 328
# else
# error "need __NR_eventfd2"
# endif
#endif
#define EFD_CLOEXEC O_CLOEXEC
int
main (void)
{
int fd = syscall (__NR_eventfd2, 1, 0);
if (fd == -1)
{
puts ("eventfd2(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("eventfd2(0) sets close-on-exec flag");
return 1;
}
close (fd);
fd = syscall (__NR_eventfd2, 1, EFD_CLOEXEC);
if (fd == -1)
{
puts ("eventfd2(EFD_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("eventfd2(EFD_CLOEXEC) does not set close-on-exec flag");
return 1;
}
close (fd);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds the new signalfd4 syscall. It extends the old signalfd
syscall by one parameter which is meant to hold a flag value. In this
patch the only flag support is SFD_CLOEXEC which causes the close-on-exec
flag for the returned file descriptor to be set.
A new name SFD_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#ifndef __NR_signalfd4
# ifdef __x86_64__
# define __NR_signalfd4 289
# elif defined __i386__
# define __NR_signalfd4 327
# else
# error "need __NR_signalfd4"
# endif
#endif
#define SFD_CLOEXEC O_CLOEXEC
int
main (void)
{
sigset_t ss;
sigemptyset (&ss);
sigaddset (&ss, SIGUSR1);
int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0);
if (fd == -1)
{
puts ("signalfd4(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("signalfd4(0) set close-on-exec flag");
return 1;
}
close (fd);
fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_CLOEXEC);
if (fd == -1)
{
puts ("signalfd4(SFD_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("signalfd4(SFD_CLOEXEC) does not set close-on-exec flag");
return 1;
}
close (fd);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch just extends the anon_inode_getfd interface to take an additional
parameter with a flag value. The flag value is passed on to
get_unused_fd_flags in anticipation for a use with the O_CLOEXEC flag.
No actual semantic changes here, the changed callers all pass 0 for now.
[akpm@linux-foundation.org: KVM fix]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some platforms do not have support to restore the signal mask in the
return path from a syscall. For those platforms syscalls like pselect are
not defined at all. This is, I think, not a good choice for paccept()
since paccept() adds more value on top of accept() than just the signal
mask handling.
Therefore this patch defines a scaled down version of the sys_paccept
function for those platforms. It returns -EINVAL in case the signal mask
is non-NULL but behaves the same otherwise.
Note that I explicitly included <linux/thread_info.h>. I saw that it is
currently included but indirectly two levels down. There is too much risk
in relying on this. The header might change and then suddenly the
function definition would change without anyone immediately noticing.
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch is by far the most complex in the series. It adds a new syscall
paccept. This syscall differs from accept in that it adds (at the userlevel)
two additional parameters:
- a signal mask
- a flags value
The flags parameter can be used to set flag like SOCK_CLOEXEC. This is
imlpemented here as well. Some people argued that this is a property which
should be inherited from the file desriptor for the server but this is against
POSIX. Additionally, we really want the signal mask parameter as well
(similar to pselect, ppoll, etc). So an interface change in inevitable.
The flag value is the same as for socket and socketpair. I think diverging
here will only create confusion. Similar to the filesystem interfaces where
the use of the O_* constants differs, it is acceptable here.
The signal mask is handled as for pselect etc. The mask is temporarily
installed for the thread and removed before the call returns. I modeled the
code after pselect. If there is a problem it's likely also in pselect.
For architectures which use socketcall I maintained this interface instead of
adding a system call. The symmetry shouldn't be broken.
The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#ifndef __NR_paccept
# ifdef __x86_64__
# define __NR_paccept 288
# elif defined __i386__
# define SYS_PACCEPT 18
# define USE_SOCKETCALL 1
# else
# error "need __NR_paccept"
# endif
#endif
#ifdef USE_SOCKETCALL
# define paccept(fd, addr, addrlen, mask, flags) \
({ long args[6] = { \
(long) fd, (long) addr, (long) addrlen, (long) mask, 8, (long) flags }; \
syscall (__NR_socketcall, SYS_PACCEPT, args); })
#else
# define paccept(fd, addr, addrlen, mask, flags) \
syscall (__NR_paccept, fd, addr, addrlen, mask, 8, flags)
#endif
#define PORT 57392
#define SOCK_CLOEXEC O_CLOEXEC
static pthread_barrier_t b;
static void *
tf (void *arg)
{
pthread_barrier_wait (&b);
int s = socket (AF_INET, SOCK_STREAM, 0);
struct sockaddr_in sin;
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
sin.sin_port = htons (PORT);
connect (s, (const struct sockaddr *) &sin, sizeof (sin));
close (s);
pthread_barrier_wait (&b);
s = socket (AF_INET, SOCK_STREAM, 0);
sin.sin_port = htons (PORT);
connect (s, (const struct sockaddr *) &sin, sizeof (sin));
close (s);
pthread_barrier_wait (&b);
pthread_barrier_wait (&b);
sleep (2);
pthread_kill ((pthread_t) arg, SIGUSR1);
return NULL;
}
static void
handler (int s)
{
}
int
main (void)
{
pthread_barrier_init (&b, NULL, 2);
struct sockaddr_in sin;
pthread_t th;
if (pthread_create (&th, NULL, tf, (void *) pthread_self ()) != 0)
{
puts ("pthread_create failed");
return 1;
}
int s = socket (AF_INET, SOCK_STREAM, 0);
int reuse = 1;
setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse));
sin.sin_family = AF_INET;
sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
sin.sin_port = htons (PORT);
bind (s, (struct sockaddr *) &sin, sizeof (sin));
listen (s, SOMAXCONN);
pthread_barrier_wait (&b);
int s2 = paccept (s, NULL, 0, NULL, 0);
if (s2 < 0)
{
puts ("paccept(0) failed");
return 1;
}
int coe = fcntl (s2, F_GETFD);
if (coe & FD_CLOEXEC)
{
puts ("paccept(0) set close-on-exec-flag");
return 1;
}
close (s2);
pthread_barrier_wait (&b);
s2 = paccept (s, NULL, 0, NULL, SOCK_CLOEXEC);
if (s2 < 0)
{
puts ("paccept(SOCK_CLOEXEC) failed");
return 1;
}
coe = fcntl (s2, F_GETFD);
if ((coe & FD_CLOEXEC) == 0)
{
puts ("paccept(SOCK_CLOEXEC) does not set close-on-exec flag");
return 1;
}
close (s2);
pthread_barrier_wait (&b);
struct sigaction sa;
sa.sa_handler = handler;
sa.sa_flags = 0;
sigemptyset (&sa.sa_mask);
sigaction (SIGUSR1, &sa, NULL);
sigset_t ss;
pthread_sigmask (SIG_SETMASK, NULL, &ss);
sigaddset (&ss, SIGUSR1);
pthread_sigmask (SIG_SETMASK, &ss, NULL);
sigdelset (&ss, SIGUSR1);
alarm (4);
pthread_barrier_wait (&b);
errno = 0 ;
s2 = paccept (s, NULL, 0, &ss, 0);
if (s2 != -1 || errno != EINTR)
{
puts ("paccept did not fail with EINTR");
return 1;
}
close (s);
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[akpm@linux-foundation.org: make it compile]
[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Roland McGrath <roland@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds support for flag values which are ORed to the type passwd
to socket and socketpair. The additional code is minimal. The flag
values in this implementation can and must match the O_* flags. This
avoids overhead in the conversion.
The internal functions sock_alloc_fd and sock_map_fd get a new parameters
and all callers are changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>
#define PORT 57392
/* For Linux these must be the same. */
#define SOCK_CLOEXEC O_CLOEXEC
int
main (void)
{
int fd;
fd = socket (PF_INET, SOCK_STREAM, 0);
if (fd == -1)
{
puts ("socket(0) failed");
return 1;
}
int coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
puts ("socket(0) set close-on-exec flag");
return 1;
}
close (fd);
fd = socket (PF_INET, SOCK_STREAM|SOCK_CLOEXEC, 0);
if (fd == -1)
{
puts ("socket(SOCK_CLOEXEC) failed");
return 1;
}
coe = fcntl (fd, F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
puts ("socket(SOCK_CLOEXEC) does not set close-on-exec flag");
return 1;
}
close (fd);
int fds[2];
if (socketpair (PF_UNIX, SOCK_STREAM, 0, fds) == -1)
{
puts ("socketpair(0) failed");
return 1;
}
for (int i = 0; i < 2; ++i)
{
coe = fcntl (fds[i], F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if (coe & FD_CLOEXEC)
{
printf ("socketpair(0) set close-on-exec flag for fds[%d]\n", i);
return 1;
}
close (fds[i]);
}
if (socketpair (PF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, fds) == -1)
{
puts ("socketpair(SOCK_CLOEXEC) failed");
return 1;
}
for (int i = 0; i < 2; ++i)
{
coe = fcntl (fds[i], F_GETFD);
if (coe == -1)
{
puts ("fcntl failed");
return 1;
}
if ((coe & FD_CLOEXEC) == 0)
{
printf ("socketpair(SOCK_CLOEXEC) does not set close-on-exec flag for fds[%d]\n", i);
return 1;
}
close (fds[i]);
}
puts ("OK");
return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Trying to compile the v850 port brings many compile errors, one of them exists
since at least kernel 2.6.19.
There also seems to be noone willing to bring this port back into a usable
state.
This patch therefore removes the v850 port.
If anyone ever decides to revive the v850 port the code will still be
available from older kernels, and it wouldn't be impossible for the port to
reenter the kernel if it would become actively maintained again.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Make some variables and functions static, since they don't need to be
global.
- Remove an unused function - arch/um/kernel/time.c::sched_clock().
- Clean the style a bit as complained by checkpatch.pl.
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Remove arch_validate(), because no one uses it.
- Remove useless macro HAVE_ARCH_VALIDATE.
- Make the variable 'empty_bad_page' static.
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
global_flush_tlb is declared but never used.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Cc: Mikael Starvik <starvik@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mn10300 was the only architecture where sg_dma_{address,len}() were not
in asm/scatterlist.h, and it's not a big surprise that this caused a
compile error somewhere:
/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/media/video/videobuf-dma-sg.c: In function `videobuf_dma_map':
/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/media/video/videobuf-dma-sg.c:238: error: implicit declaration of function 'sg_dma_address'
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ACPI defines a hardware signature. BIOS calculates the signature according to
hardware configure and if hardware changes while hibernated, the signature
will change. In that case, S4 resume should fail.
Still, there may be systems on which this mechanism does not work correctly,
so it is better to provide a workaround for them. For this reason, add a new
switch to the acpi_sleep= command line argument allowing one to disable
hardware signature checking.
[shaohua.li@intel.com: build fix]
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: <Valdis.Kletnieks@vt.edu>
Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This interface allows adding a job on a specific cpu.
Although a work struct on a cpu will be scheduled to other cpu if the cpu
dies, there is a recursion if a work task tries to offline the cpu it's
running on. we need to schedule the task to a specific cpu in this case.
http://bugzilla.kernel.org/show_bug.cgi?id=10897
[oleg@tv-sign.ru: cleanups]
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Rus <harbour@sfinx.od.ua>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch (as1112) adds some new PM_EVENT_* codes for use by kernel
subsystems. They describe runtime power-state transitions of the sort already
implemented by the USB subsystem.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Drop unnecessary includes from include/linux/pm.h .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the remaining obsolete definitions from include/linux/pm.h and move
the definitions of PM_SUSPEND and PM_RESUME to the header of h3600 which
is the only user of them.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the definition of 'struct pm_dev', which is not used any more,
along with some related stuff from include/linux/pm.h .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the obsolete and no longer used include/linux/pm_legacy.h
Reviewed-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Pavel Machek <pavel@suse.cz>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes the unused include/asm-h8300/keyboard.h
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Why would linux/security.h need forward declarations for nfsctl_arg and
swap_info_struct? It's hard to imagine: remove them.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When cap_bset suppresses some of the forced (fP) capabilities of a file,
it is generally only safe to execute the program if it understands how to
recognize it doesn't have enough privilege to work correctly. For legacy
applications (fE!=0), which have no non-destructive way to determine that
they are missing privilege, we fail to execute (EPERM) any executable that
requires fP capabilities, but would otherwise get pP' < fP. This is a
fail-safe permission check.
For some discussion of why it is problematic for (legacy) privileged
applications to run with less than the set of capabilities requested for
them, see:
http://userweb.kernel.org/~morgan/sendmail-capabilities-war-story.html
With this iteration of this support, we do not include setuid-0 based
privilege protection from the bounding set. That is, the admin can still
(ab)use the bounding set to suppress the privileges of a setuid-0 program.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We'd like to support CONFIG_MEMORY_HOTREMOVE on s390, which depends on
CONFIG_MIGRATION. So far, CONFIG_MIGRATION is only available with NUMA
support.
This patch makes CONFIG_MIGRATION selectable for architectures that define
ARCH_ENABLE_MEMORY_HOTREMOVE. When MIGRATION is enabled w/o NUMA, the
kernel won't compile because migrate_vmas() does not know about
vm_ops->migrate() and vma_migratable() does not know about policy_zone.
To fix this, those two functions can be restricted to '#ifdef CONFIG_NUMA'
because they are not being used w/o NUMA. vma_migratable() is moved over
from migrate.h to mempolicy.h.
[kosaki.motohiro@jp.fujitsu.com: build fix]
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: KOSAKI Motorhiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While in all cases in the kernel we know the size of the elements to be
created, we don't always know the count of elements. By commuting the size
and count in the overflow check, the compiler can reduce the runtime division
of size_t with a compare to a (unique) constant in these cases.
Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Memory may be hot-removed on a per-memory-block basis, particularly on
POWER where the SPARSEMEM section size often matches the memory-block
size. A user-level agent must be able to identify which sections of
memory are likely to be removable before attempting the potentially
expensive operation. This patch adds a file called "removable" to the
memory directory in sysfs to help such an agent. In this patch, a memory
block is considered removable if;
o It contains only MOVABLE pageblocks
o It contains only pageblocks with free pages regardless of pageblock type
On the other hand, a memory block starting with a PageReserved() page will
never be considered removable. Without this patch, the user-agent is
forced to choose a memory block to remove randomly.
Sample output of the sysfs files:
./memory/memory0/removable: 0
./memory/memory1/removable: 0
./memory/memory2/removable: 0
./memory/memory3/removable: 0
./memory/memory4/removable: 0
./memory/memory5/removable: 0
./memory/memory6/removable: 0
./memory/memory7/removable: 1
./memory/memory8/removable: 0
./memory/memory9/removable: 0
./memory/memory10/removable: 0
./memory/memory11/removable: 0
./memory/memory12/removable: 0
./memory/memory13/removable: 0
./memory/memory14/removable: 0
./memory/memory15/removable: 0
./memory/memory16/removable: 0
./memory/memory17/removable: 1
./memory/memory18/removable: 1
./memory/memory19/removable: 1
./memory/memory20/removable: 1
./memory/memory21/removable: 1
./memory/memory22/removable: 1
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit
boundary. For example:
u64 val = PAGE_ALIGN(size);
always returns a value < 4GB even if size is greater than 4GB.
The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for
example):
#define PAGE_SHIFT 12
#define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE-1))
...
#define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK)
The "~" is performed on a 32-bit value, so everything in "and" with
PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary.
Using the ALIGN() macro seems to be the right way, because it uses
typeof(addr) for the mask.
Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in
include/linux/mm.h.
See also lkml discussion: http://lkml.org/lkml/2008/6/11/237
[akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c]
[akpm@linux-foundation.org: fix v850]
[akpm@linux-foundation.org: fix powerpc]
[akpm@linux-foundation.org: fix arm]
[akpm@linux-foundation.org: fix mips]
[akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c]
[akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c]
[akpm@linux-foundation.org: fix powerpc]
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
alloc_pages_exact() is similar to alloc_pages(), except that it allocates
the minimum number of pages to fulfill the request. This is useful if you
want to allocate a very large buffer that is slightly larger than an even
power-of-two number of pages. In that case, alloc_pages() will waste a
lot of memory.
I have a video driver that wants to allocate a 5MB buffer. alloc_pages()
wiill waste 3MB of physically-contiguous memory.
Signed-off-by: Timur Tabi <timur@freescale.com>
Cc: Andi Kleen <andi@firstfloor.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Almost all users of this field need a PFN instead of a physical address,
so replace node_boot_start with node_min_pfn.
[Lee.Schermerhorn@hp.com: fix spurious BUG_ON() in mark_bootmem()]
Signed-off-by: Johannes Weiner <hannes@saeureba.de>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
alloc_bootmem_core has become quite nasty to read over time. This is a
clean rewrite that keeps the semantics.
bdata->last_pos has been dropped.
bdata->last_success has been renamed to hint_idx and it is now an index
relative to the node's range. Since further block searching might start
at this index, it is now set to the end of a succeeded allocation rather
than its beginning.
bdata->last_offset has been renamed to last_end_off to be more clear that
it represents the ending address of the last allocation relative to the
node.
[y-goto@jp.fujitsu.com: fix new alloc_bootmem_core()]
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This only reorders functions so that further patches will be easier to
read. No code changed.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of using the variable mmu_huge_psize to keep track of the huge
page size we use an array of MMU_PAGE_* values. For each supported huge
page size we need to know the hugepte_shift value and have a
pgtable_cache. The hstate or an mmu_huge_psizes index is passed to
functions so that they know which huge page size they should use.
The hugepage sizes 16M and 64K are setup(if available on the hardware) so
that they don't have to be set on the boot cmd line in order to use them.
The number of 16G pages have to be specified at boot-time though (e.g.
hugepagesz=16G hugepages=5).
Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The huge page size is defined for 16G pages. If a hugepagesz of 16G is
specified at boot-time then it becomes the huge page size instead of the
default 16M.
The change in pgtable-64K.h is to the macro pte_iterate_hashed_subpages to
make the increment to va (the 1 being shifted) be a long so that it is not
shifted to 0. Otherwise it would create an infinite loop when the shift
value is for a 16G page (when base page size is 64K).
Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The 16G huge pages have to be reserved in the HMC prior to boot. The
location of the pages are placed in the device tree. This patch adds code
to scan the device tree during very early boot and save these page
locations until hugetlbfs is ready for them.
Acked-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow alloc_bootmem_huge_page() to be overridden by architectures that
can't always use bootmem. This requires huge_boot_pages to be available
for use by this function.
This is required for powerpc 16G pages, which have to be reserved prior to
boot-time. The location of these pages are indicated in the device tree.
Acked-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add an hugepagesz=... option similar to IA64, PPC etc. to x86-64.
This finally allows to select GB pages for hugetlbfs in x86 now that all
the infrastructure is in place.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Straight forward extensions for huge pages located in the PUD instead of
PMDs.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Straight forward variant of the existing __alloc_bootmem_node, only
subsequent patch when allocating giant hugepages at boot -- don't want to
panic if we can't allocate as many as the user asked for.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Provide new hugepages user APIs that are more suited to multiple hstates
in sysfs. There is a new directory, /sys/kernel/hugepages. Underneath
that directory there will be a directory per-supported hugepage size,
e.g.:
/sys/kernel/hugepages/hugepages-64kB
/sys/kernel/hugepages/hugepages-16384kB
/sys/kernel/hugepages/hugepages-16777216kB
corresponding to 64k, 16m and 16g respectively. Within each
hugepages-size directory there are a number of files, corresponding to the
tracked counters in the hstate, e.g.:
/sys/kernel/hugepages/hugepages-64/nr_hugepages
/sys/kernel/hugepages/hugepages-64/nr_overcommit_hugepages
/sys/kernel/hugepages/hugepages-64/free_hugepages
/sys/kernel/hugepages/hugepages-64/resv_hugepages
/sys/kernel/hugepages/hugepages-64/surplus_hugepages
Of these files, the first two are read-write and the latter three are
read-only. The size of the hugepage being manipulated is trivially
deducible from the enclosing directory and is always expressed in kB (to
match meminfo).
[dave@linux.vnet.ibm.com: fix build]
[nacc@us.ibm.com: hugetlb: hang off of /sys/kernel/mm rather than /sys/kernel]
[nacc@us.ibm.com: hugetlb: remove CONFIG_SYSFS dependency]
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add the ability to configure the hugetlb hstate used on a per mount basis.
- Add a new pagesize= option to the hugetlbfs mount that allows setting
the page size
- This option causes the mount code to find the hstate corresponding to the
specified size, and sets up a pointer to the hstate in the mount's
superblock.
- Change the hstate accessors to use this information rather than the
global_hstate they were using (requires a slight change in mm/memory.c
so we don't NULL deref in the error-unmap path -- see comments).
[np: take hstate out of hugetlbfs inode and vma->vm_private_data]
Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add basic support for more than one hstate in hugetlbfs. This is the key
to supporting multiple hugetlbfs page sizes at once.
- Rather than a single hstate, we now have an array, with an iterator
- default_hstate continues to be the struct hstate which we use by default
- Add functions for architectures to register new hstates
[akpm@linux-foundation.org: coding-style fixes]
Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The goal of this patchset is to support multiple hugetlb page sizes. This
is achieved by introducing a new struct hstate structure, which
encapsulates the important hugetlb state and constants (eg. huge page
size, number of huge pages currently allocated, etc).
The hstate structure is then passed around the code which requires these
fields, they will do the right thing regardless of the exact hstate they
are operating on.
This patch adds the hstate structure, with a single global instance of it
(default_hstate), and does the basic work of converting hugetlb to use the
hstate.
Future patches will add more hstate structures to allow for different
hugetlbfs mounts to have different page sizes.
[akpm@linux-foundation.org: coding-style fixes]
Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add a kobject to create /sys/kernel/mm when sysfs is mounted. The kobject
will exist regardless. This will allow for the hugepage related sysfs
directories to exist under the mm "subsystem" directory. Add an ABI file
appropriately.
[kosaki.motohiro@jp.fujitsu.com: fix build]
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With Mel's hugetlb private reservation support patches applied, strict
overcommit semantics are applied to both shared and private huge page
mappings. This can be a problem if an application relied on unlimited
overcommit semantics for private mappings. An example of this would be an
application which maps a huge area with the intention of using it very
sparsely. These application would benefit from being able to opt-out of
the strict overcommit. It should be noted that prior to hugetlb
supporting demand faulting all mappings were fully populated and so
applications of this type should be rare.
This patch stack implements the MAP_NORESERVE mmap() flag for huge page
mappings. This flag has the same meaning as for small page mappings,
suppressing reservations for that mapping.
Thanks to Mel Gorman for reviewing a number of early versions of these
patches.
This patch:
When a small page mapping is created with mmap() reservations are created
by default for any memory pages required. When the region is read/write
the reservation is increased for every page, no reservation is needed for
read-only regions (as they implicitly share the zero page). Reservations
are tracked via the VM_ACCOUNT vma flag which is present when the region
has reservation backing it. When we convert a region from read-only to
read-write new reservations are aquired and VM_ACCOUNT is set. However,
when a read-only map is created with MAP_NORESERVE it is indistinguishable
from a normal mapping. When we then convert that to read/write we are
forced to incorrectly create reservations for it as we have no record of
the original MAP_NORESERVE.
This patch introduces a new vma flag VM_NORESERVE which records the
presence of the original MAP_NORESERVE flag. This allows us to
distinguish these two circumstances and correctly account the reserve.
As well as fixing this FIXME in the code, this makes it much easier to
introduce MAP_NORESERVE support for huge pages as this flag is available
consistantly for the life of the mapping. VM_ACCOUNT on the other hand is
heavily used at the generic level in association with small pages.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Johannes Weiner <hannes@saeurebad.de>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After patch 2 in this series, a process that successfully calls mmap() for
a MAP_PRIVATE mapping will be guaranteed to successfully fault until a
process calls fork(). At that point, the next write fault from the parent
could fail due to COW if the child still has a reference.
We only reserve pages for the parent but a copy must be made to avoid
leaking data from the parent to the child after fork(). Reserves could be
taken for both parent and child at fork time to guarantee faults but if
the mapping is large it is highly likely we will not have sufficient pages
for the reservation, and it is common to fork only to exec() immediatly
after. A failure here would be very undesirable.
Note that the current behaviour of mainline with MAP_PRIVATE pages is
pretty bad. The following situation is allowed to occur today.
1. Process calls mmap(MAP_PRIVATE)
2. Process calls mlock() to fault all pages and makes sure it succeeds
3. Process forks()
4. Process writes to MAP_PRIVATE mapping while child still exists
5. If the COW fails at this point, the process gets SIGKILLed even though it
had taken care to ensure the pages existed
This patch improves the situation by guaranteeing the reliability of the
process that successfully calls mmap(). When the parent performs COW, it
will try to satisfy the allocation without using reserves. If that fails
the parent will steal the page leaving any children without a page.
Faults from the child after that point will result in failure. If the
child COW happens first, an attempt will be made to allocate the page
without reserves and the child will get SIGKILLed on failure.
To summarise the new behaviour:
1. If the original mapper performs COW on a private mapping with multiple
references, it will attempt to allocate a hugepage from the pool or
the buddy allocator without using the existing reserves. On fail, VMAs
mapping the same area are traversed and the page being COW'd is unmapped
where found. It will then steal the original page as the last mapper in
the normal way.
2. The VMAs the pages were unmapped from are flagged to note that pages
with data no longer exist. Future no-page faults on those VMAs will
terminate the process as otherwise it would appear that data was corrupted.
A warning is printed to the console that this situation occured.
2. If the child performs COW first, it will attempt to satisfy the COW
from the pool if there are enough pages or via the buddy allocator if
overcommit is allowed and the buddy allocator can satisfy the request. If
it fails, the child will be killed.
If the pool is large enough, existing applications will not notice that
the reserves were a factor. Existing applications depending on the
no-reserves been set are unlikely to exist as for much of the history of
hugetlbfs, pages were prefaulted at mmap(), allocating the pages at that
point or failing the mmap().
[npiggin@suse.de: fix CONFIG_HUGETLB=n build]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch reserves huge pages at mmap() time for MAP_PRIVATE mappings in
a similar manner to the reservations taken for MAP_SHARED mappings. The
reserve count is accounted both globally and on a per-VMA basis for
private mappings. This guarantees that a process that successfully calls
mmap() will successfully fault all pages in the future unless fork() is
called.
The characteristics of private mappings of hugetlbfs files behaviour after
this patch are;
1. The process calling mmap() is guaranteed to succeed all future faults until
it forks().
2. On fork(), the parent may die due to SIGKILL on writes to the private
mapping if enough pages are not available for the COW. For reasonably
reliable behaviour in the face of a small huge page pool, children of
hugepage-aware processes should not reference the mappings; such as
might occur when fork()ing to exec().
3. On fork(), the child VMAs inherit no reserves. Reads on pages already
faulted by the parent will succeed. Successful writes will depend on enough
huge pages being free in the pool.
4. Quotas of the hugetlbfs mount are checked at reserve time for the mapper
and at fault time otherwise.
Before this patch, all reads or writes in the child potentially needs page
allocations that can later lead to the death of the parent. This applies
to reads and writes of uninstantiated pages as well as COW. After the
patch it is only a write to an instantiated page that causes problems.
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
free_area_init_node() gets passed in the node id as well as the node
descriptor. This is redundant as the function can trivially get the node
descriptor itself by means of NODE_DATA() and the node's id.
I checked all the users and NODE_DATA() seems to be usable everywhere
from where this function is called.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is called on a per-page basis and in the vast majority of cases
`error' is zero.
Cc: Guillaume Chazarain <guichaz@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SLOB reuses two page bits for internal purposes, it overlays PG_active and
PG_private. This is hidden away in slob.c. Document these overlays
explicitly in the main page-flags enum along with all the others.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SLUB reuses two page bits for internal purposes, it overlays PG_active and
PG_error. This is hidden away in slub.c. Document these overlays
explicitly in the main page-flags enum along with all the others.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the recent page flag reorganisation we have a single enum which
defines the valid page flags and their values, nice and clear. However
there are a number of bits which are overloaded by different subsystems.
Firstly there is PG_owner_priv_1 which is used by filesystems and by XEN.
Secondly both SLOB and SLUB use a couple of extra page bits to manage
internal state for pages they own; both overlay other bits. All of these
"aliases" are scattered about the source making it very hard for a reader
to know if the bits are safe to rely on in all contexts; confusion here is
bad.
As we now have a single place where the bits are clearly assigned it makes
sense to clarify the reuse of bits by making the aliases explicit and
visible with the original bit assignments. This patch creates explicit
aliases within the enum itself for the overloaded bits, creates standard
bit accessors PageFoo etc. and uses those throughout.
This version pulls the bit manipulation out to standard named page bit
accessors as suggested by Christoph, it retains the explicit mapping to
the overlayed bits. A fusion of both ideas. This has been SLUB and SLOB
have been compile tested on x86_64 only, and SLUB boot tested. If people
feel this is worth doing then I can run a fuller set of testing.
This patch:
Some page flags are used for more than one purpose, for example
PG_owner_priv_1. Currently there are individual accessors for each user,
each built using the common flag name far away from the bit definitions.
This makes it hard to see all possible uses of these bits.
Now that we have a single enum to generate the bit orders it makes sense
to express overlays in the same place. So create per use aliases for this
bit in the main page-flags enum and use those in the accessors.
[akpm@linux-foundation.org: fix xen]
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[Summary]
Split LRU-list of unused dentries to one per superblock to avoid soft
lock up during NFS mounts and remounting of any filesystem.
Previously I posted here:
http://lkml.org/lkml/2008/3/5/590
[Descriptions]
- background
dentry_unused is a list of dentries which are not referenced.
dentry_unused grows up when references on directories or files are
released. This list can be very long if there is huge free memory.
- the problem
When shrink_dcache_sb() is called, it scans all dentry_unused linearly
under spin_lock(), and if dentry->d_sb is differnt from given
superblock, scan next dentry. This scan costs very much if there are
many entries, and very ineffective if there are many superblocks.
IOW, When we need to shrink unused dentries on one dentry, but scans
unused dentries on all superblocks in the system. For example, we scan
500 dentries to unmount a filesystem, but scans 1,000,000 or more unused
dentries on other superblocks.
In our case , At mounting NFS*, shrink_dcache_sb() is called to shrink
unused dentries on NFS, but scans 100,000,000 unused dentries on
superblocks in the system such as local ext3 filesystems. I hear NFS
mounting took 1 min on some system in use.
* : NFS uses virtual filesystem in rpc layer, so NFS is affected by
this problem.
100,000,000 is possible number on large systems.
Per-superblock LRU of unused dentried can reduce the cost in
reasonable manner.
- How to fix
I found this problem is solved by David Chinner's "Per-superblock
unused dentry LRU lists V3"(1), so I rebase it and add some fix to
reclaim with fairness, which is in Andrew Morton's comments(2).
1) http://lkml.org/lkml/2006/5/25/318
2) http://lkml.org/lkml/2006/5/25/320
Split LRU-list of unused dentries to each superblocks. Then, NFS
mounting will check dentries under a superblock instead of all. But
this spliting will break LRU of dentry-unused. So, I've attempted to
make reclaim unused dentrins with fairness by calculate number of
dentries to scan on this sb based on following way
number of dentries to scan on this sb =
count * (number of dentries on this sb / number of dentries in the machine)
- ToDo
- I have to measuring performance number and do stress tests.
- When unmount occurs during prune_dcache(), scanning on same
superblock, It is unable to reach next superblock because it is gone
away. We restart scannig superblock from first one, it causes
unfairness of reclaim unused dentries on first superblock. But I think
this happens very rarely.
- Test Results
Result on 6GB boxes with excessive unused dentries.
Without patch:
$ cat /proc/sys/fs/dentry-state
10181835 10180203 45 0 0 0
# mount -t nfs 10.124.60.70:/work/kernel-src nfs
real 0m1.830s
user 0m0.001s
sys 0m1.653s
With this patch:
$ cat /proc/sys/fs/dentry-state
10236610 10234751 45 0 0 0
# mount -t nfs 10.124.60.70:/work/kernel-src nfs
real 0m0.106s
user 0m0.002s
sys 0m0.032s
[akpm@linux-foundation.org: fix comments]
Signed-off-by: Kentaro Makita <k-makita@np.css.fujitsu.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: David Chinner <dgc@sgi.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The double indirection here is not needed anywhere and hence (at least)
confusing.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This adds ioremap_prot and pte_pgprot() so that one can extract protection
bits from a PTE and use them to ioremap_prot() (in order to support ptrace
of VM_IO | VM_PFNMAP as per Rik's patch).
This moves a couple of flag checks around in the ioremap implementations
of arch/powerpc. There's a side effect of allowing non-cacheable and
non-guarded mappings on ppc32 which before would always have _PAGE_GUARDED
set whenever _PAGE_NO_CACHE is.
(standard ioremap will still set _PAGE_GUARDED, but ioremap_prot will be
capable of setting such a non guarded mapping).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In order to be able to debug things like the X server and programs using
the PPC Cell SPUs, the debugger needs to be able to access device memory
through ptrace and /proc/pid/mem.
This patch:
Add the generic_access_phys access function and put the hooks in place
to allow access_process_vm to access device or PPC Cell SPU memory.
[riel@redhat.com: Add documentation for the vm_ops->access function]
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Benjamin Herrensmidt <benh@kernel.crashing.org>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are no users of nopfn in the tree. Remove it.
[hugh@veritas.com: fix build error]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch adds proper extern declarations for five variables in
include/linux/vmstat.h
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Two zonelist patch series rewrote __page_alloc() largely. Now, it is just
a wrapper function. Inlining them will save a function call.
[akpm@linux-foundation.org: export __alloc_pages_internal]
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This function has no external callers, so unexport it. Also fix its naming
inconsistency.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There are a lot of places that define either a single bootmem descriptor or an
array of them. Use only one central array with MAX_NUMNODES items instead.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
lib/debugobjects.c has a function to test if an object is on the stack.
The block layer and ide needs it (they need to avoid DMA from/to stack
buffers). This patch moves the function to include/linux/sched.h so that
everyone can use it.
lib/debugobjects.c uses current->stack but this patch uses a
task_stack_page() accessor, which is a preferable way to access the stack.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
James Bottomley warns that inclusion of linux/fs.h in a low level
driver was always a danger signal. This patch moves
memory_read_from_buffer() from fs.h to string.h and fixes includes in
existing memory_read_from_buffer() users.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Bob Moore <robert.moore@intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
All users have now been converted to linux/semaphore.h and we don't need
to keep these files around any longer.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
This patch adds platform data to the AC97C platform device. This will
let the board add a GPIO line which is connected to the external codecs
reset line.
The platform data, ac97c_platform_data, must also contain the DMA
controller ID, RX channel ID and TX channel ID.
Tested with Wolfson WM9712 and AP7000.
Signed-off-by: Hans-Christian Egtvedt <hcegtvedt@atmel.com>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Apparently,
commit 6330a30a76
Author: Vegard Nossum <vegard.nossum@gmail.com>
Date: Wed May 28 09:46:19 2008 +0200
x86: break mutual header inclusion
introduced some CONFIG names to processor-flags.h, which was exported in
commit 6093015db2
Author: Ingo Molnar <mingo@elte.hu>
Date: Sun Mar 30 11:45:23 2008 +0200
x86: cleanup replace most vm86 flags with flags from processor-flags.h, fix
Fix it by wrapping the CONFIG parts in __KERNEL__.
Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Adjust a comment in ack_APIC_irq() according to the recent removal of
CONFIG_X86_GOOD_APIC.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Just out or curiousity ran checkpatch.pl for whole UBI,
and discovered there are quite a few of stylistic issues.
Fix them.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Quite useful ioctl which allows to make atomic system upgrades.
The idea belongs to Richard Titmuss <richard_titmuss@logitech.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Hch asked not to use "unit" for sub-systems, let it be so.
Also some other commentaries modifications.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
* 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: hrtick_enabled() should use cpu_active()
sched, x86: clean up hrtick implementation
sched: fix build error, provide partition_sched_domains() unconditionally
sched: fix warning in inc_rt_tasks() to not declare variable 'rq' if it's not needed
cpu hotplug: Make cpu_active_map synchronization dependency clear
cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)
sched: rework of "prioritize non-migratable tasks over migratable ones"
sched: reduce stack size in isolated_cpu_setup()
Revert parts of "ftrace: do not trace scheduler functions"
Fixed up conflicts in include/asm-x86/thread_info.h (due to the
TIF_SINGLESTEP unification vs TIF_HRTICK_RESCHED removal) and
kernel/sched_fair.c (due to cpu_active_map vs for_each_cpu_mask_nr()
introduction).
* 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
NR_CPUS: Replace NR_CPUS in speedstep-centrino.c
cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP
NR_CPUS: Replace NR_CPUS in cpufreq userspace routines
NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var
NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genapic_flat_64.c
NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genx2apic_uv_x.c
NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/proc.c
NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/mcheck/mce_64.c
cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c, fix
cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target
cpumask: Provide a generic set of CPUMASK_ALLOC macros
cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c
cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c
cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c
cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c
cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c
cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
Revert "cpumask: introduce new APIs"
cpumask: make for_each_cpu_mask a bit smaller
net: Pass reference to cpumask variable in net/sunrpc/svc.c
...
Fix up trivial conflicts in drivers/cpufreq/cpufreq.c manually
* 'core/softlockup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
softlockup: fix invalid proc_handler for softlockup_panic
softlockup: fix watchdog task wakeup frequency
softlockup: fix watchdog task wakeup frequency
softlockup: show irqtrace
softlockup: print a module list on being stuck
softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression
softlockup: fix false positives on nohz if CPU is 100% idle for more than 60 seconds
softlockup: fix softlockup_thresh fix
softlockup: fix softlockup_thresh unaligned access and disable detection at runtime
softlockup: allow panic on lockup
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (85 commits)
[ARM] pxa: add base support for PXA930 Handheld Platform (aka SAAR)
[ARM] pxa: add base support for PXA930 Evaluation Board (aka TavorEVB)
[ARM] pxa: add base support for PXA930 (aka Tavor-P)
[ARM] Update mach-types
[ARM] pxa: make littleton to use the new smc91x platform data
[ARM] pxa: make zylonite to use the new smc91x platform data
[ARM] pxa: make mainstone to use the new smc91x platform data
[ARM] pxa: make lubbock to use new smc91x platform data
[NET] smc91x: prepare SMC_USE_PXA_DMA to be specified in platform data
[NET] smc91x: prepare for SMC_IO_SHIFT to be a platform configurable variable
[NET] smc91x: add SMC91X_NOWAIT flag to platform data
[NET] smc91x: favor the use of SMC91X_USE_* instead of SMC_CAN_USE_*
[NET] smc91x: remove "irq_flags" from "struct smc91x_platdata"
[ARM] 5146/1: pxa2xx: convert all boards to call pxa2xx_transceiver_mode helper
Support for LCD on e740 e750 e400 and e800 e-series PDAs
E-series UDC support
PXA UDC - allow use of inverted GPIO for pullup
Add e350 support
Fix broken e-series build
E-series GPIO / IRQ definitions.
...
The functions in a header should not belong to another module. The prio functions
belong to v4l2-common.c, so move them to v4l2-common.h.
The ioctl functions belong to v4l2-ioctl.c, so create a new v4l2-ioctl.h header
and move those functions to it.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The class_dev field is a normal device, not a class device. This is very
confusing and now that the old 'dev' field has been renamed to 'parent'
we can rename 'class_dev' to just 'dev'.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The field 'dev' is not the video device, but the parent of the video device.
Rename accordingly.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
sdhci: highmem capable PIO routines
sg: reimplement sg mapping iterator
mmc_test: print message when attaching to card
mmc: Remove Russell as primecell mci maintainer
mmc_block: bounce buffer highmem support
sdhci: fix bad warning from commit c8b3e02
sdhci: add warnings for bad buffers in ADMA path
mmc_test: test oversized sg lists
mmc_test: highmem tests
s3cmci: ensure host stopped on machine shutdown
au1xmmc: suspend/resume implementation
s3cmci: fixes for section mismatch warnings
pxamci: trivial fix of DMA alignment register bit clearing
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (24 commits)
I/OAT: I/OAT version 3.0 support
I/OAT: tcp_dma_copybreak default value dependent on I/OAT version
I/OAT: Add watchdog/reset functionality to ioatdma
iop_adma: cleanup iop_chan_xor_slot_count
iop_adma: document how to calculate the minimum descriptor pool size
iop_adma: directly reclaim descriptors on allocation failure
async_tx: make async_tx_test_ack a boolean routine
async_tx: remove depend_tx from async_tx_sync_epilog
async_tx: export async_tx_quiesce
async_tx: fix handling of the "out of descriptor" condition in async_xor
async_tx: ensure the xor destination buffer remains dma-mapped
async_tx: list_for_each_entry_rcu() cleanup
dmaengine: Driver for the Synopsys DesignWare DMA controller
dmaengine: Add slave DMA interface
dmaengine: add DMA_COMPL_SKIP_{SRC,DEST}_UNMAP flags to control dma unmap
dmaengine: Add dma_client parameter to device_alloc_chan_resources
dmatest: Simple DMA memcpy test client
dmaengine: DMA engine driver for Marvell XOR engine
iop-adma: fix platform driver hotplug/coldplug
dmaengine: track the number of clients using a channel
...
Fixed up conflict in drivers/dca/dca-sysfs.c manually
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (60 commits)
ide: small whitespace fixes
ide: ide-cd_ioctl.c fix sparse integer as NULL pointer warnings
ide: ide-cd.c fix sparse endianness warnings
ide-cd: convert to using the new atapi_flags
ide: remove unused PC_FLAG_DRQ_INTERRUPT
ide-scsi: convert to using the new atapi_flags
ide-tape: convert to using the new atapi_flags
ide-floppy: convert to using the new atapi_flags (take 2)
ide: add per-device flags
ide: use rq->cmd instead of pc->c in atapi common code
ide-scsi: pass packet command in rq->cmd
ide-tape: pass packet command in rq->cmd
ide-tape: make room for packet command ids in rq->cmd
ide-floppy: pass packet command in rq->cmd
ide: remove pc->callback member from ide_atapi_pc
ide-scsi: use drive->pc_callback instead of pc->callback
ide-tape: use drive->pc_callback instead of pc->callback
ide-floppy: use drive->pc_callback instead of pc->callback
ide: push pc callback pointer into the ide_drive_t structure
drivers/ide/ide-tape.c: remove double kfree
...
There should be no functionality change resulting from this patch.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
while at it, remove PC_FLAG_ZIP_DRIVE from the packed command flags altogether
and query the drive type through drive->atapi_flags.
v2:
ide-floppy fix.
There should be no functionality change resulting from this patch.
[bart: IDE_FLAG_* -> IDE_AFLAG_*, dev_flags -> atapi_flags]
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Push device flags up into ide_drive_t.
There should be no functionality change resulting from this patch.
[bart: IDE_FLAG_* -> IDE_AFLAG_*, dev_flags -> atapi_flags]
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
There should be no functionality change resulting from this patch.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Refrain from carrying the callback ptr with every packet command since the
callback function is only one anyways. ide_drive_t is probably not the most
suitable place for it right now but is the more sane solution. Besides, these
structs are going to be reorganized anyways during the generic ide rewrite.
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add ide_host_free() helper and convert ide_host_remove() to use it.
* Fix handling of ide_host_register() failure in ide_host_add(),
icside.c, ide-generic.c, falconide.c and sgiioc4.c.
While at it:
* Fix handling of ide_host_alloc_all() failure in ide-generic.c.
* Fix handling of ide_host_alloc() failure in falconide.c
(also return the correct error value if no device is found).
v2:
* falconide build fix. (From Stephen Rothwell)
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add ide_host_add() helper which does ide_host_alloc()+ide_host_register(),
then convert ide_setup_pci_device[s](), ide_legacy_device_add() and some
host drivers to use it.
While at it:
* Fix ide_setup_pci_device[s](), ide_arm.c, gayle.c, ide-4drives.c,
macide.c, q40ide.c, cmd640.c and cs5520.c to return correct error value.
* -ENOENT -> -ENOMEM in rapide.c, ide-h8300.c, ide-generic.c, au1xxx-ide.c
and pmac.c
* -ENODEV -> -ENOMEM in palm_bk3710.c, ide_platform.c and delkin_cb.c
* -1 -> -ENOMEM in ide-pnp.c
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add struct ide_host which keeps pointers to host's ports.
* Add ide_host_alloc[_all]() and ide_host_remove() helpers.
* Pass 'struct ide_host *host' instead of 'u8 *idx' to
ide_device_add[_all]() and rename it to ide_host_register[_all]().
* Convert host drivers and core code to use struct ide_host.
* Remove no longer needed ide_find_port().
* Make ide_find_port_slot() static.
* Unexport ide_unregister().
v2:
* Add missing 'struct ide_host *host' to macide.c.
v3:
* Fix build problem in pmac.c (s/ide_alloc_host/ide_host_alloc/)
(Noticed by Stephen Rothwell).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add struct ide_tp_ops for transport methods.
* Add 'const struct ide_tp_ops *tp_ops' to struct ide_port_info
and ide_hwif_t.
* Set the default hwif->tp_ops in ide_init_port_data().
* Set host driver specific hwif->tp_ops in ide_init_port().
* Export ide_exec_command(), ide_read_status(), ide_read_altstatus(),
ide_read_sff_dma_status(), ide_set_irq(), ide_tf_{load,read}()
and ata_{in,out}put_data().
* Convert host drivers and core code to use struct ide_tp_ops.
* Remove no longer needed default_hwif_transport().
* Cleanup ide_hwif_t from methods that are now in struct ide_tp_ops.
While at it:
* Use struct ide_port_info in falconide.c and q40ide.c.
* Rename ata_{in,out}put_data() to ide_{in,out}put_data().
v2:
* Fix missing convertion in ns87415.c.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add 'config' field to hw_regs_t and use it to set hwif->config_data in
ide_init_port_hw(), then convert ide_legacy_init_one() to use hw->config.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Filter out "default" transfer mode values (0x00 - default PIO mode,
0x01 - default PIO mode w/ IORDY disabled) in write handler for obsoleted
/proc/ide/hd?/settings:current_speed setting.
Allowing "default" transfer mode values is a dangerous thing to do as
we don't support programming controller to the "default" transfer mode
and devices often use different values for the default and maximum PIO
mode (i.e. PIO2 default and PIO4 maximum) so the controller will stay
programmed for higher PIO mode while device will use the lower PIO mode.
There is no functionality loss as by using special IOCTLs device can
still be programmed to "default" transfer modes (it is only useful for
debugging/testing purposes anyway).
* Remove no longer needed IDE_HFLAG_ABUSE_SET_DMA_MODE host flag, it was
previously used by few host drivers to program the controller to PIO0
timings for "default" transfer mode == 0x01 (although some host drivers
would program invalid PIO timings instead).
* Cleanup ide_set_xfer_rate() and add BUG_ON().
Suggested-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Lets remove dead Virtual DMA support for now so it doesn't clutter
core IDE code (it can be bring back when there is a need for it):
* Remove IDE_HFLAG_VDMA host flag.
* Remove ide_drive_t.vdma flag.
* cs5520.c: remove stale FIXMEs, cs5520_dma_host_set() and cs5520_dma_ops
(also there is no longer a need to set IDE_HFLAG_NO_ATAPI_DMA).
There should be no functional changes caused by this patch.
Cc: TAKADA Yoshihito <takada@mbf.nifty.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Remove no longer needed ->INB, ->OUTB and ->OUTBSYNC methods.
Then:
* Remove no longer used default_hwif_[mm]iops() and ide_[mm_]outbsync().
* Cleanup SuperIO handling in ns87415.c.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add ide_read_bcount_and_ireason() helper and use it instead of ->INB
in {cdrom_newpc,ide_pc}_intr().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add IDE_TFLAG_IN_FEATURE taskfile flag for reading Feature
register and handle it in ->tf_read.
* Convert ide_read_error() to use ->tf_read instead of ->INB,
then uninline and export it.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add ->set_irq method for setting nIEN bit of ATA Device Control
register and use it instead of ide_set_irq().
While at it:
* Use ->set_irq in init_irq() and do_reset1().
* Don't use HWIF() macro in ide_check_pm_state().
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Remove ide_read_altstatus() inline helper.
* Add ->read_altstatus method for reading ATA Alternate Status
register and use it instead of ->INB.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Remove ide_read_status() inline helper.
* Add ->read_status method for reading ATA Status register
and use it instead of ->INB.
While at it:
* Don't use HWGROUP() macro.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add ->exec_command method for writing ATA Command register
and use it instead of ->OUTBSYNC.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Factor out simplex handling from ide_pci_dma_base() to
ide_pci_check_simplex().
* Set hwif->dma_base early in ->init_dma method / ide_hwif_setup_dma()
and reset it in ide_init_port() if DMA initialization fails.
* Use ->read_sff_dma_status instead of ->INB in ide_pci_dma_base().
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Export sff_dma_ops and then remove ide_setup_dma().
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Use ->dma_base + offset instead of ->dma_{status,command}
and remove no longer needed ->dma_{status,command}.
While at it:
* Use ATA_DMA_* defines.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add ->read_sff_dma_status method for reading DMA Status register
and use it instead of ->INB.
While at it:
* Use inb() directly in ns87415.c::ns87415_dma_end().
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add 'hw_regs_t **hws' argument to ide_device_add[_all]() and convert
host drivers + ide_legacy_init_one() + ide_setup_pci_device[s]() to use
it instead of calling ide_init_port_hw() directly.
[ However if host has > 1 port we must still set hwif->chipset to hint
consecutive ide_find_port() call that the previous slot is occupied. ]
* Unexport ide_init_port_hw().
v2:
* Use defines instead of hard-coded values in buddha.c, gayle.c and q40ide.c.
(Suggested by Geert Uytterhoeven)
* Better patch description.
v3:
* Fix build problem in ide-cs.c. (Noticed by Stephen Rothwell)
There should be no functional changes caused by this patch.
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This patch removes the old kgdb reminants from ARCH=powerpc and
implements the new style arch specific stub for the common kgdb core
interface.
It is possible to have xmon and kgdb in the same kernel, but you
cannot use both at the same time because there is only one set of
debug hooks.
The arch specific kgdb implementation saves the previous state of the
debug hooks and restores them if you unconfigure the kgdb I/O driver.
Kgdb should have no impact on a kernel that has no kgdb I/O driver
configured.
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
This patch adds the ARCH=arm specific a kgdb backend, originally
written by Deepak Saxena <dsaxena@plexity.net> and George Davis
<gdavis@mvista.com>. Geoff Levand <geoffrey.levand@am.sony.com>,
Nicolas Pitre, Manish Lachwani, and Jason Wessel have contributed
various fixups here as well.
The KGDB patch makes one change to the core ARM architecture such that
the traps are initialized early for use with the debugger or other
subsystems.
[ mingo@elte.hu: small cleanups. ]
[ ben-linux@fluff.org: fixed early_trap_init ]
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Deepak Saxena <dsaxena@plexity.net>
Add support for the following operations to mlx4 when device firmware
supports them:
- Send with invalidate and local invalidate send queue work requests;
- Allocate/free fast register MRs;
- Allocate/free fast register MR page lists;
- Fast register MR send queue work requests;
- Local DMA L_Key.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This adds a hid usage that is reported by the N-Trig digitizer in the Dell
Latitude XT screen.
Signed-off-by: Rafi Rubin <rafi@seas.upenn.edu>
Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This is alternative implementation of sg content iterator introduced
by commit 83e7d317... from Pierre Ossman in next-20080716. As there's
already an sg iterator which iterates over sg entries themselves, name
this sg_mapping_iterator.
Slightly edited description from the original implementation follows.
Iteration over a sg list is not that trivial when you take into
account that memory pages might have to be mapped before being used.
Unfortunately, that means that some parts of the kernel restrict
themselves to directly accesible memory just to not have to deal with
the mess.
This patch adds a simple iterator system that allows any code to
easily traverse an sg list and not have to deal with all the details.
The user can decide to consume part of the iteration. Also, iteration
can be stopped and resumed later if releasing the kmap between
iteration steps is necessary. These features are useful to implement
piecemeal sg copying for interrupt drive PIO for example.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
declared arch_report_meminfo() in asm-x86/pgtable.h as it will be also accessible by fs/proc/proc_misc.c
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
declared do_page_fault() in asm-x86/trap.h for both X86_32 and X86_64
removed do_invalid_op declaration from mm/fault.c as it is already declared in asm-x86/trap.h
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
included <asm/smp.h> in mm/init_32.c for zap_low_mappings()
declared free_initmem() in asm-x86/page_XX.h
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
This driver supports the PCA9550, PCA9551, PCA9552, and PCA9553
LED driver chips.
Signed-off-by: Nate Case <ncase@xes-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
LED classdev core doesn't modify memory pointed by the default_trigger,
so mark it as const and we'll able to pass const char *s without getting
compiler warnings.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
NXP pca9532 is a LED dimmer/controller attached to i2c bus. It allows
attaching upto 16 leds which can either be on, off or dimmed and/or blinked
with the two PWM modulators available.
This driver is a "new-style" i2c driver that adheres to the driver model and
implements the led framework api. Since the leds connected to the driver are
platform specific, it is only useful when platform data is passed to the
driver to define what leds are connected to which pins.
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
This ifdefs are leftovers from the time as the driver was running
on a ppc.
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
/home/bigeasy/git/linux-2.6-ftrace/kernel/trace/trace.c: In function 'tracing_generic_entry_update':
/home/bigeasy/git/linux-2.6-ftrace/kernel/trace/trace.c:802: error: implicit declaration of function 'irqs_disabled_flags'
make[3]: *** [kernel/trace/trace.o] Error 1
/home/bigeasy/git/linux-2.6-ftrace/kernel/trace/ftrace.c: In function 'ftrace_list_func':
/home/bigeasy/git/linux-2.6-ftrace/kernel/trace/ftrace.c:61: error: implicit declaration of function 'read_barrier_depends'
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
the ff1 and bitrev opcode appears in ISA C and ISA A+ what isn't
supported by all plattforms. The assembly optimization is automaticly
enabled if the compiler understand the required cpu keyword.
My m5235 seems to boot and run fine so far.
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (82 commits)
ipw2200: Call netif_*_queue() interfaces properly.
netxen: Needs to include linux/vmalloc.h
[netdrvr] atl1d: fix !CONFIG_PM build
r6040: rework init_one error handling
r6040: bump release number to 0.18
r6040: handle RX fifo full and no descriptor interrupts
r6040: change the default waiting time
r6040: use definitions for magic values in descriptor status
r6040: completely rework the RX path
r6040: call napi_disable when puting down the interface and set lp->dev accordingly.
mv643xx_eth: fix NETPOLL build
r6040: rework the RX buffers allocation routine
r6040: fix scheduling while atomic in r6040_tx_timeout
r6040: fix null pointer access and tx timeouts
r6040: prefix all functions with r6040
rndis_host: support WM6 devices as modems
at91_ether: use netstats in net_device structure
sfc: Create one RX queue and interrupt per CPU package by default
sfc: Use a separate workqueue for resets
sfc: I2C adapter initialisation fixes
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc32: pass -m32 when building vmlinux.lds
sparc: Fixes the DRM layer build on sparc.
ide: merge <asm-sparc/ide_64.h> with <asm-sparc/ide_32.h>
ide: <asm-sparc/ide_64.h>: use __raw_{read,write}w()
ide: <asm-sparc/ide_32.h>: use __raw_{read,write}w()
ide: <asm-sparc/ide_64.h>: use %r0 for outw_be()
sparc64: Do not define BIO_VMERGE_BOUNDARY.
This patch adds to ioatdma and dca modules
support for Intel I/OAT DMA engine ver.3 (aka CB3 device).
The main features of I/OAT ver.3 are:
* 8 single channel DMA devices (8 channels total)
* 8 DCA providers, each can accept 2 requesters
* 8-bit TAG values and 32-bit extended APIC IDs
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Change icmp6_dst_gc to return the one value the caller cares about rather
than using call by reference.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FIB timer list is a trivial size structure, avoid indirection and just
put it in existing ns.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make more PCI PM core functionality available to drivers
* Export pci_pme_capable() so that it can be called directly by
drivers (for example, tg3 needs that).
* Move the state choosing part of pci_prepare_to_sleep() to a
separate function, pci_target_state(), that can be called directly
by drivers (for example, tg3 needs that).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
sctp_outq_flush() can now become static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
get_proc_net() can now become static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Consumers that want to re-use their QPs in new connections need to
know when the QP has exited the timewait state. Report the timewait
event through the rdma_cm.
Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add an RDMA_CM_EVENT_ADDR_CHANGE event can be used by rdma-cm
consumers that wish to have their RDMA sessions always use the same
links (eg <hca/port>) as the IP stack does. In the current code, this
does not happen when bonding is used and fail-over happened but the IB
link used by an already existing session is operating fine.
Use the netevent notification for sensing that a change has happened
in the IP stack, then scan the rdma-cm ID list to see if there is an
ID that is "misaligned" with respect to the IP stack, and deliver
RDMA_CM_EVENT_ADDR_CHANGE for this ID. The consumer can act on the
event or just ignore it.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The function header comments have to go with the functions
they are documenting, or things go horribly wrong when we
try to process them with the docbook tools.
Warning(include/linux/netdevice.h:1006): No description found for parameter 'dev_queue'
Warning(include/linux/netdevice.h:1033): No description found for parameter 'dev_queue'
Warning(include/linux/netdevice.h:1067): No description found for parameter 'dev_queue'
Warning(include/linux/netdevice.h:1093): No description found for parameter 'dev_queue'
Warning(include/linux/netdevice.h:1474): No description found for parameter 'txq'
Error(net/core/dev.c:1674): cannot understand prototype: 'u32 simple_tx_hashrnd; '
Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix crash due to missing debugctlmsr on AMD K6-3
x86: add PTE_FLAGS_MASK
x86: rename PTE_MASK to PTE_PFN_MASK
x86: fix pte_flags() to only return flags, fix lguest (updated)
x86: use setup_clear_cpu_cap with disable_apic, fix
x86: move the last Dprintk instance to pr_debug()
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
remove CONFIG_KMOD from core kernel code
remove CONFIG_KMOD from lib
remove CONFIG_KMOD from sparc64
rework try_then_request_module to do less in non-modular kernels
remove mention of CONFIG_KMOD from documentation
make CONFIG_KMOD invisible
modules: Take a shortcut for checking if an address is in a module
module: turn longs into ints for module sizes
Shrink struct module: CONFIG_UNUSED_SYMBOLS ifdefs
module: reorder struct module to save space on 64 bit builds
module: generic each_symbol iterator function
module: don't use stop_machine for waiting rmmod
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (49 commits)
powerpc: Fix build bug with binutils < 2.18 and GCC < 4.2
powerpc/eeh: Don't panic when EEH_MAX_FAILS is exceeded
fbdev: Teaches offb about palette on radeon r5xx/r6xx
powerpc/cell/edac: Log a syndrome code in case of correctable error
powerpc/cell: Add DMA_ATTR_WEAK_ORDERING dma attribute and use in Cell IOMMU code
powerpc: Indicate which oprofile counters to use while in compat mode
powerpc/boot: Change spaces to tabs
powerpc: Remove duplicate 6xx option in Kconfig
powerpc: Use PPC_LONG and PPC_LONG_ALIGN in lib/string.S
powerpc: Use PPC_LONG_ALIGN in uaccess.h
powerpc: Add a #define for aligning to a long-sized boundary
powerpc: Fix OF parsing of 64 bits PCI addresses
powerpc: Use WARN_ON(1) instead of __WARN()
powerpc: Fix support for latencytop
powerpc/ps3: Update ps3_defconfig
powerpc/ps3: Add a sub-match id to ps3_system_bus
powerpc: Add a 6xx defconfig
powerpc/dma: Use the struct dma_attrs in iommu code
powerpc/cell: Add support for power button of future IBM cell blades
powerpc/cell: Cleanup sysreset_hack for IBM cell blades
...
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (79 commits)
arm: bus_id -> dev_name() and dev_set_name() conversions
sparc64: fix up bus_id changes in sparc core code
3c59x: handle pci_name() being const
MTD: handle pci_name() being const
HP iLO driver
sysdev: Convert the x86 mce tolerant sysdev attribute to generic attribute
sysdev: Add utility functions for simple int/ulong variable sysdev attributes
sysdev: Pass the attribute to the low level sysdev show/store function
driver core: Suppress sysfs warnings for device_rename().
kobject: Transmit return value of call_usermodehelper() to caller
sysfs-rules.txt: reword API stability statement
debugfs: Implement debugfs_remove_recursive()
HOWTO: change email addresses of James in HOWTO
always enable FW_LOADER unless EMBEDDED=y
uio-howto.tmpl: use unique output names
uio-howto.tmpl: use standard copyright/legal markings
sysfs: don't call notify_change
sysdev: fix debugging statements in registration code.
kobject: should use kobject_put() in kset-example
kobject: reorder kobject to save space on 64 bit builds
...
The mdio_port, mdio_bit, mdc_port and mdc_bit fields in the
fs_mii_bb_platform_info structure are left-overs from the move to the Phy
Abstraction Layer subsystem. They are not used anymore and can be safely
removed.
Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Add control of hardware serial bit order between LSB first
(default/standard) and MSB first.
Signed-off-by: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some hardware needs to do break handling itself and may have partial
support only. Make break_ctl return an error code. Add a tty driver flag
so you can indicate driver hardware side break support.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
USB serial likes to use port->tty back pointers for the real work it does and
to do so without any actual locking. Unfortunately when you consider hangup
events, hangup/parallel reopen or even worse hangup followed by parallel close
events the tty->port and port->tty pointers are not guaranteed to be the same
as port->tty is the active tty while tty->port is the port the tty may or
may not still be attached to.
So rework the entire API to pass the tty struct. For console cases we need
to pass both for now. This shows up multiple drivers that immediately crash
with USB console some of which have been fixed in the process.
Longer term we need a proper tty as console abstraction
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch consolidates the header guard names which are also used
externally, i.e. in .c files.
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
This patch is the result of an automatic script that consolidates the
format of all the headers in include/asm-x86/.
The format:
1. No leading underscore. Names with leading underscores are reserved.
2. Pathname components are separated by two underscores. So we can
distinguish between mm_types.h and mm/types.h.
3. Everything except letters and numbers are turned into single
underscores.
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Maxim's MAX7301 is an SPI GPIO expander with 28 GPIOs. Note: MAX7301's
interrupt feature is not supported yet.
[akpm@linux-foundation.org: coding-style fixes]
[g.liakhovetski@pengutronix.de: Fix inaccuracies in comments, check spi_setup()
return code, mask off high byte in max7301_read()]
Signed-off-by: Juergen Beisert <j.beisert@pengutronix.de>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The Linux kernel puts the filename argument of execve() into the new
address space. Many developers are surprised to learn this. Those who
know and could use it, object "But it's not documented."
Those who want to use it dislike the expression
(char *)(1+ strlen(env[-1+ n_env]) + env[-1+ n_env])
because it requires locating the last original environment variable,
and assumes that the filename follows the characters.
This patch documents the insertion of the filename, and makes it easier
to find by adding a new tag AT_EXECFN in the ElfXX_auxv_t; see <elf.h>.
In many cases readlink("/proc/self/exe",) gives the same answer. But if
all the original pages get unmapped, then the kernel erases the symlink
for /proc/self/exe. This can happen when a program decompressor does a
good job of cleaning up after uncompressing directly to memory, so that
the address space of the target program looks the same as if compression
had never happened. One example is http://upx.sourceforge.net .
One notable use of the underlying concept (what path containED the
executable) is glibc expanding $ORIGIN in DT_RUNPATH. In practice for
the near term, it may be a good idea for user-mode code to use both
/proc/self/exe and AT_EXECFN as fall-back methods for each other.
/proc/self/exe can fail due to unmapping, AT_EXECFN can fail because it
won't be present on non-new systems. The auxvec or {AT_EXECFN}.d_val
also can get overwritten, although in nearly all cases this would be the
result of a bug.
The runtime cost is one NEW_AUX_ENT using two words of stack space. The
underlying value is maintained already as bprm->exec; setup_arg_pages()
in fs/exec.c slides it for stack_shift, etc.
Signed-off-by: John Reiser <jreiser@BitWagon.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Declare time_init() in asm-x86/time.h
Also did cleanup in asm-x86/timer.h :
timer_ack is only required for X86_32
int recalibrate_cpu_khz(void) is for X86_32
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Moved DECLARE_PER_CPU(int, cpu_number) from CONFIG_X86_32_SMP to CONFIG_X86_32
because cpu_number is required for both.
And include asm/smp.h in process_32.c
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Always compile request_module when the kernel allows modules.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This reworks try_then_request_module to only invoke the "lookup"
function "x" once when the kernel is not modular.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This shrinks module.o and each *.ko file.
And finally, structure members which hold length of module
code (four such members there) and count of symbols
are converted from longs to ints.
We cannot possibly have a module where 32 bits won't
be enough to hold such counts.
For one, module loading checks module size for sanity
before loading, so such insanely big module will fail
that test first.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
module.c and module.h conatains code for finding
exported symbols which are declared with EXPORT_UNUSED_SYMBOL,
and this code is compiled in even if CONFIG_UNUSED_SYMBOLS is not set
and thus there can be no EXPORT_UNUSED_SYMBOLs in modules anyway
(because EXPORT_UNUSED_SYMBOL(x) are compiled out to nothing then).
This patch adds required #ifdefs.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
reorder struct module to save space on 64 bit builds.
saves 1 cacheline_size (128 on default x86_64 & 64 on AMD
Opteron/athlon) when CONFIG_MODULE_UNLOAD=y.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
PTE_PFN_MASK was getting lonely, so I made it a friend.
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rusty, in his peevish way, complained that macros defining constants
should have a name which somewhat accurately reflects the actual
purpose of the constant.
Aside from the fact that PTE_MASK gives no clue as to what's actually
being masked, and is misleadingly similar to the functionally entirely
different PMD_MASK, PUD_MASK and PGD_MASK, I don't really see what the
problem is.
But if this patch silences the incessent noise, then it will have
achieved its goal (TODO: write test-case).
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
(Jeremy said:
rusty: use PTE_MASK
rusty: use PTE_MASK
rusty: use PTE_MASK
When I asked:
jsgf: does that include the NX flag?
He responded eloquently:
rusty: use PTE_MASK
rusty: use PTE_MASK
yes, it's the official constant of masking flags out of ptes
)
Change a15af1c9ea 'x86/paravirt: add
pte_flags to just get pte flags' removed lguest's private pte_flags()
in favor of a generic one.
Unfortunately, the generic one doesn't filter out the non-flags bits:
this results in lguest creating corrupt shadow page tables and blowing
up host memory.
Since noone is supposed to use the pfn part of pte_flags(), it seems
safest to always do the filtering.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-and-morning-tea-spilled-by: Ingo Molnar <mingo@elte.hu>
This changes the MTD core to handle pci_name() now returning a constant
string.
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This adds a new sysdev_ext_attribute that stores a pointer to the variable
it manages and some utility functions/macro to easily use them.
Previously all users wrote custom macros to generate show/store
functions for each variable, with this it is possible to avoid
that in many cases.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This allow to dynamically generate attributes and share show/store
functions between attributes. Right now most attributes are generated
by special macros and lots of duplicated code. With the attribute
passed it's instead possible to attach some data to the attribute
and then use that in shared low level functions to do different things.
I need this for the dynamically generated bank attributes in the x86
machine check code, but it'll allow some further cleanups.
I converted all users in tree to the new show/store prototype. It's a single
huge patch to avoid unbisectable sections.
Runtime tested: x86-32, x86-64
Compiled only: ia64, powerpc
Not compile tested/only grep converted: sh, arm, avr32
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
driver core: Suppress sysfs warnings for device_rename().
Renaming network devices to an already existing name is not
something we want sysfs to print a scary warning for, since the
callers can deal with this correctly. So let's introduce
sysfs_create_link_nowarn() which gets rid of the common warning.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
debugfs_remove_recursive() will remove a dentry and all its children.
Drivers can use this to zap their whole debugfs tree so that they don't
need to keep track of every single debugfs dentry they created.
It may fail to remove the whole tree in certain cases:
sh-3.2# rmmod atmel-mci < /sys/kernel/debug/mmc0/ios/clock
mmc0: card b368 removed
atmel_mci atmel_mci.0: Lost dma0chan1, falling back to PIO
sh-3.2# ls /sys/kernel/debug/mmc0/
ios
But I'm not sure if that case can be handled in any sane manner.
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Pierre Ossman <drzeus-list@drzeus.cx>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
reorder kobject to save space on 64 bit builds.
shrinks from 72 to 64 bytes & moves allocated kobject to a smaller
slab.
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Sometimes it is necessary to enable/disable the interrupt of a UIO device
from the userspace part of the driver. With this patch, the UIO kernel driver
can implement an "irqcontrol()" function that does this. Userspace can write
an s32 value to /dev/uioX (usually 0 or 1 to turn the irq off or on). The
UIO core will then call the driver's irqcontrol function.
Signed-off-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com>
Acked-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is no such thing as a "device id size" in the driver core, so
remove the define and fix up any users of this odd define in the rest of
the kernel.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There is no such thing as a "device name size" in the driver core, so
remove the define and fix up any users of this odd define in the rest of
the kernel.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Kobjects do not have a limit in name size since a while, so stop
pretending that they do.
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This adds the infrastructure to properly handle lockdep issues when the
internal class semaphore is changed to a mutex.
Matthew wrote the original patch, and Greg fixed it up to work properly
with the class_create() function.
From: Matthew Wilcox <matthew@wil.cx>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Dave Young <hidave.darkstar@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This moves the portions of struct class that are dynamic (kobject and
lock and lists) out of the main structure and into a dynamic, private,
structure.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This mirrors the functionality that driver_find_device has as well.
We add a start variable, and all callers of the function are fixed up at
the same time.
The block layer will be using this new functionality in a follow-on
patch.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This mirrors the functionality that driver_for_each_device has as well.
We add a start variable, and all callers of the function are fixed up at
the same time.
The block layer will be using this new functionality in a follow-on
patch.
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Now that device_create() has been audited, rename things back to the
original call to be sane.
Keep the device_create_drvdata macro around to make merges easier.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
There are no more users of this, and it is racy. Use
device_create_drvdata() or device_create_vargs() instead.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Why?:
There are occasions where userspace would like to access sysfs
attributes for a device but it may not know how sysfs has named the
device or the path. For example what is the sysfs path for
/dev/disk/by-id/ata-ST3160827AS_5MT004CK? With this change a call to
stat(2) returns the major:minor then userspace can see that
/sys/dev/block/8:32 links to /sys/block/sdc.
What are the alternatives?:
1/ Add an ioctl to return the path: Doable, but sysfs is meant to reduce
the need to proliferate ioctl interfaces into the kernel, so this
seems counter productive.
2/ Use udev to create these symlinks: Also doable, but it adds a
udev dependency to utilities that might be running in a limited
environment like an initramfs.
3/ Do a full-tree search of sysfs.
[kay.sievers@vrfy.org: fix duplicate registrations]
[kay.sievers@vrfy.org: cleanup suggestions]
Cc: Neil Brown <neilb@suse.de>
Cc: Tejun Heo <htejun@gmail.com>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Reviewed-by: SL Baur <steve@xemacs.org>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Mark Lord <lkml@rtr.ca>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Introduce a new dma attriblue DMA_ATTR_WEAK_ORDERING to use weak ordering
on DMA mappings in the Cell processor. Add the code to the Cell's IOMMU
implementation to use this code.
Dynamic mappings can be weakly or strongly ordered on an individual basis
but the fixed mapping has to be either completely strong or completely weak.
This is currently decided by a kernel boot option (pass iommu_fixed=weak
for a weakly ordered fixed linear mapping, strongly ordered is the default).
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use the new PPC_LONG_ALIGN macro instead of passing an argument
to the asm for consistency.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add a #define for aligning to a long-sized boundary. It would be nice
to use sizeof(long) for this, but that requires generating the value
with asm-offsets.c, and asm-offsets.c includes asm-compat.h and we
descend into some sort of recursive include hell.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add sub match id for ps3 system bus so that two different system bus
devices can be connected to a shared device.
Signed-off-by: Masakazu Mokuno <mokuno@sm.sony.co.jp>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Update iommu_alloc() to take the struct dma_attrs and pass them on to
tce_build(). This change propagates down to the tce_build functions of
all the platforms.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch adds support for the power button on future IBM cell blades.
It actually doesn't shut down the machine. Instead it exposes an
input device /dev/input/event0 to userspace which sends KEY_POWER
if power button has been pressed.
haldaemon actually recognizes the button, so a plattform independent acpid
replacement should handle it correctly.
Signed-off-by: Christian Krafft <krafft@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Since commit 7560fa60fc (gpio: <linux/gpio.h>
and "no GPIO support here" stubs) drivers can use GPIOs if they're available,
but don't require them.
This patch actually enables this feature.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use __raw_{read,write}w() in __ide_{in,out}sw()
and remove no longer needed {in,out}w_be().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use __raw_{read,write}w() in __ide_{in,out}sw().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use %r0 for outw_be() to make it match __raw_writew().
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (100 commits)
usb-storage: revert DMA-alignment change for Wireless USB
USB: use reset_resume when normal resume fails
usb_gadget: composite cdc gadget fault handling
usb gadget: minor USBCV fix for composite framework
USB: Fix bug with byte order in isp116x-hcd.c fio write/read
USB: fix double kfree in ipaq in error case
USB: fix build error in cdc-acm for CONFIG_PM=n
USB: remove board-specific UP2OCR configuration from pxa27x-udc
USB: EHCI: Reconciling USB register differences on MPC85xx vs MPC83xx
USB: Fix pointer/int cast in USB devio code
usb gadget: g_cdc dependso on NET
USB: Au1xxx-usb: suspend/resume support.
USB: Au1xxx-usb: clean up ohci/ehci bus glue sources.
usbfs: don't store bad pointers in registration
usbfs: fix race between open and unregister
usbfs: simplify the lookup-by-minor routines
usbfs: send disconnect signals when device is unregistered
USB: Force unbinding of drivers lacking reset_resume or other methods
USB: ohci-pnx4008: I2C cleanups and fixes
USB: debug port converter does not accept more than 8 byte packets
...
This patch (as1024) takes care of a FIXME issue: Drivers that don't
have the necessary suspend, resume, reset_resume, pre_reset, or
post_reset methods will be unbound and their interface reprobed when
one of the unsupported events occurs.
This is made slightly more difficult by the fact that bind operations
won't work during a system sleep transition. So instead the code has
to defer the operation until the transition ends.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch renames the existing usb_reset_device in hub.c to
usb_reset_and_verify_device and renames the existing
usb_reset_composite_device to usb_reset_device. Also the new
usb_reset_and_verify_device does't need to be EXPORTED .
The idea of the patch is that external interface driver
should warn the other interfaces' driver of the same
device before and after reseting the usb device. One interface
driver shoud call _old_ usb_reset_composite_device instead of
_old_ usb_reset_device since it can't assume the device contains
only one interface. The _old_ usb_reset_composite_device
is safe for single interface device also. we rename the two
functions to make the change easily.
This patch is under guideline from Alan Stern.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
From the current implementation of usb_reset_composite_device
function, the iface parameter is no longer useful. This function
doesn't do something special for the iface usb_interface,compared
with other interfaces in the usb_device. So remove the parameter
and fix the related caller.
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1102) clarifies two points in the USB Gadget kerneldoc:
Request completion callbacks are always made with interrupts
disabled;
Device controllers may not support STALLing the status stage
of a control transfer after the data stage is over.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
General cleanup on ir-usb module. Introduced
a common header that could be used also on
usb gadget framework.
Lot's of cleanups and now using macros from the header
file.
Signed-off-by: Felipe Balbi <me@felipebalbi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Add <linux/usb/composite.h> interfaces for composite gadget drivers, and
basic implementation support behind it:
- struct usb_function ... groups one or more interfaces into a function
managed as one unit within a configuration, to which it's added by
usb_add_function().
- struct usb_configuration ... groups one or more such functions into
a configuration managed as one unit by a driver, to which it's added
by usb_add_config(). These operate at either high or full/low speeds
and at a given bMaxPower.
- struct usb_composite_driver ... groups one or more such configurations
into a gadget driver, which may be registered or unregistered.
- struct usb_composite_dev ... a usb_composite_driver manages this; it
wraps the usb_gadget exposed by the controller driver.
This also includes some basic kerneldoc.
How to use it (the short version): provide a usb_composite_driver with a
bind() that calls usb_add_config() for each of the needed configurations.
The configurations in turn have bind() calls, which will usb_add_function()
for each function required. Each function's bind() allocates resources
needed to perform its tasks, like endpoints; sometimes configurations will
allocate resources too.
Separate patches will convert most gadget drivers to this infrastructure.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Define three new descriptor manipulation utilities, for use when
setting up functions that may have multiple instances:
usb_copy_descriptors() to copy a vector of descriptors
usb_free_descriptors() to free the copy
usb_find_endpoint() to find a copied version
These will be used as follows. Functions will continue to have static
tables of descriptors they update, now used as __initdata templates.
When a function creates a new instance, it patches those tables with
relevant interface and string IDs, plus endpoint assignments. Then it
copies those morphed descriptors, associates the copies with the new
function instance, and records the endpoint descriptors to use when
activating the endpoints. When initialization is done, only the copies
remain in memory. The copies are freed on driver removal.
This ensures that each instance has descriptors which hold the right
instance-specific data. Two instances in the same configuration will
obviously never share the same interface IDs or use the same endpoints.
Instances in different configurations won't do so either, which means
this is slightly less memory-efficient in some cases.
This also includes a bugfix to the epautoconf code that shows up with
this usage model. It must replace the previous endpoint number when
updating the template descriptors, not just mask in a few more bits.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch removes CVS keywords that weren't updated for a long time
from comments.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This patch (as1091) changes the way usbcore handles interface
unbinding. If the interface's driver supports "soft" unbinding (a new
flag in the driver structure) then in-flight URBs are not cancelled
and endpoints are not disabled. Instead the driver is allowed to
continue communicating with the device (although of course it should
stop before its disconnect routine returns).
The purpose of this change is to allow drivers to do a clean shutdown
when they get unbound from a device that is still plugged in. Killing
all the URBs and disabling the endpoints before calling the driver's
disconnect method doesn't give the driver any control over what
happens, and it can leave devices in indeterminate states. For
example, when usb-storage unbinds it doesn't want to stop while in the
middle of transmitting a SCSI command.
The soft_unbind flag is added because in the past, a number of drivers
have experienced problems related to ongoing I/O after their disconnect
routine returned. Hence "soft" unbinding is made available only to
drivers that claim to support it.
The patch also replaces "interface_to_usbdev(intf)" with "udev" in a
couple of places, a minor simplification.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This changes usb_create_hcd() to be able to handle the fact that
pci_name() has changed to a constant string.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: (25 commits)
mmtimer: Push BKL down into the ioctl handler
[IA64] Remove experimental status of kdump
[IA64] Update ia64 mmr list for SGI uv
[IA64] Avoid overflowing ia64_cpu_to_sapicid in acpi_map_lsapic()
[IA64] adding parameter check to module_free()
[IA64] improper printk format in acpi-cpufreq
[IA64] pv_ops: move some functions in ivt.S to avoid lack of space.
[IA64] pvops: documentation on ia64/pv_ops
[IA64] pvops: add to hooks, pv_time_ops, for steal time accounting.
[IA64] pvops: add hooks, pv_irq_ops, to paravirtualized irq related operations.
[IA64] pvops: add hooks, pv_iosapic_ops, to paravirtualize iosapic.
[IA64] pvops: define initialization hooks, pv_init_ops, for paravirtualized environment.
[IA64] pvops: paravirtualize NR_IRQS
[IA64] pvops: paravirtualize entry.S
[IA64] pvops: paravirtualize ivt.S
[IA64] pvops: paravirtualize minstate.h.
[IA64] pvops: define paravirtualized instructions for native.
[IA64] pvops: preparation for paravirtulization of hand written assembly code.
[IA64] pvops: introduce pv_cpu_ops to paravirtualize privileged instructions.
[IA64] pvops: add an early setup hook for pv_ops.
...
As suggested by Dave:
This patch adds a function to get the driver name from a struct net_device,
and consequently uses this in the watchdog timeout handler to print as
part of the message.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IOMMU code and the block layer can split things
up using different rules, so this can't work reliably.
Signed-off-by: David S. Miller <davem@davemloft.net>
There are a couple of places where (P)Dprintk is used which is an old
compile time enabled printk wrapper. Convert it to the generic
pr_debug().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
netfilter: nf_conntrack_sctp: fix sparse warnings
netfilter: nf_nat_sip: c= is optional for session
netfilter: xt_TCPMSS: collapse tcpmss_reverse_mtu{4,6} into one function
netfilter: nfnetlink_log: send complete hardware header
netfilter: xt_time: fix time's time_mt()'s use of do_div()
netfilter: accounting rework: ct_extend + 64bit counters (v4)
netlink: add NLA_PUT_BE64 macro
netfilter: nf_nat_core: eliminate useless find_appropriate_src for IP_NAT_RANGE_PROTO_RANDOM
hdlcdrv: Fix CRC calculation.
Revert "pkt_sched: Make default qdisc nonshared-multiqueue safe."
net: In __netif_schedule() use WARN_ON instead of BUG_ON
net: Improve simple_tx_hash().
pkt_sched: Remove unused variable skb in dev_deactivate_queue function.
sunhme: Remove stop/wake TX queue calls in set-multicast-list handler.
ucc_geth: do not touch net queue in adjust_link phylib callback
gianfar: do not touch net queue in adjust_link phylib callback
atl1: Do not wake queue before queue has been started.
* 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (160 commits)
x86: remove extra calling to get ext cpuid level
x86: use setup_clear_cpu_cap() when disabling the lapic
KVM: fix exception entry / build bug, on 64-bit
x86: add unknown_nmi_panic kernel parameter
x86, VisWS: turn into generic arch, eliminate leftover files
x86: add ->pre_time_init to x86_quirks
x86: extend and use x86_quirks to clean up NUMAQ code
x86: introduce x86_quirks
x86: improve debug printout: add target bootmem range in early_res_to_bootmem()
Subject: devmem, x86: fix rename of CONFIG_NONPROMISC_DEVMEM
x86: remove arch_get_ram_range
x86: Add a debugfs interface to dump PAT memtype
x86: Add a arch directory for x86 under debugfs
x86: i386: reduce boot fixmap space
i386/xen: add proper unwind annotations to xen_sysenter_target
x86: reduce force_mwait visibility
x86: reduce forbid_dac's visibility
x86: fix two modpost warnings
x86: check function status in EDD boot code
x86_64: ia32_signal.c: remove signal number conversion
...
* 'for-linus' of git://neil.brown.name/md: (52 commits)
md: Protect access to mddev->disks list using RCU
md: only count actual openers as access which prevent a 'stop'
md: linear: Make array_size sector-based and rename it to array_sectors.
md: Make mddev->array_size sector-based.
md: Make super_type->rdev_size_change() take sector-based sizes.
md: Fix check for overlapping devices.
md: Tidy up rdev_size_store a bit:
md: Remove some unused macros.
md: Turn rdev->sb_offset into a sector-based quantity.
md: Make calc_dev_sboffset() return a sector count.
md: Replace calc_dev_size() by calc_num_sectors().
md: Make update_size() take the number of sectors.
md: Better control of when do_md_stop is allowed to stop the array.
md: get_disk_info(): Don't convert between signed and unsigned and back.
md: Simplify restart_array().
md: alloc_disk_sb(): Return proper error value.
md: Simplify sb_equal().
md: Simplify uuid_equal().
md: sb_equal(): Fix misleading printk.
md: Fix a typo in the comment to cmd_match().
...
This patch adds some fields to NFLOG to be able to send the complete
hardware header with all necessary informations.
It sends to userspace:
* the type of hardware link
* the lenght of hardware header
* the hardware header
Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Initially netfilter has had 64bit counters for conntrack-based accounting, but
it was changed in 2.6.14 to save memory. Unfortunately in-kernel 64bit counters are
still required, for example for "connbytes" extension. However, 64bit counters
waste a lot of memory and it was not possible to enable/disable it runtime.
This patch:
- reimplements accounting with respect to the extension infrastructure,
- makes one global version of seq_print_acct() instead of two seq_print_counters(),
- makes it possible to enable it at boot time (for CONFIG_SYSCTL/CONFIG_SYSFS=n),
- makes it possible to enable/disable it at runtime by sysctl or sysfs,
- extends counters from 32bit to 64bit,
- renames ip_conntrack_counter -> nf_conn_counter,
- enables accounting code unconditionally (no longer depends on CONFIG_NF_CT_ACCT),
- set initial accounting enable state based on CONFIG_NF_CT_ACCT
- removes buggy IPCT_COUNTER_FILLING event handling.
If accounting is enabled newly created connections get additional acct extend.
Old connections are not changed as it is not possible to add a ct_extend area
to confirmed conntrack. Accounting is performed for all connections with
acct extend regardless of a current state of "net.netfilter.nf_conntrack_acct".
Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add NLA_PUT_BE64 macro required for 64bit counters in netfilter
Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce a bvec merge function for device mapper devices
for dynamic size restrictions.
This code ensures the requested biovec lies within a single
target and then calls a target-specific function to check
against any constraints imposed by underlying devices.
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
-tip testing found this build bug:
arch/x86/kvm/built-in.o:(.text.fixup+0x1): relocation truncated to fit: R_X86_64_32 against `.text'
arch/x86/kvm/built-in.o:(.text.fixup+0xb): relocation truncated to fit: R_X86_64_32 against `.text'
arch/x86/kvm/built-in.o:(.text.fixup+0x15): relocation truncated to fit: R_X86_64_32 against `.text'
arch/x86/kvm/built-in.o:(.text.fixup+0x1f): relocation truncated to fit: R_X86_64_32 against `.text'
arch/x86/kvm/built-in.o:(.text.fixup+0x29): relocation truncated to fit: R_X86_64_32 against `.text'
Introduced by commit 4ecac3fd. The problem is that 'push' will default
to 32-bit, which is not wide enough as a fixup address. (and which would
crash on any real fixup event even if it was wide enough)
Introduce KVM_EX_PUSH to get the proper address push width on 64-bit too.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
All modifications and most access to the mddev->disks list are made
under the reconfig_mutex lock. However there are three places where
the list is walked without any locking. If a reconfig happens at this
time, havoc (and oops) can ensue.
So use RCU to protect these accesses:
- wrap them in rcu_read_{,un}lock()
- use list_for_each_entry_rcu
- add to the list with list_add_rcu
- delete from the list with list_del_rcu
- delay the 'free' with call_rcu rather than schedule_work
Note that export_rdev did a list_del_init on this list. In almost all
cases the entry was not in the list anymore so it was a no-op and so
safe. It is no longer safe as after list_del_rcu we may not touch
the list_head.
An audit shows that export_rdev is called:
- after unbind_rdev_from_array, in which case the delete has
already been done,
- after bind_rdev_to_array fails, in which case the delete isn't needed.
- before the device has been put on a list at all (e.g. in
add_new_disk where reading the superblock fails).
- and in autorun devices after a failure when the device is on a
different list.
So remove the list_del_init call from export_rdev, and add it back
immediately before the called to export_rdev for that last case.
Note also that ->same_set is sometimes used for lists other than
mddev->list (e.g. candidates). In these cases rcu is not needed.
Signed-off-by: NeilBrown <neilb@suse.de>
Open isn't the only thing that increments ->active. e.g. reading
/proc/mdstat will increment it briefly. So to avoid false positives
in testing for concurrent access, introduce a new counter that counts
just the number of times the md device it open.
Signed-off-by: NeilBrown <neilb@suse.de>
This patch renames the array_size field of struct mddev_s to array_sectors
and converts all instances to use units of 512 byte sectors instead of 1k
blocks.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
* 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: (51 commits)
nfsd: nfs4xdr.c do-while is not a compound statement
nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c
lockd: Pass "struct sockaddr *" to new failover-by-IP function
lockd: get host reference in nlmsvc_create_block() instead of callers
lockd: minor svclock.c style fixes
lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock
lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock
lockd: nlm_release_host() checks for NULL, caller needn't
file lock: reorder struct file_lock to save space on 64 bit builds
nfsd: take file and mnt write in nfs4_upgrade_open
nfsd: document open share bit tracking
nfsd: tabulate nfs4 xdr encoding functions
nfsd: dprint operation names
svcrdma: Change WR context get/put to use the kmem cache
svcrdma: Create a kmem cache for the WR contexts
svcrdma: Add flush_scheduled_work to module exit function
svcrdma: Limit ORD based on client's advertised IRD
svcrdma: Remove unused wait q from svcrdma_xprt structure
svcrdma: Remove unneeded spin locks from __svc_rdma_free
svcrdma: Add dma map count and WARN_ON
...
* 'for-linus' of git://git.o-hand.com/linux-mfd:
mfd: let asic3 use mem resource instead of bus_shift
mfd: remove DS1WM register definitions from asic3.h
mfd: add ASIC3_CONFIG_GPIO templates
mfd: fix the asic3 irq demux code
mfd: asic3 should depend on gpiolib
mfd: fix asic3 config array initialisation
mfd: move asic3 probe functions into __init section
mfd: Use uppercase only for asic3 macros and defines
mfd: use dev_* macros for asic3 debugging
mfd: New asic3 gpio configuration code
mfd: asic3 children platform data removal
mfd: asic3 gpiolib support
* 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (70 commits)
KVM: Adjust smp_call_function_mask() callers to new requirements
KVM: MMU: Fix potential race setting upper shadow ptes on nonpae hosts
KVM: x86 emulator: emulate clflush
KVM: MMU: improve invalid shadow root page handling
KVM: MMU: nuke shadowed pgtable pages and ptes on memslot destruction
KVM: Prefix some x86 low level function with kvm_, to avoid namespace issues
KVM: check injected pic irq within valid pic irqs
KVM: x86 emulator: Fix HLT instruction
KVM: Apply the kernel sigmask to vcpus blocked due to being uninitialized
KVM: VMX: Add ept_sync_context in flush_tlb
KVM: mmu_shrink: kvm_mmu_zap_page requires slots_lock to be held
x86: KVM guest: make kvm_smp_prepare_boot_cpu() static
KVM: SVM: fix suspend/resume support
KVM: s390: rename private structures
KVM: s390: Set guest storage limit and offset to sane values
KVM: Fix memory leak on guest exit
KVM: s390: dont allocate dirty bitmap
KVM: move slots_lock acquision down to vapic_exit
KVM: VMX: Fake emulate Intel perfctr MSRs
KVM: VMX: Fix a wrong usage of vmcs_config
...
The stab bits can't be referenced uniless the full
packet scheduler layer is enabled.
Reported by Stephen Rothwell.
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (1232 commits)
iucv: Fix bad merging.
net_sched: Add size table for qdiscs
net_sched: Add accessor function for packet length for qdiscs
net_sched: Add qdisc_enqueue wrapper
highmem: Export totalhigh_pages.
ipv6 mcast: Omit redundant address family checks in ip6_mc_source().
net: Use standard structures for generic socket address structures.
ipv6 netns: Make several "global" sysctl variables namespace aware.
netns: Use net_eq() to compare net-namespaces for optimization.
ipv6: remove unused macros from net/ipv6.h
ipv6: remove unused parameter from ip6_ra_control
tcp: fix kernel panic with listening_get_next
tcp: Remove redundant checks when setting eff_sacks
tcp: options clean up
tcp: Fix MD5 signatures for non-linear skbs
sctp: Update sctp global memory limit allocations.
sctp: remove unnecessary byteshifting, calculate directly in big-endian
sctp: Allow only 1 listening socket with SO_REUSEADDR
sctp: Do not leak memory on multiple listen() calls
sctp: Support ipv6only AF_INET6 sockets.
...
* 'for-linus' of git://www.jni.nu/cris:
[CRISv10] Clean up compressed/misc.c
[CRISv10] Correct whitespace damage.
[CRIS] Correct definition of subdirs for install_headers.
[CRIS] Correct image makefiles to allow using a separate OBJ-directory.
[CRIS] Build fixes for compressed and rescue images for v10 and v32:
It looks at least odd to apply spin_unlock to a mutex.
cris: compile fixes for 2.6.26-rc5
This patch contains the following possible cleanups:
- amiints.c: add a proper prototype for amiga_init_IRQ() in
include/asm-m68k/amigaints.h
- make the following needlessly global code static:
- config.c: amiga_model
- config.c: amiga_psfreq
- config.c: amiga_serial_console_write()
- #if 0 the following unused functions:
- config.c: amiga_serial_puts()
- config.c: amiga_serial_console_wait_key()
- config.c: amiga_serial_gets()
- remove the following unused variable:
- config.c: amiga_masterclock
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Unless I miss something that's code for a sparc machine even the sparc
code no longer supports that got copied to m68k when these files were
copied.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow no CPU/platform type for allnoconfig
- Provide a dummy value for FPSTATESIZE if no CPU type was selected
- Provide a dummy value for NR_IRQS if no platform type was selected
- Warn the user if no CPU or platform type was selected
Note: you still cannot build an allnoconfig kernel, as CONFIG_SWAP=n doesn't
build and we cannot easily fix that
(http://groups.google.com/group/linux.kernel/browse_thread/thread/d430c78b07e1827b)
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch removes CVS keywords that weren't updated for a long time
from comments.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'configfs-fixup-ptr-error' of git://oss.oracle.com/git/jlbec/linux-2.6:
configfs: Allow ->make_item() and ->make_group() to return detailed errors.
Revert "configfs: Allow ->make_item() and ->make_group() to return detailed errors."
Fix up the termios of the people who have not yet got with the program
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch cyclades to use the new tty_port structure
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch the stallion driver to use the tty_port structure
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch istallion to use the new tty_port structure
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch drivers using the old "generic serial" driver to use the tty_port
structures
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Switch the serial_core based drivers to use the new tty_port structure.
We can't quite use all of it yet because of the dynamically allocated
extras in the serial_core layer.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Every tty driver has its own concept of a port structure and because
they all differ we cannot extract commonality. Begin fixing this by
creating a structure drivers can elect to use so that over time we can
push fields into this and create commonality and then introduce common
methods.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the line disciplines towards a conventional ->ops arrangement. For
the moment the actual 'tty_ldisc' struct in the tty is kept as part of
the tty struct but this can then be changed if it turns out that when it
all settles down we want to refcount ldiscs separately to the tty.
Pull the ldisc code out of /proc and put it with our ldisc code.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Serial drivers using DMA (like the atmel_serial driver) tend to get very
confused when the xmit buffer is flushed and nobody told them. They
also tend to spew a lot of garbage since the DMA engine keeps running
after the buffer is flushed and possibly refilled with unrelated data.
This patch adds a new flush_buffer operation to the uart_ops struct,
along with a call to it from uart_flush_buffer() right after the xmit
buffer has been cleared. The driver can implement this in order to
syncronize its internal DMA state with the xmit buffer when the buffer
is flushed.
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The bus_shift parameter in platform_data is not needed
as we can tell the driver with the IOMEM_RESOURCE whether
the ASIC is located on a 16bit or 32bit memory bus.
The htc-egpio driver uses a more descriptive bus_width parameter,
but for drivers where the register map size fixed, we don't even
need this.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
There is a dedicated ds1wm driver, no need to duplicate this
information here.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
As ASIC3 GPIO alternate function configuration is expected to be similar
for several devices, it is convenient to define descriptive macros. This
patch is inspired by the PXA MFP configuration, the alternate functions
were observed on hx4700 and blueangel.
Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Let's be consistent and use uppercase only, for both macro and defines.
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The ASIC3 GPIO configuration code is a bit obscure and hardly readable.
This patch changes it so that it is now more readable and understandable,
by being more explicit.
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Platform devices should be dynamically allocated, and each supported
device should have its own platform data.
For now we just remove this buggy code.
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ASIC3 is, among other things, a GPIO extender. We should thus have it
supporting the current gpiolib API.
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
SYS_SUPPORTS_64BIT_KERNEL is enabled for RBTX4927/RBTX4938, but
actually it was broken for long time (or from the beginning). Now it
should work.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
include/asm-mips/mips-boards/sead{,int}.h are now obsolete.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
[...]
CC init/main.o
include/asm/bitops.h: In function `start_kernel':
include/asm/bitops.h:76: warning: asm operand 2 probably doesn't match
constraints
include/asm/bitops.h:76: warning: asm operand 2 probably doesn't match
constraints
include/asm/bitops.h:76: warning: asm operand 2 probably doesn't match
constraints
include/asm/bitops.h:76: error: impossible constraint in `asm'
include/asm/bitops.h:76: error: impossible constraint in `asm'
include/asm/bitops.h:76: error: impossible constraint in `asm'
make[1]: *** [init/main.o] Error 1
[...]
The build error is caused by the ages old gcc bug where gcc at the time of
analyzing the constraints is unable to figure out that an "i" constraint
actually can be satisfied and thus will abort unless an "r" is added to
the constraint. For the actual code generation gcc will only ever use the
"i" constraint.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch fixes the following sparse warnings:
>>>>>>>>>>>>>>>>>>
arch/mips/mm/page.c:284:16: warning: symbol
'build_clear_page' was not declared. Should it be static?
arch/mips/mm/page.c:426:16: warning: symbol 'build_copy_page'
was not declared. Should it be static?
>>>>>>>>>>>>>>>>>>
The fix is to add appropriate prototypes to the header
include/asm-mips/page.h.
Build-tested against Malta defconfig.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
While building the Malta defconfig, sparse spat the following
warnings:
>>>>>>>>>>>>>>>>>>
arch/mips/math-emu/kernel_linkage.c:31:6: warning: symbol
'fpu_emulator_init_fpu' was not declared. Should it be static?
arch/mips/math-emu/kernel_linkage.c:54:5: warning: symbol
'fpu_emulator_save_context' was not declared. Should it be
static?
arch/mips/math-emu/kernel_linkage.c:68:5: warning: symbol
'fpu_emulator_restore_context' was not declared. Should it be
static?
>>>>>>>>>>>>>>>>>>
This patch fixes these errors by adding the proper prototypes
to the include/asm-mips/fpu.h header, and actually using this
header in the sparse-spotted source file.
Build-tested with Malta defconfig.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The pcibios_max_latency variable is needlessly defined global, and this
patch makes it static.
Build-tested using malta_defconfig.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Currently, saa7134 is dependent of ir-kbd-i2c, since it uses a symbol that is
defined there. However, as this symbol is used only on saa7134, there's no
sense on keeping it defined there (or on ir-commons).
So, let's move it to saa7134 and remove one symbol for being exported.
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Those changes, together with some proper patches, will allow out-of-tree
compilation for for kernels < 2.6.19
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
This patch adds a simple platform camera device. Useful for testing
cameras with SoC camera host drivers. Only one single pixel format
and resolution combination is supported.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
This is V3 of the SuperH Mobile CEU soc_camera driver.
The CEU hardware block is configured in a transparent data fetch
mode, frames are captured from the attached camera and written to
physically contiguous memory buffers provided by the newly added
videobuf-dma-contig queue. Tested on sh7722 and sh7723 processors.
Changes since V2:
- remove SUPERH Kconfig dependency
- move sh_mobile_ceu.h to include/media
- add board callback support with enable_camera()/disable_camera()
- add support for declare_coherent_memory
- rework video memory limit
- more verbose error messages
Changes since V1:
- fixed the CEU driver to work with the newly updated patches
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
This is V3 of the physically contiguous videobuf queues patch.
Useful for hardware such as the SuperH Mobile CEU which doesn't
support scatter gatter bus mastering.
Since it may be difficult to allocate large chunks of physically
contiguous memory after some uptime due to fragmentation, this code
allocates memory using dma_alloc_coherent(). Architectures supporting
dma_declare_coherent_memory() can easily avoid fragmentation issues
by using dma_declare_coherent_memory() to force dma_alloc_coherent()
to allocate from a certain pre-allocated memory area.
Changes since V2
- use dma_handle for physical address
- use "scatter gather" instead of "scatter gatter"
Changes since V1:
- use dev_err() instead of pr_err()
- remember size in struct videobuf_dma_contig_memory
- keep struct videobuf_dma_contig_memory in .c file
- let videobuf_to_dma_contig() return dma_addr_t
- implement __videobuf_sync()
- return statements, white space and other minor fixes
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The SuperH Mobile CEU hardware supports 16-bit width bus,
so extend the soc_camera code with SOCAM_DATAWIDTH_16.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
This patch moves the spinlock handling from soc_camera.c to the actual
camera host driver. The spinlock_alloc/free callbacks are replaced with
code in init_videobuf(). So far all camera host drivers implement their
own spinlock_alloc/free methods anyway, and videobuf_queue_core_init()
BUGs on a NULL spinlock argument, so, new camera host drivers will not
forget to provide a spinlock when initialising their videobuf queues.
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Makes SoC camera videobuf independent. Includes all necessary changes for
PXA camera driver (currently the only driver using soc_camera in the mainline).
These changes are important for the future soc_camera based drivers.
Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The tvaudio driver is using "official" I2C device IDs for internal
purpose. There must be some historical reason behind this but anyway,
it shouldn't do that. As the stored values are never used, the easiest
way to fix the problem is simply to remove them altogether.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
I2C_HW_SMBUS_OVFX2 is referenced in ovcamchip_core.c, but no bus uses
this driver ID, so we can remove the reference. As far as I can see,
the Cypress FX2 webcam is handled by a different driver (dvb-usb).
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Also remove some blank lines that were used to split compat code at -devel
tree.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
videodev2: New pixfmt
pac207: Remove the specific decoding.
main: get_buff_size operation added for the subdriver.
Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The low (half) res modes of the spca561 are not spca561 compressed, but are
raw bayer, this patches fixes this and adds a PIX_FMT define for the GBRG
bayer format used by the spca561 in low res mode.
Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Annotations + stop saa7146_i2c from playing fast and loose with
reuse of ->cpu_addr for host-endian.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The cx18 can support transport streams with newer firmwares. Add a TS
capability to the generic cx2341x module.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Various ioctl debugging fixes and improvements:
- use %x rather than %d for control IDs and bitmask fields
- make two arrays const
- show the whole control array for the ext_ctrl ioctls
- print pix_fmt for V4L2_BUF_TYPE_VIDEO_OUTPUT
- show full type name rather than an integer
- fix CROPCAP debugging
- fix G/S_TUNER debugging
- show error code in case of an error
- other small cleanups
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
A number of V4L drivers have a mod param to specify their preferred minors.
This is because it is often desirable for applications to have a static /dev
name for a particular device. However, using minors has several disadvantages:
1) the requested minor may already be taken
2) using a mod param is driver specific
3) it requires every driver to add a param
4) requires configuration by hand
This patch introduces an "index" attribute that when combined with udev rules
can create static device paths like this:
/dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video0
/dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video1
/dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video2
$ ls -la /dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video0
lrwxrwxrwx 1 root root 12 2008-04-28 00:02 /dev/v4l/by-path/pci-0000:00:1d.2-usb-0:1:1.0-video0 -> ../../video1
These paths are steady across reboots and should be resistant to rearranging
across Kernel versions.
video_register_device_index is available to drivers to request a
specific index number.
Signed-off-by: Brandon Philips <bphilips@suse.de>
Signed-off-by: Kees Cook <kees@outflux.net>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The naming for the callbacks that handle the VIDIOC_ENUM_FMT and
VIDIOC_S/G/TRY_FMT ioctls was very confusing. Renamed it to match
the v4l2_buf_type name.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
There was no vidioc_try_fmt_sliced_vbi_output, instead vidioc_try_fmt_vbi_output
was reused.
The VIDIOC_ENUMOUTPUT handling was missing altogether, even though the callback
existed.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
The default videodev behavior for VIDIOC_G_STD is not correct for all devices.
Add a new callback that drivers can use instead.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Flush the shadow mmu before removing regions to avoid stale entries.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
While doing some tests with our lcrash implementation I have seen a
naming conflict with prefix_info in kvm_host.h vs. addrconf.h
To avoid future conflicts lets rename private definitions in
asm/kvm_host.h by adding the kvm_s390 prefix.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Instead of prefetching all segment bases before emulation, read them at the
last moment. Since most of them are unneeded, we save some cycles on
Intel machines where this is a bit expensive.
Signed-off-by: Avi Kivity <avi@qumranet.com>
rip relative decoding is relative to the instruction pointer of the next
instruction; by moving address adjustment until after decoding is complete,
we remove the need to determine the instruction size.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Currently kvmtrace is not portable. This will prevent from copying a
trace file from big-endian target to little-endian workstation for analysis.
In the patch, kernel outputs metadata containing a magic number to trace
log, and changes 64-bit words to be u64 instead of a pair of u32s.
Signed-off-by: Tan Li <li.tan@intel.com>
Acked-by: Jerone Young <jyoung5@us.ibm.com>
Acked-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch enables coalesced MMIO for ia64 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.
[akpm: fix compile error on ia64]
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch enables coalesced MMIO for powerpc architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch enables coalesced MMIO for x86 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch adds all needed structures to coalesce MMIOs.
Until an architecture uses it, it is not compiled.
Coalesced MMIO introduces two ioctl() to define where are the MMIO zones that
can be coalesced:
- KVM_REGISTER_COALESCED_MMIO registers a coalesced MMIO zone.
It requests one parameter (struct kvm_coalesced_mmio_zone) which defines
a memory area where MMIOs can be coalesced until the next switch to
user space. The maximum number of MMIO zones is KVM_COALESCED_MMIO_ZONE_MAX.
- KVM_UNREGISTER_COALESCED_MMIO cancels all registered zones inside
the given bounds (bounds are also given by struct kvm_coalesced_mmio_zone).
The userspace client can check kernel coalesced MMIO availability by asking
ioctl(KVM_CHECK_EXTENSION) for the KVM_CAP_COALESCED_MMIO capability.
The ioctl() call to KVM_CAP_COALESCED_MMIO will return 0 if not supported,
or the page offset where will be stored the ring buffer.
The page offset depends on the architecture.
After an ioctl(KVM_RUN), the first page of the KVM memory mapped points to
a kvm_run structure. The offset given by KVM_CAP_COALESCED_MMIO is
an offset to the coalesced MMIO ring expressed in PAGE_SIZE relatively
to the address of the start of th kvm_run structure. The MMIO ring buffer
is defined by the structure kvm_coalesced_mmio_ring.
[akio: fix oops during guest shutdown]
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Modify member in_range() of structure kvm_io_device to pass length and the type
of the I/O (write or read).
This modification allows to use kvm_io_device with coalesced MMIO.
Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Prefixes functions that will be exported with kvm_.
We also prefixed set_segment() even if it still static
to be coherent.
signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
Add emulation for the memory type range registers, needed by VMware esx 3.5,
and by pci device assignment.
Signed-off-by: Avi Kivity <avi@qumranet.com>
KVM turns off hardware virtualization extensions during reboot, in order
to disassociate the memory used by the virtualization extensions from the
processor, and in order to have the system in a consistent state.
Unfortunately virtual machines may still be running while this goes on,
and once virtualization extensions are turned off, any virtulization
instruction will #UD on execution.
Fix by adding an exception handler to virtualization instructions; if we get
an exception during reboot, we simply spin waiting for the reset to complete.
If it's a true exception, BUG() so we can have our stack trace.
Signed-off-by: Avi Kivity <avi@qumranet.com>
The KVM MMU tries to detect when a speculative pte update is not actually
used by demand fault, by checking the accessed bit of the shadow pte. If
the shadow pte has not been accessed, we deem that page table flooded and
remove the shadow page table, allowing further pte updates to proceed
without emulation.
However, if the pte itself points at a page table and only used for write
operations, the accessed bit will never be set since all access will happen
through the emulator.
This is exactly what happens with kscand on old (2.4.x) HIGHMEM kernels.
The kernel points a kmap_atomic() pte at a page table, and then
proceeds with read-modify-write operations to look at the dirty and accessed
bits. We get a false flood trigger on the kmap ptes, which results in the
mmu spending all its time setting up and tearing down shadows.
Fix by setting the shadow accessed bit on emulated accesses.
Signed-off-by: Avi Kivity <avi@qumranet.com>
To distinguish between real page faults and nested page faults they should be
traced as different events. This is implemented by this patch.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
random uvesafb failures were reported against Gentoo:
http://bugs.gentoo.org/show_bug.cgi?id=222799
and Mihai Moldovan bisected it back to:
> 8f4d37ec07 is first bad commit
> commit 8f4d37ec07
> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Fri Jan 25 21:08:29 2008 +0100
>
> sched: high-res preemption tick
Linus suspected it to be hrtick + vm86 interaction and observed:
> Btw, Peter, Ingo: I think that commit is doing bad things. They aren't
> _incorrect_ per se, but they are definitely bad.
>
> Why?
>
> Using random _TIF_WORK_MASK flags is really impolite for doing
> "scheduling" work. There's a reason that arch/x86/kernel/entry_32.S
> special-cases the _TIF_NEED_RESCHED flag: we don't want to exit out of
> vm86 mode unnecessarily.
>
> See the "work_notifysig_v86" label, and how it does that
> "save_v86_state()" thing etc etc.
Right, I never liked having to fiddle with those TIF flags. Initially I
needed it because the hrtimer base lock could not nest in the rq lock.
That however is fixed these days.
Currently the only reason left to fiddle with the TIF flags is remote
wakeups. We cannot program a remote cpu's hrtimer. I've been thinking
about using the new and improved IPI function call stuff to implement
hrtimer_start_on().
However that does require that smp_call_function_single(.wait=0) works
from interrupt context - /me looks at the latest series from Jens - Yes
that does seem to be supported, good.
Here's a stab at cleaning this stuff up ...
Mihai reported test success as well.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Mihai Moldovan <ionic@ionic.de>
Cc: Michal Januszewski <spock@gentoo.org>
Cc: Antonino Daplas <adaplas@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Rename CPUMASK_VAR --> CPUMASK_PTR (and simplify)
* Fix a semantic error in CPUMASK_ALLOC
* Add a bit of commentry to cpumask.h
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
so NUMAQ can use that to call numaq_pre_time_init()
This allows us to remove a NUMAQ special from arch/x86/kernel/setup.c.
(and paves the way to remove the NUMAQ subarch)
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
add these new x86_quirks methods:
int *mpc_record;
int (*mpc_apic_id)(struct mpc_config_processor *m);
void (*mpc_oem_bus_info)(struct mpc_config_bus *m, char *name);
void (*mpc_oem_pci_bus)(struct mpc_config_bus *m);
void (*smp_read_mpc_oem)(struct mp_config_oemtable *oemtable,
unsigned short oemsize);
... and move NUMAQ related mps table handling to numaq_32.c.
also move the call to smp_read_mpc_oem() to smp_read_mpc() directly.
Should not change functionality, albeit it would be nice to get it
tested on real NUMAQ as well ...
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
introduce x86_quirks array of boot-time quirk methods.
No change in functionality intended.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add size table functions for qdiscs and calculate packet size in
qdisc_enqueue().
Based on patch by Patrick McHardy
http://marc.info/?l=linux-netdev&m=115201979221729&w=2
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use sockaddr_storage{} for generic socket address storage
and ensures proper alignment.
Use sockaddr{} for pointers to omit several casts.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove redundant checks when setting eff_sacks and make the number of SACKs a
compile time constant. Now that the options code knows how many SACK blocks can
fit in the header, we don't need to have the SACK code guessing at it.
Signed-off-by: Adam Langley <agl@imperialviolet.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This should fix the following bugs:
* Connections with MD5 signatures produce invalid packets whenever SACK
options are included
* MD5 signatures are counted twice in the MSS calculations
Behaviour changes:
* A SYN with MD5 + SACK + TS elicits a SYNACK with MD5 + SACK
This is because we can't fit any SACK blocks in a packet with MD5 + TS
options. There was discussion about disabling SACK rather than TS in
order to fit in better with old, buggy kernels, but that was deemed to
be unnecessary.
* SYNs with MD5 don't include a TS option
See above.
Additionally, it removes a bunch of duplicated logic for calculating options,
which should help avoid these sort of issues in the future.
Signed-off-by: Adam Langley <agl@imperialviolet.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the MD5 code assumes that the SKBs are linear and, in the case
that they aren't, happily goes off and hashes off the end of the SKB and
into random memory.
Reported by Stephen Hemminger in [1]. Advice thanks to Stephen and Evgeniy
Polyakov. Also includes a couple of missed route_caps from Stephen's patch
in [2].
[1] http://marc.info/?l=linux-netdev&m=121445989106145&w=2
[2] http://marc.info/?l=linux-netdev&m=121459157816964&w=2
Signed-off-by: Adam Langley <agl@imperialviolet.org>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of the metrics (RTT, RTTVAR and RTAX_RTO_MIN) are stored in
kernel units (jiffies) and this leaks out through the netlink API to
user space where the units for jiffies are unknown.
This patches changes the kernel to convert to/from milliseconds. This
changes the ABI, but milliseconds seemed like the most natural unit
for these parameters. Values available via syscall in
/proc/net/rt_cache and netlink will be in milliseconds.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Idea is from Patrick McHardy.
Instead of managing the list of qdiscs on the device level, manage it
in the root qdisc of a netdev_queue. This solves all kinds of
visibility issues during qdisc destruction.
The way to iterate over all qdiscs of a netdev_queue is to visit
the netdev_queue->qdisc, and then traverse it's list.
The only special case is to ignore builting qdiscs at the root when
dumping or doing a qdisc_lookup(). That was not needed previously
because builtin qdiscs were not added to the device's qdisc_list.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a SW_DOCK switch to input.h. ACPI docks currently send their docking
status as a uevent, but not all docks are ACPI or correspond to a device.
In that case, it makes more sense to simply generate an input event on
docking or undocking.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Add a new switch type to the input API for reporting microphone
insertion. This will be used by the ALSA jack reporting API.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
The u32_list is just an indirect way of maintaining a reference
to a U32 node on a per-qdisc basis.
Just add an explicit node pointer for u32 to struct Qdisc an do
away with this global list.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new sockopt to reserve some headroom in the mmaped ring frames in
front of the packet payload. This can be used f.i. when the VLAN header
needs to be (re)constructed to avoid moving the entire payload.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a directory for x86 arch under debugfs. Can be used to accumulate all
x86 specific debugfs files.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
As 256 entries are needed, aligning to a 256-entry boundary is
sufficient and still guarantees the single pte table requirement.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
It's not used anywhere outside its single referencing file.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* Provide a generic set of CPUMASK_ALLOC macros patterned after the
SCHED_CPUMASK_ALLOC macros. This is used where multiple cpumask_t
variables are declared on the stack to reduce the amount of stack
space required.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* This patch replaces the dangerous lvalue version of cpumask_of_cpu
with new cpumask_of_cpu_ptr macros. These are patterned after the
node_to_cpumask_ptr macros.
In general terms, if there is a cpumask_of_cpu_map[] then a pointer to
the cpumask_of_cpu_map[cpu] entry is used. The cpumask_of_cpu_map
is provided when there is a large NR_CPUS count, reducing
greatly the amount of code generated and stack space used for
cpumask_of_cpu(). The pointer to the cpumask_t value is needed for
calling set_cpus_allowed_ptr() to reduce the amount of stack space
needed to pass the cpumask_t value.
If there isn't a cpumask_of_cpu_map[], then a temporary variable is
declared and filled in with value from cpumask_of_cpu(cpu) as well as
a pointer variable pointing to this temporary variable. Afterwards,
the pointer is used to reference the cpumask value. The compiler
will optimize out the extra dereference through the pointer as well
as the stack space used for the pointer, resulting in identical code.
A good example of the orthogonal usages is in net/sunrpc/svc.c:
case SVC_POOL_PERCPU:
{
unsigned int cpu = m->pool_to[pidx];
cpumask_of_cpu_ptr(cpumask, cpu);
*oldmask = current->cpus_allowed;
set_cpus_allowed_ptr(current, cpumask);
return 1;
}
case SVC_POOL_PERNODE:
{
unsigned int node = m->pool_to[pidx];
node_to_cpumask_ptr(nodecpumask, node);
*oldmask = current->cpus_allowed;
set_cpus_allowed_ptr(current, nodecpumask);
return 1;
}
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Declaring x86 traps under one hood.
Declaring x86 do_traps before defining them.
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The force_mwait variable iss defined either in
arch/x86/kernel/cpu/amd.c or in arch/x86/kernel/setup_64.c, but it is
only initialized and used in arch/x86/kernel/process.c. This patch
moves the declaration to arch/x86/kernel/process.c.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: michael@free-electrons.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jack Ren and Eric Miao tracked down the following long standing
problem in the NOHZ code:
scheduler switch to idle task
enable interrupts
Window starts here
----> interrupt happens (does not set NEED_RESCHED)
irq_exit() stops the tick
----> interrupt happens (does set NEED_RESCHED)
return from schedule()
cpu_idle(): preempt_disable();
Window ends here
The interrupts can happen at any point inside the race window. The
first interrupt stops the tick, the second one causes the scheduler to
rerun and switch away from idle again and we end up with the tick
disabled.
The fact that it needs two interrupts where the first one does not set
NEED_RESCHED and the second one does made the bug obscure and extremly
hard to reproduce and analyse. Kudos to Jack and Eric.
Solution: Limit the NOHZ functionality to the idle loop to make sure
that we can not run into such a situation ever again.
cpu_idle()
{
preempt_disable();
while(1) {
tick_nohz_stop_sched_tick(1); <- tell NOHZ code that we
are in the idle loop
while (!need_resched())
halt();
tick_nohz_restart_sched_tick(); <- disables NOHZ mode
preempt_enable_no_resched();
schedule();
preempt_disable();
}
}
In hindsight we should have done this forever, but ...
/me grabs a large brown paperbag.
Debugged-by: Jack Ren <jack.ren@marvell.com>,
Debugged-by: eric miao <eric.y.miao@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Fallout from commit 33185c504f ("x86:
merge signal_32/64.h")
Thanks to Dick Streefland who provided an useful testcase on
http://lkml.org/lkml/2008/3/17/205 (only applicable to 2.6.24.x), that
helped a lot as a deterministic way to bisect an issue that leaded to
this fix.
Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Cc: Roland McGrath <roland@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is v2, it's a little deference from v1 that I
had send to lkml.
use ACCESS_ONCE
use rcu_batch_after/rcu_batch_before for batch # comparison.
rcutorture test result:
(hotplugs: do cpu-online/offline once per second)
No CONFIG_NO_HZ: OK, 12hours
No CONFIG_NO_HZ, hotplugs: OK, 12hours
CONFIG_NO_HZ=y: OK, 24hours
CONFIG_NO_HZ=y, hotplugs: Failed.
(Failed also without my patch applied, exactly the same bug occurred,
http://lkml.org/lkml/2008/7/3/24)
v1's email thread:
http://lkml.org/lkml/2008/6/2/539
v1's description:
The code/algorithm of the implement of current callbacks-processing
is very efficient and technical. But when I studied it and I found
a disadvantage:
In multi-CPU systems, when a new RCU callback is being
queued(call_rcu[_bh]), this callback will be invoked after the grace
period for the batch with batch number = rcp->cur+2 has completed
very very likely in current implement. Actually, this callback can be
invoked after the grace period for the batch with
batch number = rcp->cur+1 has completed. The delay of invocation means
that latency of synchronize_rcu() is extended. But more important thing
is that the callbacks usually free memory, and these works are delayed
too! it's necessary for reclaimer to free memory as soon as
possible when left memory is few.
A very simple way can solve this problem:
a field(struct rcu_head::batch) is added to record the batch number for
the RCU callback. And when a new RCU callback is being queued, we
determine the batch number for this callback(head->batch = rcp->cur+1)
and we move this callback to rdp->donelist if we find
that head->batch <= rcp->completed when we process callbacks.
This simple way reduces the wait time for invocation a lot. (about
2.5Grace Period -> 1.5Grace Period in average in multi-CPU systems)
This is my algorithm. But I do not add any field for struct rcu_head
in my implement. We just need to memorize the last 2 batches and
their batch number, because these 2 batches include all entries that
for whom the grace period hasn't completed. So we use a special
linked-list rather than add a field.
Please see the comment of struct rcu_data.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Gautham Shenoy <ego@in.ibm.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
use a batch number(rcp->pending) instead of a flag(rcp->next_pending)
rcu_start_batch() need to change this flag, so mb()s is needed
for memory-access safe.
but(after this patch applied) rcu_start_batch() do not change
this batch number(rcp->pending), rcp->pending is managed by
__rcu_process_callbacks only, and troublesome mb()s are eliminated.
And codes look simpler and clearer.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Gautham Shenoy <ego@in.ibm.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Real-time code needs to know the number of cycles per second
on SGI UV. The information is provided via a run time BIOS
call. This patch provides the linux side of that interface.
This is the first of several run time BIOS calls to be defined
in uv/bios.h and bios_uv.c.
Note that BIOS_CALL() is just a stub for now. The bios
side is being worked on.
Signed-off-by: Russ Anderson <rja@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ricardo M. Correia spotted that the use of __fls() in fls64() did
not seem to make sense. In fact fls64()'s implementation is fine,
but the description of __fls() was wrong. Fix that.
Reported-by: "Ricardo M. Correia" <Ricardo.M.Correia@Sun.COM>
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
[ mingo@elte.hu: picked up this patch from Maciej, lets make apic=debug
print out more info - we had a lot of APIC changes ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
As a microoptimisation, make apic_verbosity unsigned. This will make
apic_printk(APIC_QUIET, ...) expand into just printk(...) with the
surrounding condition and a reference to apic_verbosity removed.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
it's separate functionality that deserves its own file.
This also prepares 32-bit memtest support.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
provide an empty partition_sched_domains() definition for the UP case:
include/linux/cpuset.h: In function ‘rebuild_sched_domains':
include/linux/cpuset.h:163: error: implicit declaration of function ‘partition_sched_domains'
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is based on Linus' idea of creating cpu_active_map that prevents
scheduler load balancer from migrating tasks to the cpu that is going
down.
It allows us to simplify domain management code and avoid unecessary
domain rebuilds during cpu hotplug event handling.
Please ignore the cpusets part for now. It needs some more work in order
to avoid crazy lock nesting. Although I did simplfy and unify domain
reinitialization logic. We now simply call partition_sched_domains() in
all the cases. This means that we're using exact same code paths as in
cpusets case and hence the test below cover cpusets too.
Cpuset changes to make rebuild_sched_domains() callable from various
contexts are in the separate patch (right next after this one).
This not only boots but also easily handles
while true; do make clean; make -j 8; done
and
while true; do on-off-cpu 1; done
at the same time.
(on-off-cpu 1 simple does echo 0/1 > /sys/.../cpu1/online thing).
Suprisingly the box (dual-core Core2) is quite usable. In fact I'm typing
this on right now in gnome-terminal and things are moving just fine.
Also this is running with most of the debug features enabled (lockdep,
mutex, etc) no BUG_ONs or lockdep complaints so far.
I believe I addressed all of the Dmitry's comments for original Linus'
version. I changed both fair and rt balancer to mask out non-active cpus.
And replaced cpu_is_offline() with !cpu_active() in the main scheduler
code where it made sense (to me).
Signed-off-by: Max Krasnyanskiy <maxk@qualcomm.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Gregory Haskins <ghaskins@novell.com>
Cc: dmitry.adamushko@gmail.com
Cc: pj@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There are already 7 of them - time to kill some duplicate code.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Proc temporary uses stats from init_net.
BTW, TCP_XXX_STATS are beautiful (w/o do { } while (0) facing) again :)
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only structure declared within is the netns_mib, which will
carry all our mibs within. I didn't put the mibs in the existing
netns_xxx structures to make it possible to mark this one as
properly aligned and get in a separate "read-mostly" cache-line.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use alternatives to select the workaround for the 11AP Pentium erratum
for the affected steppings on the fly rather than build time. Remove the
X86_GOOD_APIC configuration option and replace all the calls to
apic_write_around() with plain apic_write(), protecting accesses to the
ESR as appropriate due to the 3AP Pentium erratum. Remove
apic_read_around() and all its invocations altogether as not needed.
Remove apic_write_atomic() and all its implementing backends. The use of
ASM_OUTPUT2() is not strictly needed for input constraints, but I have
used it for readability's sake.
I had the feeling no one else was brave enough to do it, so I went ahead
and here it is. Verified by checking the generated assembly and tested
with both a 32-bit and a 64-bit configuration, also with the 11AP
"feature" forced on and verified with gdb on /proc/kcore to work as
expected (as an 11AP machines are quite hard to get hands on these days).
Some script complained about the use of "volatile", but apic_write() needs
it for the same reason and is effectively a replacement for writel(), so I
have disregarded it.
I am not sure what the policy wrt defconfig files is, they are generated
and there is risk of a conflict resulting from an unrelated change, so I
have left changes to them out. The option will get removed from them at
the next run.
Some testing with machines other than mine will be needed to avoid some
stupid mistake, but despite its volume, the change is not really that
intrusive, so I am fairly confident that because it works for me, it will
everywhere.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Adrian Bunk reported that enabling 4MB page size breaks the build.
The problem is that MAX_ORDER combined with the page shift exceeds the
SECTION_SIZE_BITS we use in asm-sparc64/sparsemem.h
There are several ways I suppose we could work around this. For one
we could define a CONFIG_FORCE_MAX_ZONEORDER to decrease MAX_ORDER in
these higher page size cases.
But I also know that these page size cases are broken wrt. TLB miss
handling especially on pre-hypervisor systems, and there isn't an easy
way to fix that.
These options were meant to be fun experimental hacks anyways, and
only 8K and 64K make any sense to support.
So remove 512K and 4M base page size support. Of course, we still
support these page sizes for huge pages.
Signed-off-by: David S. Miller <davem@davemloft.net>
With this commit all sparc64 header files are moved to asm-sparc.
The remaining files (71 files) were too different to be trivially
merged so divide them up in a _32.h and a _64.h file which
are both included from the file with no bit size.
The following script were used:
cd include
FILES=`wc -l asm-sparc64/*h | grep -v '^ 1' | cut -b 20-`
for FILE in ${FILES}; do
echo $FILE:
BASE=`echo $FILE | cut -d '.' -f 1`
FN32=${BASE}_32.h
FN64=${BASE}_64.h
GUARD=___ASM_SPARC_`echo $BASE | tr '-' '_' | tr [:lower:] [:upper:]`_H
git mv asm-sparc/$FILE asm-sparc/$FN32
git mv asm-sparc64/$FILE asm-sparc/$FN64
echo git mv done
printf "#ifndef %s\n" $GUARD > asm-sparc/$FILE
printf "#define %s\n" $GUARD >> asm-sparc/$FILE
printf "#if defined(__sparc__) && defined(__arch64__)\n" >> asm-sparc/$FILE
printf "#include <asm-sparc/%s>\n" $FN64 >> asm-sparc/$FILE
printf "#else\n" >> asm-sparc/$FILE
printf "#include <asm-sparc/%s>\n" $FN32 >> asm-sparc/$FILE
printf "#endif\n" >> asm-sparc/$FILE
printf "#endif\n" >> asm-sparc/$FILE
git add asm-sparc/$FILE
echo new file done
printf "#include <asm-sparc/%s>\n" $FILE > asm-sparc64/$FILE
git add asm-sparc64/$FILE
echo sparc64 file done
done
The guard contains three '_' to avoid conflict with existing guards.
In additing the two Kbuild files are emptied to avoid breaking
headers_* targets.
We will reintroduce the exported header files when the necessary
kbuild changes are merged.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
A manual inspection revealed that the following headerfiles
contained only trivial differences:
hw_irq.h idprom.h kmap_types.h kvm.h spinlock_types.h sunbpp.h unaligned.h
The only noteworthy change are that sparc64 had a volatile
qualifer that sparc missed in spinlock_types.h.
In addition a few comments were updated.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Used the following script to find equal header files:
SPARC64=`ls asm-sparc64`
for FILE in ${SPARC64}; do
cmp -s asm-sparc/$FILE asm-sparc64/$FILE;
if [ $? = 0 ]; then
printf "#include <asm-sparc/%s>\n" $FILE > asm-sparc64/$FILE
fi
done
A few of the equal files are a simple include from
asm-generic, but by including the file from asm-sparc
we know they are equal for sparc and sparc64.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Used the following script to copy the files:
cd include
set -e
SPARC64=`ls asm-sparc64`
for FILE in ${SPARC64}; do
if [ -f asm-sparc/$FILE ]; then
echo $FILE exist in asm-sparc
else
git mv asm-sparc64/$FILE asm-sparc/$FILE
printf "#include <asm-sparc/$FILE>\n" > asm-sparc64/$FILE
git add asm-sparc64/$FILE
fi
done
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Joined the two files as they contain distinct definitions.
Inspired by patch from: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Adrian Bunk <bunk@kernel.org>
sparc64 exports openprom.h to userspace so let sparc follow
the example.
As openprom.h pulled in another not-for-export vaddrs.h header
file it required a few changes to fix the build.
The definition af VMALLOC_* were moved to pgtable as this is
where sparc64 has them.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Copy content of sparc64 file to sparc file.
There is only minimal possibilities for further unification.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Bring the commit e55c57e0b5
("[SPARC64]: Report any user access faults in termios accessors")
over to sparc when unifying the two files.
The diff was manually inspected to contain no
other relevant changes.
This unification therefore changes functionality of sparc.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
The type of tcflag_t differs from 32 and 64 bit.
For 32 bit it is long
For 64 bit it is int
Altough these have same size then I was not sure that
it was OK to change the 64 bit version to long as this
is part of the ABI so it was made conditional.
:$ diff -u include/asm-sparc/termbits.h include/asm-sparc64/termbits.h
:-- include/asm-sparc/termbits.h 2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/termbits.h 2008-06-13 06:42:07.000000000 +0200
:@@ -1,11 +1,11 @@
:-#ifndef _SPARC_TERMBITS_H
:-#define _SPARC_TERMBITS_H
:+#ifndef _SPARC64_TERMBITS_H
:+#define _SPARC64_TERMBITS_H
:
: #include <linux/posix_types.h>
:
: typedef unsigned char cc_t;
: typedef unsigned int speed_t;
:-typedef unsigned long tcflag_t;
:+typedef unsigned int tcflag_t;
:
: #define NCC 8
: struct termio {
:@@ -102,7 +102,7 @@
: #define IXANY 0x00000800
: #define IXOFF 0x00001000
: #define IMAXBEL 0x00002000
:-#define IUTF8 0x00004000
:+#define IUTF8 0x00004000
:
: /* c_oflag bits */
: #define OPOST 0x00000001
:@@ -171,7 +171,6 @@
: #define HUPCL 0x00000400
: #define CLOCAL 0x00000800
: #define CBAUDEX 0x00001000
:-/* We'll never see these speeds with the Zilogs, but for completeness... */
: #define BOTHER 0x00001000
: #define B57600 0x00001001
: #define B115200 0x00001002
:@@ -199,7 +198,7 @@
: #define B3500000 0x00001012
: #define B4000000 0x00001013 */
: #define CIBAUD 0x100f0000 /* input baud rate (not used) */
:-#define CMSPAR 0x40000000 /* mark or space (stick) parity */
:+#define CMSPAR 0x40000000 /* mark or space (stick) parity */
: #define CRTSCTS 0x80000000 /* flow control */
:
: #define IBSHIFT 16 /* Shift from CBAUD to CIBAUD */
:@@ -258,4 +257,4 @@
: #define TCSADRAIN 1
: #define TCSAFLUSH 2
:
:-#endif /* !(_SPARC_TERMBITS_H) */
:+#endif /* !(_SPARC64_TERMBITS_H) */
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
RLIM_INFINITY differ from 32 and 64 bit.
The rest is equal.
:$ diff -u include/asm-sparc/resource.h include/asm-sparc64/resource.h
:-- include/asm-sparc/resource.h 2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/resource.h 2008-06-13 06:46:39.000000000 +0200
:@@ -1,11 +1,11 @@
: /*
: * resource.h: Resource definitions.
: *
:- * Copyright (C) 1995 David S. Miller (davem@caip.rutgers.edu)
:+ * Copyright (C) 1996 David S. Miller (davem@caip.rutgers.edu)
: */
:
:-#ifndef _SPARC_RESOURCE_H
:-#define _SPARC_RESOURCE_H
:+#ifndef _SPARC64_RESOURCE_H
:+#define _SPARC64_RESOURCE_H
:
: /*
: * These two resource limit IDs have a Sparc/Linux-specific ordering,
:@@ -14,13 +14,6 @@
: #define RLIMIT_NOFILE 6 /* max number of open files */
: #define RLIMIT_NPROC 7 /* max number of processes */
:
:-/*
:- * SuS says limits have to be unsigned.
:- * We make this unsigned, but keep the
:- * old value for compatibility:
:- */
:-#define RLIM_INFINITY 0x7fffffff
:-
: #include <asm-generic/resource.h>
:
:-#endif /* !(_SPARC_RESOURCE_H) */
:+#endif /* !(_SPARC64_RESOURCE_H) */
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
There were only a few trivial changes and a few additions
in the sparc64 variant of this file.
This patch copies the sparc64 specific bits to the sparc version
of fbio.h so they are equal. A later patch will merge the two.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Padding in the sembuf structure made conditional
as only 32 bit sparc did so.
:$ diff -u include/asm-sparc/sembuf.h include/asm-sparc64/sembuf.h
:-- include/asm-sparc/sembuf.h 2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/sembuf.h 2008-06-13 06:42:07.000000000 +0200
:@@ -1,21 +1,18 @@
:-#ifndef _SPARC_SEMBUF_H
:-#define _SPARC_SEMBUF_H
:+#ifndef _SPARC64_SEMBUF_H
:+#define _SPARC64_SEMBUF_H
:
: /*
:- * The semid64_ds structure for sparc architecture.
:+ * The semid64_ds structure for sparc64 architecture.
: * Note extra padding because this structure is passed back and forth
: * between kernel and user space.
: *
: * Pad space is left for:
:- * - 64-bit time_t to solve y2038 problem
:- * - 2 miscellaneous 32-bit values
:+ * - 2 miscellaneous 64-bit values
: */
:
: struct semid64_ds {
: struct ipc64_perm sem_perm; /* permissions .. see ipc.h */
:- unsigned int __pad1;
: __kernel_time_t sem_otime; /* last semop time */
:- unsigned int __pad2;
: __kernel_time_t sem_ctime; /* last change time */
: unsigned long sem_nsems; /* no. of semaphores in array */
: unsigned long __unused1;
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Padding from 32 bit sparc kept using preprocessor magic
:$ diff -u include/asm-sparc/msgbuf.h include/asm-sparc64/msgbuf.h
:-- include/asm-sparc/msgbuf.h 2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/msgbuf.h 2008-06-13 06:42:07.000000000 +0200
:@@ -7,17 +7,13 @@
: * between kernel and user space.
: *
: * Pad space is left for:
:- * - 64-bit time_t to solve y2038 problem
:- * - 2 miscellaneous 32-bit values
:+ * - 2 miscellaneous 64-bit values
: */
:
: struct msqid64_ds {
: struct ipc64_perm msg_perm;
:- unsigned int __pad1;
: __kernel_time_t msg_stime; /* last msgsnd time */
:- unsigned int __pad2;
: __kernel_time_t msg_rtime; /* last msgrcv time */
:- unsigned int __pad3;
: __kernel_time_t msg_ctime; /* last change time */
: unsigned long msg_cbytes; /* current number of bytes on queue */
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Trivial differenses in comments - used the version from sparc64
:$ diff -u include/asm-sparc/ioctls.h include/asm-sparc64/ioctls.h
:-- include/asm-sparc/ioctls.h 2008-06-13 08:46:29.000000000 +0200
:++ include/asm-sparc64/ioctls.h 2008-06-13 08:46:29.000000000 +0200
:@@ -1,5 +1,5 @@
:-#ifndef _ASM_SPARC_IOCTLS_H
:-#define _ASM_SPARC_IOCTLS_H
:+#ifndef _ASM_SPARC64_IOCTLS_H
:+#define _ASM_SPARC64_IOCTLS_H
:
: #include <asm/ioctl.h>
:
:@@ -22,7 +22,7 @@
:
: /* Note that all the ioctls that are not available in Linux have a
: * double underscore on the front to: a) avoid some programs to
:- * thing we support some ioctls under Linux (autoconfiguration stuff)
:+ * think we support some ioctls under Linux (autoconfiguration stuff)
: */
: /* Little t */
: #define TIOCGETD _IOR('t', 0, int)
:@@ -110,7 +110,7 @@
: #define TIOCSERGETLSR 0x5459 /* Get line status register */
: #define TIOCSERGETMULTI 0x545A /* Get multiport config */
: #define TIOCSERSETMULTI 0x545B /* Set multiport config */
:-#define TIOCMIWAIT 0x545C /* Wait input */
:+#define TIOCMIWAIT 0x545C /* Wait for change on serial input line(s) */
: #define TIOCGICOUNT 0x545D /* Read serial port inline interrupt counts */
:
: /* Kernel definitions */
:@@ -133,4 +133,4 @@
: #define TIOCPKT_NOSTOP 16
: #define TIOCPKT_DOSTOP 32
:
:-#endif /* !(_ASM_SPARC_IOCTLS_H) */
:+#endif /* !(_ASM_SPARC64_IOCTLS_H) */
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Copy was done using the following simple script:
set -e
SPARC64="h display7seg.h envctrl.h psrcompat.h pstate.h uctx.h utrap.h watchdog.h"
for FILE in ${SPARC64}; do
if [ -f asm-sparc/$FILE ]; then
echo $FILE exist in asm-sparc
fi
cat asm-sparc64/$FILE > asm-sparc/$FILE
printf "#include <asm-sparc/$FILE>\n" > asm-sparc64/$FILE
done
The name of the copied files are added to asm-sparc/Kbuild
to keep "make headers_check" functional.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
This seems to be left from the long gone AP1000 support.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have to have exclusive access to the given qdisc anyways, so
doing even more locking is superfluous.
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sch_tree_lock() lock the qdisc's root. All of the
users hold the RTNL semaphore and the root qdisc is not
changing.
Implement tbf_tree_{lock,unlock}() simply in terms of
sch_tree_{lock,unlock}().
Signed-off-by: David S. Miller <davem@davemloft.net>
When we have shared qdiscs, packets come out of the qdiscs
for multiple transmit queues.
Therefore it doesn't make any sense to schedule the transmit
queue when logically we cannot know ahead of time the TX
queue of the SKB that the qdisc->dequeue() will give us.
Just for sanity I added a BUG check to make sure we never
get into a state where the noop_qdisc is scheduled.
Signed-off-by: David S. Miller <davem@davemloft.net>
When code wants to lock the qdisc tree state, the logic
operation it's doing is locking the top-level qdisc that
sits of the root of the netdev_queue.
Add qdisc_root_lock() to represent this and convert the
easiest cases.
In order for this to work out in all cases, we have to
hook up the noop_qdisc to a dummy netdev_queue.
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently it is associated with a netdev_queue, but when we have
qdisc sharing that no longer makes any sense.
Signed-off-by: David S. Miller <davem@davemloft.net>
We liberate any dangling gso_skb during qdisc destruction.
It really only matters for the root qdisc. But when qdiscs
can be shared by multiple netdev_queue objects, we can't
have the gso_skb in the netdev_queue any more.
Signed-off-by: David S. Miller <davem@davemloft.net>
The only behavior change is that we do not drop packets under any
circumstances. If that is absolutely needed, we could easily add it
back.
With cleanups and help from Johannes Berg.
Signed-off-by: David S. Miller <davem@davemloft.net>
Devices or device layers can set this to control the queue selection
performed by dev_pick_tx().
This function runs under RCU protection, which allows overriding
functions to have some way of synchronizing with things like dynamic
->real_num_tx_queues adjustments.
This makes the spinlock prefetch in dev_queue_xmit() a little bit
less effective, but that's the price right now for correctness.
Signed-off-by: David S. Miller <davem@davemloft.net>
The private area of a netdev is now at a fixed offset once more.
Unfortunately, some assumptions that netdev_priv() == netdev->priv
crept back into the tree. In particular this happened in the
loopback driver. Make it use netdev->ml_priv.
Signed-off-by: David S. Miller <davem@davemloft.net>
This effectively "flips the switch" by making the core networking
and multiqueue-aware drivers use the new TX multiqueue structures.
Non-multiqueue drivers need no changes. The interfaces they use such
as netif_stop_queue() degenerate into an operation on TX queue zero.
So everything "just works" for them.
Code that really wants to do "X" to all TX queues now invokes a
routine that does so, such as netif_tx_wake_all_queues(),
netif_tx_stop_all_queues(), etc.
pktgen and netpoll required a little bit more surgery than the others.
In particular the pktgen changes, whilst functional, could be largely
improved. The initial check in pktgen_xmit() will sometimes check the
wrong queue, which is mostly harmless. The thing to do is probably to
invoke fill_packet() earlier.
The bulk of the netpoll changes is to make the code operate solely on
the TX queue indicated by by the SKB queue mapping.
Setting of the SKB queue mapping is entirely confined inside of
net/core/dev.c:dev_pick_tx(). If we end up needing any kind of
special semantics (drops, for example) it will be implemented here.
Finally, we now have a "real_num_tx_queues" which is where the driver
indicates how many TX queues are actually active.
With IGB changes from Jeff Kirsher.
Signed-off-by: David S. Miller <davem@davemloft.net>
This actually fixes a bug added by the RR scheduler changes. The
->bands and ->prio2band parameters were being set outside of the
sch_tree_lock() and thus could result in strange behavior and
inconsistencies.
It might be possible, in the new design (where there will be one qdisc
per device TX queue) to allow similar functionality via a TX hash
algorithm for RR but I really see no reason to export this aspect of
how these multiqueue cards actually implement the scheduling of the
the individual DMA TX rings and the single physical MAC/PHY port.
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no need for a feature bit for something that
can be tested by simply checking the TX queue count.
Signed-off-by: David S. Miller <davem@davemloft.net>
alloc_netdev_mq() now allocates an array of netdev_queue
structures for TX, based upon the queue_count argument.
Furthermore, all accesses to the TX queues are now vectored
through the netdev_get_tx_queue() and netdev_for_each_tx_queue()
interfaces. This makes it easy to grep the tree for all
things that want to get to a TX queue of a net device.
Problem spots which are not really multiqueue aware yet, and
only work with one queue, can easily be spotted by grepping
for all netdev_get_tx_queue() calls that pass in a zero index.
Signed-off-by: David S. Miller <davem@davemloft.net>
All callers of async_tx_sync_epilog have called async_tx_quiesce on the
depend_tx, so async_tx_sync_epilog need only call the callback to
complete the operation.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The configfs operations ->make_item() and ->make_group() currently
return a new item/group. A return of NULL signifies an error. Because
of this, -ENOMEM is the only return code bubbled up the stack.
Multiple folks have requested the ability to return specific error codes
when these operations fail. This patch adds that ability by changing the
->make_item/group() ops to return ERR_PTR() values. These errors are
bubbled up appropriately. NULL returns are changed to -ENOMEM for
compatibility.
Also updated are the in-kernel users of configfs.
This is a rework of reverted commit 11c3b79218.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Merge the GDT_ENTRY() macro between arch/x86/boot/pm.c and
arch/x86/kernel/acpi/sleep.c and put the new one in
<asm-x86/segment.h>.
While we're at it, correct the bitmasks for the limit and flags. The
new version relies on using ULL constants in order to cause type
promotion rather than explicit casts; this avoids having to include
<linux/types.h> in <asm-x86/segments.h>.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
[PATCH] ocfs2: fix oops in mmap_truncate testing
configfs: call drop_link() to cleanup after create_link() failure
configfs: Allow ->make_item() and ->make_group() to return detailed errors.
configfs: Fix failing mkdir() making racing rmdir() fail
configfs: Fix deadlock with racing rmdir() and rename()
configfs: Make configfs_new_dirent() return error code instead of NULL
configfs: Protect configfs_dirent s_links list mutations
configfs: Introduce configfs_dirent_lock
ocfs2: Don't snprintf() without a format.
ocfs2: Fix CONFIG_OCFS2_DEBUG_FS #ifdefs
ocfs2/net: Silence build warnings on sparc64
ocfs2: Handle error during journal load
ocfs2: Silence an error message in ocfs2_file_aio_read()
ocfs2: use simple_read_from_buffer()
ocfs2: fix printk format warnings with OCFS2_FS_STATS=n
[PATCH 2/2] ocfs2: Instrument fs cluster locks
[PATCH 1/2] ocfs2: Add CONFIG_OCFS2_FS_STATS config option
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix asm/e820.h for userspace inclusion
x86: fix numaq_tsc_disable
x86: fix kernel_physical_mapping_init() for large x86 systems
asm-x86/e820.h is included from userspace. 'x86: make e820.c to have
common functions' (b79cd8f126) broke it:
make -C Documentation/lguest
cc -Wall -Wmissing-declarations -Wmissing-prototypes -O3 -I../../include
lguest.c -lz -o lguest
In file included from ../../include/asm-x86/bootparam.h:8,
from lguest.c:45:
../../include/asm/e820.h:66: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:67: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:68: error: expected ‘)’ before ‘start’
../../include/asm/e820.h:72: error: expected ‘=’, ‘,’, ‘;’, ‘asm’
or ‘__attribute__’ before ‘e820_update_range’
...
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'ptrace-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace:
fix dangling zombie when new parent ignores children
do_wait: return security_task_wait() error code in place of -ECHILD
ptrace children revamp
do_wait reorganization
in __neigh_event_send, if we have a neighbour entry which is in
NUD_INCOMPLETE state, we enqueue any outbound frames to that neighbour
to the neighbours arp_queue, which is default capped to a length of 3
skbs. If that queue exceeds its set length, it will drop an skb on
the queue to enqueue the newly arrived skb. This results in a drop
for which we have no statistics incremented. This patch adds an
unresolved_discards stat to /proc/net/stat/ndisc_cache to track these
lost frames.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Done with NET_XXX_STATS macros :)
To be continued...
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This one is tricky.
The thing is that this macro is only used when killing tw buckets,
but since this killer is promiscuous wrt to which net each particular
tw belongs to, I have to use it only when NET_NS is off. When the net
namespaces are on, I use the INET_INC_STATS_BH for each bucket.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tcp_enter_memory_pressure calls NET_INC_STATS, but doesn't
have where to get the net from.
I decided to add a sk argument, not the net itself, only to factor
all the required sock_net(sk) calls inside the enter_memory_pressure
callback itself.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Same as before - the sock is always there to get the net from,
but there are also some places with the net already saved on
the stack.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fortunately (almost) all the TCP code has a sock to get the net from :)
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This one sets TCP MIBs after zeroing them, and thus requires
the net.
The existing single caller can use init_net (temporarily).
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP_INC_STATS_USER and TCP_ADD_STATS_BH are currently unused.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Very simple - only ip_evictor (fragments) requires such.
This patch ends up the IP_XXX_STATS patching.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
All the callers already have either the net itself, or the place
where to get it from.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ptrace no longer fiddles with the children/sibling links, and the
old ptrace_children list is gone. Now ptrace, whether of one's own
children or another's via PTRACE_ATTACH, just uses the new ptraced
list instead.
There should be no user-visible difference that matters. The only
change is the order in which do_wait() sees multiple stopped
children and stopped ptrace attachees. Since wait_task_stopped()
was changed earlier so it no longer reorders the children list, we
already know this won't cause any new problems.
Signed-off-by: Roland McGrath <roland@redhat.com>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (72 commits)
Revert "x86/PCI: ACPI based PCI gap calculation"
PCI: remove unnecessary volatile in PCIe hotplug struct controller
x86/PCI: ACPI based PCI gap calculation
PCI: include linux/pm_wakeup.h for device_set_wakeup_capable
PCI PM: Fix pci_prepare_to_sleep
x86/PCI: Fix PCI config space for domains > 0
Fix acpi_pm_device_sleep_wake() by providing a stub for CONFIG_PM_SLEEP=n
PCI: Simplify PCI device PM code
PCI PM: Introduce pci_prepare_to_sleep and pci_back_from_sleep
PCI ACPI: Rework PCI handling of wake-up
ACPI: Introduce new device wakeup flag 'prepared'
ACPI: Introduce acpi_device_sleep_wake function
PCI: rework pci_set_power_state function to call platform first
PCI: Introduce platform_pci_power_manageable function
ACPI: Introduce acpi_bus_power_manageable function
PCI: make pci_name use dev_name
PCI: handle pci_name() being const
PCI: add stub for pci_set_consistent_dma_mask()
PCI: remove unused arch pcibios_update_resource() functions
PCI: fix pci_setup_device()'s sprinting into a const buffer
...
Fixed up conflicts in various files (arch/x86/kernel/setup_64.c,
arch/x86/pci/irq.c, arch/x86/pci/pci.h, drivers/acpi/sleep/main.c,
drivers/pci/pci.c, drivers/pci/pci.h, include/acpi/acpi_bus.h) from x86
and ACPI updates manually.
This converts the FSL Book-E PTE access and TLB miss handling to match
with the recent changes to 44x that introduce support for non-atomic PTE
operations in pgtable-ppc32.h and removes write back to the PTE from
the TLB miss handlers. In addition, the DSI interrupt code no longer
tries to fixup write permission, this is left to generic code, and
_PAGE_HWWRITE is gone.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Now that arch/ppc is gone we always define CONFIG_PPC_CPM_NEW_BINDING so
we can remove all the code associated with !CONFIG_PPC_CPM_NEW_BINDING.
Also fixed some asm/of_platform.h to linux/of_platform.h (and of_device.h)
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Mostly having to do with not marking things __iomem. And some failure
to use appropriate accessors to read MMIO regs.
Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Basic PM support for 83xx. Standby is implemented as sleep.
Suspend-to-RAM is implemented as "deep sleep" (with the processor
turned off) on 831x.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6:
UBIFS: include to compilation
UBIFS: add new flash file system
UBIFS: add brief documentation
MAINTAINERS: add UBIFS section
do_mounts: allow UBI root device name
VFS: export sync_sb_inodes
VFS: move inode_lock into sync_sb_inodes
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits)
IDE: Report errors during drive reset back to user space
Update documentation of HDIO_DRIVE_RESET ioctl
IDE: Remove unused code
IDE: Fix HDIO_DRIVE_RESET handling
hd.c: remove the #include <linux/mc146818rtc.h>
update the BLK_DEV_HD help text
move ide/legacy/hd.c to drivers/block/
ide/legacy/hd.c: use late_initcall()
remove BLK_DEV_HD_ONLY
ide: endian annotations in ide-floppy.c
ide-floppy: zero out the whole struct ide_atapi_pc on init
ide-floppy: fold idefloppy_create_test_unit_ready_cmd into idefloppy_open
ide-cd: move request prep chunk from cdrom_do_newpc_cont to rq issue path
ide-cd: move request prep from cdrom_start_rw_cont to rq issue path
ide-cd: move request prep from cdrom_start_seek_continuation to rq issue path
ide-cd: fold cdrom_start_seek into ide_cd_do_request
ide-cd: simplify request issuing path
ide-cd: mv ide_do_rw_cdrom ide_cd_do_request
ide-cd: cdrom_start_seek: remove unused argument block
ide-cd: ide_do_rw_cdrom: add the catch-all bad request case to the if-else block
...
* 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-merge-2.6: (87 commits)
Fix FADT parsing
Add the ability to reset the machine using the RESET_REG in ACPI's FADT table.
ACPI: use dev_printk when possible
PNPACPI: add support for HP vendor-specific CCSR descriptors
PNP: avoid legacy IDE IRQs
PNP: convert resource options to single linked list
ISAPNP: handle independent options following dependent ones
PNP: remove extra 0x100 bit from option priority
PNP: support optional IRQ resources
PNP: rename pnp_register_*_resource() local variables
PNPACPI: ignore _PRS interrupt numbers larger than PNP_IRQ_NR
PNP: centralize resource option allocations
PNP: remove redundant pnp_can_configure() check
PNP: make resource assignment functions return 0 (success) or -EBUSY (failure)
PNP: in debug resource dump, make empty list obvious
PNP: improve resource assignment debug
PNP: increase I/O port & memory option address sizes
PNP: introduce pnp_irq_mask_t typedef
PNP: make resource option structures private to PNP subsystem
PNP: define PNP-specific IORESOURCE_IO_* flags alongside IRQ, DMA, MEM
...
Fail integrity check gracefully when request does not have a bio
attached (BLOCK_PC).
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (82 commits)
NFSv4: Remove BKL from the nfsv4 state recovery
SUNRPC: Remove the BKL from the callback functions
NFS: Remove BKL from the readdir code
NFS: Remove BKL from the symlink code
NFS: Remove BKL from the sillydelete operations
NFS: Remove the BKL from the rename, rmdir and unlink operations
NFS: Remove BKL from NFS lookup code
NFS: Remove the BKL from nfs_link()
NFS: Remove the BKL from the inode creation operations
NFS: Remove BKL usage from open()
NFS: Remove BKL usage from the write path
NFS: Remove the BKL from the permission checking code
NFS: Remove attribute update related BKL references
NFS: Remove BKL requirement from attribute updates
NFS: Protect inode->i_nlink updates using inode->i_lock
nfs: set correct fl_len in nlmclnt_test()
SUNRPC: Support registering IPv6 interfaces with local rpcbind daemon
SUNRPC: Refactor rpcb_register to make rpcbindv4 support easier
SUNRPC: None of rpcb_create's callers wants a privileged source port
SUNRPC: Introduce a specific rpcb_create for contacting localhost
...
ISAPNP, PNPBIOS, and ACPI describe the "possible resource settings" of
a device, i.e., the possibilities an OS bus driver has when it assigns
I/O port, MMIO, and other resources to the device.
PNP used to maintain this "possible resource setting" information in
one independent option structure and a list of dependent option
structures for each device. Each of these option structures had lists
of I/O, memory, IRQ, and DMA resources, for example:
dev
independent options
ind-io0 -> ind-io1 ...
ind-mem0 -> ind-mem1 ...
...
dependent option set 0
dep0-io0 -> dep0-io1 ...
dep0-mem0 -> dep0-mem1 ...
...
dependent option set 1
dep1-io0 -> dep1-io1 ...
dep1-mem0 -> dep1-mem1 ...
...
...
This data structure was designed for ISAPNP, where the OS configures
device resource settings by writing directly to configuration
registers. The OS can write the registers in arbitrary order much
like it writes PCI BARs.
However, for PNPBIOS and ACPI devices, the OS uses firmware interfaces
that perform device configuration, and it is important to pass the
desired settings to those interfaces in the correct order. The OS
learns the correct order by using firmware interfaces that return the
"current resource settings" and "possible resource settings," but the
option structures above doesn't store the ordering information.
This patch replaces the independent and dependent lists with a single
list of options. For example, a device might have possible resource
settings like this:
dev
options
ind-io0 -> dep0-io0 -> dep1->io0 -> ind-io1 ...
All the possible settings are in the same list, in the order they
come from the firmware "possible resource settings" list. Each entry
is tagged with an independent/dependent flag. Dependent entries also
have a "set number" and an optional priority value. All dependent
entries must be assigned from the same set. For example, the OS can
use all the entries from dependent set 0, or all the entries from
dependent set 1, but it cannot mix entries from set 0 with entries
from set 1.
Prior to this patch PNP didn't keep track of the order of this list,
and it assigned all independent options first, then all dependent
ones. Using the example above, that resulted in a "desired
configuration" list like this:
ind->io0 -> ind->io1 -> depN-io0 ...
instead of the list the firmware expects, which looks like this:
ind->io0 -> depN-io0 -> ind-io1 ...
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This patch adds an IORESOURCE_IRQ_OPTIONAL flag for use when
assigning resources to a device. If the flag is set and we are
unable to assign an IRQ to the device, we can leave the IRQ
disabled but allow the overall resource allocation to succeed.
Some devices request an IRQ, but can run without an IRQ
(possibly with degraded performance). This flag lets us run
the device without the IRQ instead of just leaving the
device disabled.
This is a reimplementation of this previous change by Rene
Herman <rene.herman@gmail.com>:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3b73a223661ed137c5d3d2635f954382e94f5a43
I reimplemented this for two reasons:
- to prepare for converting all resource options into a single linked
list, as opposed to the per-resource-type lists we have now, and
- to preserve the order and number of resource options.
In PNPBIOS and ACPI, we configure a device by giving firmware a
list of resource assignments. It is important that this list
has exactly the same number of resources, in the same order,
as the "template" list we got from the firmware in the first
place.
The problem of a sound card MPU401 being left disabled for want of
an IRQ was reported by Uwe Bugla <uwe.bugla@gmx.de>.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Nothing outside the PNP subsystem should need access to a
device's resource options, so this patch moves the option
structure declarations to a private header file.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
PNP previously defined PNP_PORT_FLAG_16BITADDR and PNP_PORT_FLAG_FIXED
in a private header file, but put those flags in struct resource.flags
fields. Better to make them IORESOURCE_IO_* flags like the existing
IRQ, DMA, and MEM flags.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
As part of a heuristic to identify modem devices, 8250_pnp.c
checks to see whether a device can be configured at any of the
legacy COM port addresses.
This patch moves the code that traverses the PNP "possible resource
options" from 8250_pnp.c to the PNP subsystem. This encapsulation
is important because a future patch will change the implementation
of those resource options.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
PNP used to have a fixed-size pnp_resource_table for tracking the
resources used by a device. This table often overflowed, so we've
had to increase the table size, which wastes memory because most
devices have very few resources.
This patch replaces the table with a linked list of resources where
the entries are allocated on demand.
This removes messages like these:
pnpacpi: exceeded the max number of IO resources
00:01: too many I/O port resources
References:
http://bugzilla.kernel.org/show_bug.cgi?id=9535http://bugzilla.kernel.org/show_bug.cgi?id=9740http://lkml.org/lkml/2007/11/30/110
This patch also changes the way PNP uses the IORESOURCE_UNSET,
IORESOURCE_AUTO, and IORESOURCE_DISABLED flags.
Prior to this patch, the pnp_resource_table entries used the flags
like this:
IORESOURCE_UNSET
This table entry is unused and available for use. When this flag
is set, we shouldn't look at anything else in the resource structure.
This flag is set when a resource table entry is initialized.
IORESOURCE_AUTO
This resource was assigned automatically by pnp_assign_{io,mem,etc}().
This flag is set when a resource table entry is initialized and
cleared whenever we discover a resource setting by reading an ISAPNP
config register, parsing a PNPBIOS resource data stream, parsing an
ACPI _CRS list, or interpreting a sysfs "set" command.
Resources marked IORESOURCE_AUTO are reinitialized and marked as
IORESOURCE_UNSET by pnp_clean_resource_table() in these cases:
- before we attempt to assign resources automatically,
- if we fail to assign resources automatically,
- after disabling a device
IORESOURCE_DISABLED
Set by pnp_assign_{io,mem,etc}() when automatic assignment fails.
Also set by PNPBIOS and PNPACPI for:
- invalid IRQs or GSI registration failures
- invalid DMA channels
- I/O ports above 0x10000
- mem ranges with negative length
After this patch, there is no pnp_resource_table, and the resource list
entries use the flags like this:
IORESOURCE_UNSET
This flag is no longer used in PNP. Instead of keeping
IORESOURCE_UNSET entries in the resource list, we remove
entries from the list and free them.
IORESOURCE_AUTO
No change in meaning: it still means the resource was assigned
automatically by pnp_assign_{port,mem,etc}(), but these functions
now set the bit explicitly.
We still "clean" a device's resource list in the same places,
but rather than reinitializing IORESOURCE_AUTO entries, we
just remove them from the list.
Note that IORESOURCE_AUTO entries are always at the end of the
list, so removing them doesn't reorder other list entries.
This is because non-IORESOURCE_AUTO entries are added by the
ISAPNP, PNPBIOS, or PNPACPI "get resources" methods and by the
sysfs "set" command. In each of these cases, we completely free
the resource list first.
IORESOURCE_DISABLED
In addition to the cases where we used to set this flag, ISAPNP now
adds an IORESOURCE_DISABLED resource when it reads a configuration
register with a "disabled" value.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Some callers use pnp_port_start() and similar functions without
making sure the resource is valid. This patch makes us fall
back to returning the initial values if the resource is not
valid or not even present.
This mostly preserves the previous behavior, where we would just
return the initial values set by pnp_init_resource_table(). The
original 2.6.25 code didn't range-check the "bar", so it would
return garbage if the bar exceeded the table size. This code
returns sensible values instead.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
"idle=nomwait" disables the use of the MWAIT
instruction from both C1 (C1_FFH) and deeper (C2C3_FFH)
C-states.
When MWAIT is unavailable, the BIOS and OS generally
negotiate to use the HALT instruction for C1,
and use IO accesses for deeper C-states.
This option is useful for power and performance
comparisons, and also to work around BIOS bugs
where broken MWAIT support is advertised.
http://bugzilla.kernel.org/show_bug.cgi?id=10807http://bugzilla.kernel.org/show_bug.cgi?id=10914
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Li Shaohua <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
"idle=halt" limits the idle loop to using
the halt instruction. No MWAIT, no IO accesses,
no C-states deeper than C1.
If something is broken in the idle code,
"idle=halt" is a less severe workaround
than "idle=poll" which disables all power savings.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Allow users to enable/disable/clear a specific & valid GPE/Fixed Event
in user space.
This is useful for debugging, especially for some
interrupt storm issues.
All wakeup GPEs are disabled and they can not be enabled at runtime,
and we mark them as invalid.
All GPEs that don't have a _Lxx/_Exx method are marked as invalid.
All Fixed Events that don't have an event handler are marked as invalid
and they can't be enabled until an event handler is registered.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Ling Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Eliminated unnecessary operands; eliminated use of negative index
in loop. Operands now displayed in correct order, not backwards.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Now supports the 2007 intel Virtualization Technology for Directed
I/O specification.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Synchronized tables with current specifications.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Update version to 20080514
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Mostly MODULE_NAME and printf format strings.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
No longer needed; replaced mostly with u32, but also acpi_size
where a type that changes 32/64 bit on 32/64-bit platforms is
required.
v2: Fix a cast of a 32-bit int to a pointer in ACPI to avoid a compiler warning.
from David Howells
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Added NULL fields to the exception string arrays to eliminate
the -1 subtraction on the SubStatus field.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Changed ACPI_MODULE_NAME and ACPI_FUNCTION_NAME to use arrays of
strings instead of pointers to static strings. Jan Beulich and
Bob Moore.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Fixes problem where the new method argument count validation mechanism
will enter an infinite loop when a GPE method is dispatched.
Problem fixed be removing the obsolete code that passes GPE block
information to the notify handler via the control method parameter pointer.
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Error if too few arguments, warning if too many. This applies
only to external programmatic control method execution, not
method-to-method calls within the AML.
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
The freezer currently attempts to distinguish kernel threads from
user space tasks by checking if their mm pointer is unset and it
does not send fake signals to kernel threads. However, there are
kernel threads, mostly related to networking, that behave like
user space tasks and may want to be sent a fake signal to be frozen.
Introduce the new process flag PF_FREEZER_NOSIG that will be set
by default for all kernel threads and make the freezer only send
fake signals to the tasks having PF_FREEZER_NOSIG unset. Provide
the set_freezable_with_signal() function to be called by the kernel
threads that want to be sent a fake signal for freezing.
This patch should not change the freezer's observable behavior.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Len Brown <len.brown@intel.com>
Get rid of a superfluous acpi_pm_device_sleep_state() parameter. The
only legitimate value of that parameter must be derived from the first
parameter, which is what all the callers already do. (However, this
does not address the fact that ACPI still doesn't set up those flags.)
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
Reorder the mutex names to match the preceding #defines
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Implemented another change for the GPE disable. We now perform a
read-change-write of the enable register instead of simply writing out the
cached enable mask. This will prevent inadvertent enabling of GPEs if a rogue
GPE is received during initialization (before GPE handlers are installed.)
http://bugzilla.kernel.org/show_bug.cgi?id=6217
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Change processors from an array sized by NR_CPUS to a per_cpu variable.
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
This closes some arcane holes in single-step handling that can arise
only when user programs set TF directly (via popf or sigreturn) and
then use vDSO (syscall/sysenter) system call entry. In those entry
paths, the clear_TF_reenable case hits and we must check TIF_SINGLESTEP
to be sure our bookkeeping stays correct wrt the user's view of TF.
Signed-off-by: Roland McGrath <roland@redhat.com>
This unifies and cleans up the syscall tracing code on i386 and x86_64.
Using a single function for entry and exit tracing on 32-bit made the
do_syscall_trace() into some terrible spaghetti. The logic is clear and
simple using separate syscall_trace_enter() and syscall_trace_leave()
functions as on 64-bit.
The unification adds PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP support
on x86_64, for 32-bit ptrace() callers and for 64-bit ptrace() callers
tracing either 32-bit or 64-bit tasks. It behaves just like 32-bit.
Changing syscall_trace_enter() to return the syscall number shortens
all the assembly paths, while adding the SYSEMU feature in a simple way.
Signed-off-by: Roland McGrath <roland@redhat.com>
This unifies the treatment of TIF_SINGLESTEP on i386 and x86_64.
The bit is now excluded from _TIF_WORK_MASK on i386 as it has been
on x86_64. This means the do_notify_resume() path using it is never
used, so TIF_SINGLESTEP is not cleared on returning to user mode.
Both now leave TIF_SINGLESTEP set when returning to user, so that
it's already set on an int $0x80 system call entry. This removes
the need for testing TF on the system_call path. Doing it this way
fixes the regression for PTRACE_SINGLESTEP into a sigreturn syscall,
introduced by commit 1e2e99f0e4.
The clear_TF_reenable case that sets TIF_SINGLESTEP can only happen
on a non-exception kernel entry, i.e. sysenter/syscall instruction.
That will always get to the syscall exit tracing path.
Signed-off-by: Roland McGrath <roland@redhat.com>
Remove some code which has been made obsolete and hasn't worked properly
before anyway. Part of the infrastructure may be reintroduced in a
follow up patch to implement a working command aborting facility.
Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Cc: "Alan Cox" <alan@lxorguk.ukuu.org.uk>
Cc: "Randy Dunlap" <randy.dunlap@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Currently, the code path executing an HDIO_DRIVE_RESET ioctl is broken
in various ways. Most importantly, it is treated as an out of band
request in an illegal way which may very likely lead to system lock ups.
Use the drive's request queue to avoid this problem (and fix a locking
issue for free along the way).
Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Cc: "Alan Cox" <alan@lxorguk.ukuu.org.uk>
Cc: "Randy Dunlap" <randy.dunlap@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Change ->port_init_devs method to take 'ide_drive_t *' as an argument
instead of 'ide_hwif_t *' and rename it to ->init_dev.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add 'parent' field to hw_regs_t for optional parent device pointer (needed
by macio PMAC IDE controllers) and set hwif->dev in ide_init_port_hw().
* Update au1xxx-ide.c, sgiioc4.c, pmac.c and setup-pci.c accordingly.
v2:
* Update scc_pata.c.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Move PIO blacklist to ide-pio-blacklist.c.
While at it:
- fix comment
- fix whitespace damage
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
All ide_pio_cycle_time() users already select CONFIG_IDE_TIMINGS
so move the function from ide-lib.c to ide-timings.c.
While at it:
- convert ide_pio_cycle_time() to use ide_timing_find_mode()
- cleanup ide_pio_cycle_time() a bit
There should be no functional changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Don't include ide-timing.h in cs5535 and sis5513 host drivers
(they don't need it currently).
* Convert ide-timing.h to ide-timings.c library and add CONFIG_IDE_TIMINGS
config option to be selected by host drivers using the library.
While at it:
- fix ide_timing_find_mode() placement
v2:
* Add missing EXPORT_SYMBOLs. (Stephen Rothwell <sfr@canb.auug.org.au>)
There should be no functional changes caused by this patch.
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Move struct ide_timing and IDE_TIMING_* defines to <linux/ide.h>
from drivers/ide/ide-timing.h.
While at it:
- use u8/u16 instead of short for struct ide_timing fields
- use enum for IDE_TIMING_*
There should be no functional changes caused by this patch.
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
The standard ticket spinlocks are very expensive in a virtual
environment, because their performance depends on Xen's scheduler
giving vcpus time in the order that they're supposed to take the
spinlock.
This implements a Xen-specific spinlock, which should be much more
efficient.
The fast-path is essentially the old Linux-x86 locks, using a single
lock byte. The locker decrements the byte; if the result is 0, then
they have the lock. If the lock is negative, then locker must spin
until the lock is positive again.
When there's contention, the locker spin for 2^16[*] iterations waiting
to get the lock. If it fails to get the lock in that time, it adds
itself to the contention count in the lock and blocks on a per-cpu
event channel.
When unlocking the spinlock, the locker looks to see if there's anyone
blocked waiting for the lock by checking for a non-zero waiter count.
If there's a waiter, it traverses the per-cpu "lock_spinners"
variable, which contains which lock each CPU is waiting on. It picks
one CPU waiting on the lock and sends it an event to wake it up.
This allows efficient fast-path spinlock operation, while allowing
spinning vcpus to give up their processor time while waiting for a
contended lock.
[*] 2^16 iterations is threshold at which 98% locks have been taken
according to Thomas Friebel's Xen Summit talk "Preventing Guests from
Spinning Around". Therefore, we'd expect the lock and unlock slow
paths will only be entered 2% of the time.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Implement a version of the old spinlock algorithm, in which everyone
spins waiting for a lock byte. In order to be compatible with the
ticket-lock's use of a zero initializer, this uses the convention of
'0' for unlocked and '1' for locked.
This algorithm is much better than ticket locks in a virtual
envionment, because it doesn't interact badly with the vcpu scheduler.
If there are multiple vcpus spinning on a lock and the lock is
released, the next vcpu to be scheduled will take the lock, rather
than cycling around until the next ticketed vcpu gets it.
To use this, you must call paravirt_use_bytelocks() very early, before
any spinlocks have been taken.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ticket spinlocks have absolutely ghastly worst-case performance
characteristics in a virtual environment. If there is any contention
for physical CPUs (ie, there are more runnable vcpus than cpus), then
ticket locks can cause the system to end up spending 90+% of its time
spinning.
The problem is that (v)cpus waiting on a ticket spinlock will be
granted access to the lock in strict order they got their tickets. If
the hypervisor scheduler doesn't give the vcpus time in that order,
they will burn timeslices waiting for the scheduler to give the right
vcpu some time. In the worst case it could take O(n^2) vcpu scheduler
timeslices for everyone waiting on the lock to get it, not counting
new cpus trying to take the lock while the log-jam is sorted out.
These hooks allow a paravirt backend to replace the spinlock
implementation.
At the very least, this could revert the implementation back to the
old lock algorithm, which allows the next scheduled vcpu to take the
lock, and has basically fairly good performance.
It also allows the spinlocks to take advantages of the hypervisor
features to make locks more efficient (spin and block, for example).
The cost to native execution is an extra direct call when using a
spinlock function. There's no overhead if CONFIG_PARAVIRT is turned
off.
The lock structure is fixed at a single "unsigned int", initialized to
zero, but the spinlock implementation can use it as it wishes.
Thanks to Thomas Friebel's Xen Summit talk "Preventing Guests from
Spinning Around" for pointing out this problem.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
AMD only supports "syscall" from 32-bit compat usermode.
Intel and Centaur(?) only support "sysenter" from 32-bit compat usermode.
Set the X86 feature bits accordingly, and set up the vdso in
accordance with those bits. On the offchance we run on in a 64-bit
environment which supports neither syscall nor sysenter from 32-bit
mode, then fall back to the int $0x80 vdso.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
fix:
arch/x86/xen/built-in.o: In function `set_page_prot':
enlighten.c:(.text+0x111d): undefined reference to `xen_raw_printk'
arch/x86/xen/built-in.o: In function `xen_start_kernel':
: undefined reference to `xen_raw_console_write'
arch/x86/xen/built-in.o: In function `xen_start_kernel':
: undefined reference to `xen_raw_console_write'
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The Xen hypercall interface is allowed to trash any or all of the
argument registers, so we need to be careful that the kernel state
isn't damaged. On 32-bit kernels, the hypercall parameter registers
same as a regparm function call, so we've got away without explicit
clobbering so far. The 64-bit ABI defines lots of caller-save
registers, so save them all for safety. We can trim this set later by
re-distributing the responsibility for saving all these registers.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
64-bit hypercall interface can pass a maddr in one argument rather
than splitting it.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Use callback_op hypercall to register callbacks in a 32/64-bit
independent way (64-bit doesn't need a code segment, but that detail
is hidden in XEN_CALLBACK).
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When building initial pagetables in 64-bit kernel the pud/pmd pointer may
be in ioremap/fixmap space, so we need to walk the pagetable to look up the
physical address.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Set up the initial pagetables to map the kernel mapping into the
physical mapping space. This makes __va() usable, since it requires
physical mappings.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
As a stopgap until Mike Travis's x86-64 gs-based percpu patches are
ready, provide workaround functions for x86_read/write_percpu for
Xen's use.
Specifically, this means that we can't really make use of vcpu
placement, because we can't use a single gs-based memory access to get
to vcpu fields. So disable all that for now.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We need extra pv_mmu_ops for 64-bit, to deal with the extra level of
pagetable.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The 64-bit calling convention for hypercalls uses different registers
from 32-bit. Annoyingly, gcc's asm syntax doesn't have a way to
specify one of the extra numeric reigisters in a constraint, so we
must use explicitly placed register variables. Given that we have to
do it for some args, may as well do it for all.
Also fix syntax gcc generates for the call instruction itself. We
need a plain direct call, but the asm expansion which works on 32-bit
generates a rip-relative addressing mode in 64-bit, which is treated
as an indirect call. The alternative is to pass the hypercall page
offset into the asm, and have it add it to the hypercall page start
address to generate the call.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
64-bit guests can pass 64-bit quantities in a single argument,
so fix up the hypercalls.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Copy 64-bit definitions of various interface structures into place.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
add xen_timer_resume() hook.
Timer resume should be done after event channel is resumed.
add xen_arch_resume() hook when ipi becomes usable after resume.
After resume, some cpu specific resource must be reinitialized
on ia64 that can't be set by another cpu.
However available hooks is run once on only one cpu so that ipi has
to be used.
During stop_machine_run() ipi can't be used because interrupt is masked.
So add another hook after stop_machine_run().
Another approach might be use resume hook which is run by
device_resume(). However device_resume() may be executed on
suspend error recovery path.
So it is necessary to determine whether it is executed on real resume path
or error recovery path.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This allows Xen's xen_cpu_up() to allocate a pda for the new CPU.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Call paravirt_pagetable_setup_{start,done}
These paravirt_ops functions were not being called on x86_64.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (249 commits)
powerpc: Fix pte_update for CONFIG_PTE_64BIT and !PTE_ATOMIC_UPDATES
powerpc: Fix a build problem on ppc32 with new DMA_ATTRs
ibm_newemac: Add MII mode support to the EMAC RGMII bridge.
powerpc: Don't spin on sync instruction at boot time
powerpc: Add VSX load/store alignment exception handler
powerpc: fix giveup_vsx to save registers correctly
powerpc: support for latencytop
powerpc: Remove unnecessary condition when sanity-checking WIMG bits
powerpc: Add PPC_FEATURE_PSERIES_PERFMON_COMPAT
powerpc: Add driver for Barrier Synchronization Register
powerpc: mman.h export fixups
powerpc/fsl: update crypto node definition and device tree instances
powerpc/fsl: Refactor device bindings
powerpc/85xx: Minor fixes for 85xxds and 8536ds board.
powerpc: Add 82xx/83xx/86xx to 6xx Multiplatform
powerpc/85xx: publish of device for cds platforms
powerpc/booke: don't reinitialize time base
powerpc/86xx: Refactor pic init
powerpc/CPM: Add i2c pins to dts and board setup
cpm_uart: Support uart_wait_until_sent()
...
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (102 commits)
[SCSI] scsi_dh: fix kconfig related build errors
[SCSI] sym53c8xx: Fix bogus sym_que_entry re-implementation of container_of
[SCSI] scsi_cmnd.h: remove double inclusion of linux/blkdev.h
[SCSI] make struct scsi_{host,target}_type static
[SCSI] fix locking in host use of blk_plug_device()
[SCSI] zfcp: Cleanup external header file
[SCSI] zfcp: Cleanup code in zfcp_erp.c
[SCSI] zfcp: zfcp_fsf cleanup.
[SCSI] zfcp: consolidate sysfs things into one file.
[SCSI] zfcp: Cleanup of code in zfcp_aux.c
[SCSI] zfcp: Cleanup of code in zfcp_scsi.c
[SCSI] zfcp: Move status accessors from zfcp to SCSI include file.
[SCSI] zfcp: Small QDIO cleanups
[SCSI] zfcp: Adapter reopen for large number of unsolicited status
[SCSI] zfcp: Fix error checking for ELS ADISC requests
[SCSI] zfcp: wait until adapter is finished with ERP during auto-port
[SCSI] ibmvfc: IBM Power Virtual Fibre Channel Adapter Client Driver
[SCSI] sg: Add target reset support
[SCSI] lib: Add support for the T10 (SCSI) Data Integrity Field CRC
[SCSI] sd: Move scsi_disk() accessor function to sd.h
...
Introduce a new API to register RPC services on IPv6 interfaces to allow
the NFS server and lockd to advertise on IPv6 networks.
Unlike rpcb_register(), the new rpcb_v4_register() function uses rpcbind
protocol version 4 to contact the local rpcbind daemon. The version 4
SET/UNSET procedures allow services to register address families besides
AF_INET, register at specific network interfaces, and register transport
protocols besides UDP and TCP. All of this functionality is exposed via
the new rpcb_v4_register() kernel API.
A user-space rpcbind daemon implementation that supports version 4 of the
rpcbind protocol is required in order to make use of this new API.
Note that rpcbind version 3 is sufficient to support the new rpcbind
facilities listed above, but most extant implementations use version 4.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (54 commits)
[MIPS] Remove mips_machtype for LASAT machines
[MIPS] Remove mips_machtype from EMMA2RH machines
[MIPS] Remove mips_machtype from ARC based machines
[MIPS] MTX-1 flash partition setup move to platform devices registration
[MIPS] TXx9: cleanup and fix some sparse warnings
[MIPS] TXx9: rename asm-mips/mach-jmr3927 to asm-mips/mach-tx39xx
[MIPS] remove machtype for group Toshiba
[MIPS] separate rbtx4927_time_init() and rbtx4937_time_init()
[MIPS] separate rbtx4927_arch_init() and rbtx4937_arch_init()
[MIPS] txx9_cpu_clock setup move to rbtx4927_time_init()
[MIPS] txx9_board_vec set directly without mips_machtype
[MIPS] IP22: Add platform device for Indy volume buttons
[MIPS] cmbvr4133: Remove support
[MIPS] remove wrppmc_machine_power_off()
[MIPS] replace inline assembler to cpu_wait()
[MIPS] IP22/28: Add platform devices for HAL2
[MIPS] TXx9: Update and merge defconfigs
[MIPS] TXx9: Make single kernel can support multiple boards
[MIPS] TXx9: Update defconfigs
[MIPS] TXx9: Reorganize PCI code
...
* 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
generic-ipi: more merge fallout
generic-ipi: merge fix
x86, visws: use mach-default/entry_arch.h
x86, visws: fix generic-ipi build
generic-ipi: fixlet
generic-ipi: fix s390 build bug
generic-ipi: fix linux-next tree build failure
fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
fix "smp_call_function: get rid of the unused nonatomic/retry argument"
on_each_cpu(): kill unused 'retry' parameter
smp_call_function: get rid of the unused nonatomic/retry argument
sh: convert to generic helpers for IPI function calls
parisc: convert to generic helpers for IPI function calls
mips: convert to generic helpers for IPI function calls
m32r: convert to generic helpers for IPI function calls
arm: convert to generic helpers for IPI function calls
alpha: convert to generic helpers for IPI function calls
ia64: convert to generic helpers for IPI function calls
powerpc: convert to generic helpers for IPI function calls
...
Fix trivial conflicts due to rcu updates in kernel/rcupdate.c manually
* 'core/rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (23 commits)
rcu classic: update qlen when cpu offline
rcu: make rcutorture even more vicious: invoke RCU readers from irq handlers (timers)
rcu: make quiescent rcutorture less power-hungry
rcu, rcutorture: make quiescent rcutorture less power-hungry
rcu: make rcutorture more vicious: reinstate boot-time testing
rcu: make rcutorture more vicious: add stutter feature
rcutorture: WARN_ON_ONCE(1) when detecting an error
rcu: remove unused field struct rcu_data::rcu_tasklet
Revert "prohibit rcutorture from being compiled into the kernel"
rcu: fix nf_conntrack_helper.c build bug
rculist.h: fix include in net/netfilter/nf_conntrack_netlink.c
rcu: remove duplicated include in kernel/rcupreempt.c
rcu: remove duplicated include in kernel/rcupreempt_trace.c
RCU, rculist.h: fix list iterators
rcu: fix rcu_try_flip_waitack_needed() to prevent grace-period stall
rculist.h: use the rcu API
rcu: split list.h and move rcu-protected lists into rculist.h
sched: 1Q08 RCU doc update, add call_rcu_sched()
rcu: add call_rcu_sched() and friends to rcutorture
rcu: add rcu_barrier_sched() and rcu_barrier_bh()
...
Commit 1ea0704e0d aka "mm: add a ptep_modify_prot transaction abstraction"
caused:
| CC init/main.o
|In file included from include2/asm/pgtable.h:68,
| from /home/bigeasy/git/linux-2.6-m68k/include/linux/mm.h:39,
| from include2/asm/uaccess.h:8,
| from /home/bigeasy/git/linux-2.6-m68k/include/linux/poll.h:13,
| from /home/bigeasy/git/linux-2.6-m68k/include/linux/rtc.h:113,
| from /home/bigeasy/git/linux-2.6-m68k/include/linux/efi.h:19,
| from /home/bigeasy/git/linux-2.6-m68k/init/main.c:43:
|/linux-2.6/include/asm-generic/pgtable.h: In function '__ptep_modify_prot_start':
|/linux-2.6/include/asm-generic/pgtable.h:209: error: implicit declaration of function 'ptep_get_and_clear'
|/linux-2.6/include/asm-generic/pgtable.h:209: error: incompatible types in return
|/linux-2.6/include/asm-generic/pgtable.h: In function '__ptep_modify_prot_commit':
|/linux-2.6/include/asm-generic/pgtable.h:220: error: implicit declaration of function 'set_pte_at'
|make[2]: *** [init/main.o] Error 1
|make[1]: *** [init] Error 2
|make: *** [sub-make] Error 2
on my m68knommu box.
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pass a more generic socket address type to nlmsvc_unlock_all_by_ip() to
allow for future support of IPv6. Also provide additional sanity
checking in failover_unlock_ip() when constructing the server's IP
address.
As an added bonus, provide clean kerneldoc comments on related NLM
interfaces which were recently added.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* 'sbp2-spindown' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
ieee1394: sbp2: spin disks down on suspend and shutdown
firewire: fw-sbp2: spin disks down on suspend and shutdown
ieee1394: sbp2: fix spindown for PL-3507 and TSB42AA9 firmwares
firewire: fw-sbp2: fix spindown for PL-3507 and TSB42AA9 firmwares
scsi: sd: optionally set power condition in START STOP UNIT
nlmsvc_lock calls nlmsvc_lookup_host to find a nlm_host struct. The
callers of this function, however, call nlmsvc_retrieve_args or
nlm4svc_retrieve_args, which also return a nlm_host struct.
Change nlmsvc_lock to take a host arg instead of calling
nlmsvc_lookup_host itself and change the callers to pass a pointer to
the nlm_host they've already found.
Since nlmsvc_testlock() now just uses the caller's reference, we no
longer need to get or release it.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
nlmsvc_testlock calls nlmsvc_lookup_host to find a nlm_host struct. The
callers of this functions, however, call nlmsvc_retrieve_args or
nlm4svc_retrieve_args, which also return a nlm_host struct.
Change nlmsvc_testlock to take a host arg instead of calling
nlmsvc_lookup_host itself and change the callers to pass a pointer to
the nlm_host they've already found.
We take a reference to host in the place where nlmsvc_testlock()
previous did a new lookup, so the reference counting is unchanged from
before.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
AHCI: Remove an unnecessary flush from ahci_qc_issue
AHCI: speed up resume
[libata] Add support for VPD page b1
ata: endianness annotations in pata drivers
libata-eh: update atapi_eh_request_sense() to take @dev instead of @qc
[libata] sata_svw: update code comments relating to data corruption
libata/ahci: enclosure management support
libata: improve EH internal command timeout handling
libata: use ULONG_MAX to terminate reset timeout table
libata: improve EH retry delay handling
libata: consistently use msecs for time durations
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: (56 commits)
i2c: Add detection capability to new-style drivers
i2c: Call client_unregister for new-style devices too
i2c: Clean up old chip drivers
i2c-ibm_iic: Register child nodes
i2c: New-style EEPROM driver using device IDs
i2c: Export the i2c_bus_type symbol
i2c-au1550: Fix PM support
i2c-dev: Delete empty detach_client callback
i2c: Drop stray references to lm_sensors
i2c: Check for ACPI resource conflicts
i2c-ocores: basic PM support
i2c-sibyte: SWARM I2C board initialization
i2c-i801: Fix handling of error conditions
i2c-i801: Rename local variable temp to status
i2c-i801: Properly report bus arbitration loss
i2c-i801: Remove verbose debugging messages
i2c-algo-pcf: Drop unused struct members
i2c-algo-pcf: Multi-master lost-arbitration improvement
i2c: Deprecate the legacy gpio drivers
i2c-pxa: Initialize early
...
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (80 commits)
ide-floppy: fix unfortunate function naming
ide-tape: unify idetape_create_read/write_cmd
ide: add ide_pc_intr() helper
ide-{floppy,scsi}: read Status Register before stopping DMA engine
ide-scsi: add more debugging to idescsi_pc_intr()
ide-scsi: use pc->callback
ide-floppy: add more debugging to idefloppy_pc_intr()
ide-tape: always log debug info in idetape_pc_intr() if debugging is enabled
ide-tape: add ide_tape_io_buffers() helper
ide-tape: factor out DSC handling from idetape_pc_intr()
ide-{floppy,tape}: move checking of ->failed_pc to ->callback
ide: add ide_issue_pc() helper
ide: add PC_FLAG_DRQ_INTERRUPT pc flag
ide-scsi: move idescsi_map_sg() call out from idescsi_issue_pc()
ide: add ide_transfer_pc() helper
ide-scsi: set drive->scsi flag for devices handled by the driver
ide-{cd,floppy,tape}: remove checking for drive->scsi
ide: add PC_FLAG_ZIP_DRIVE pc flag
ide-tape: factor out waiting for good ireason from idetape_transfer_pc()
ide-tape: set PC_FLAG_DMA_IN_PROGRESS flag in idetape_transfer_pc()
...
* ide-tape.c: add 'drive' argument to idetape_update_buffers().
* Add generic ide_pc_intr() helper to ide-atapi.c and then
convert ide-{floppy,tape,scsi} device drivers to use it.
* ide-tape.c: remove no longer needed DBG_PC_INTR.
There should be no functional changes caused by this patch
(unless the debugging is explicitely compiled in).
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add generic ide_issue_pc() helper to ide-atapi.c and then
convert ide-{floppy,tape,scsi} device drivers to use it.
There should be no functional changes caused by this patch.
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add PC_FLAG_DRQ_INTERRUPT pc flag, set it in ide*_do_request()
and check for it (instead of checking for IDE*_FLAG_DRQ_INTERRUPT)
in ide*_issue_pc(). This is a preparation for adding generic
ide_issue_pc() helper.
There should be no functional changes caused by this patch.
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Add ide-atapi.c file for generic ATAPI support together with
CONFIG_IDE_ATAPI config option.
* Add generic ide_transfer_pc() helper to ide-atapi.c and then
convert ide-{floppy,tape,scsi} device drivers to use it.
There should be no functional changes caused by this patch.
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Add PC_FLAG_ZIP_DRIVE pc flag, set it in idefloppy_do_request()
and check for it (instead of checking for IDEFLOPPY_FLAG_ZIP_DRIVE)
in idefloppy_transfer_pc(). This is a preparation for adding
generic ide_transfer_pc() helper.
There should be no functional changes caused by this patch.
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
* Use PC_FLAG_DMA_OK flag instead of PC_FLAG_DMA_RECOMMENDED one.
* Remove no longer used PC_FLAG_DMA_RECOMMENDED flag.
There should be no functional changes caused by this patch.
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Merge pc->idefloppy_callback and pc->idetape_callback into pc->callback.
There should be no functional changes caused by this patch.
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
There should be no functional changes caused by this patch.
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
ide_do_drive_cmd is called only with ide_preempt action argument. So
we can remove the action argument in ide_do_drive_cmd and ide_action_t
typedef.
This patch also includes two minor cleanups: 1) ide_do_drive_cmd
always succeeds so we don't need the return value; 2) the callers use
blk_rq_init before ide_do_drive_cmd so there is no need to initialize
rq->errors.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Remove drive->ctl (it is always equal to 0x08 after init time).
While at it:
* Use ATA_DEVCTL_OBS define.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Since scc_pata host driver no longer uses IDE PCI layer / ide_dma_setup()
and all other ->mmio users set also IDE_HFLAG_MMIO host flag we can safely
remove ->mmio flag.
There should be no functional changes caused by this patch.
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Move IRQ unmasking out from ->tf_load method to its users.
There should be no functional changes caused by this patch
(SELECT_MASK() is NOP except for hpt366, icside and sgiioc4).
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Always call SELECT_MASK(..., 0) in ide_tf_load() (needs to be done
to match ide_set_irq(..., 1)) and then remove IDE_TFLAG_NO_SELECT_MASK
taskfile flag.
This change should only affect hpt366 and icside host drivers since
->maskproc(..., 0) for sgiioc4 is equivalent to ide_set_irq(..., 1).
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
All the users of blk_end_sync_rq has gone (they are converted to use
blk_execute_rq). This unexports blk_end_sync_rq.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
ide_init_drive_cmd just calls blk_rq_init. This converts the users of
ide_init_drive_cmd to use blk_rq_init directly and removes
ide_init_drive_cmd.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Borislav Petkov <petkovbb@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This is the LASAT part of the mips_machtype removal.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This is the EMMA2RH part of the mips_machtype removal.
[Ralf: Fixed to the #error statements]
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This is the ARC part of the mips_machtype removal.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* Do not return void value
* Make some functions static
* Do not include unnecessary bootinfo.h
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Rename mach-jmr3927 directory to more proper name to make adding other
platforms easier.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
It cannot be built for a long time and nobody maintains it.
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Make single kernel can be used on RBTX4927/37/38. Also make
some SoC-specific code independent from board-specific code.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Split out PCIC dependent code and SoC dependent code from board dependent
code. Now TX4927 PCIC code is independent from TX4927/TX4938 SoC code.
Also fix some build problems on CONFIG_PCI=n.
As a bonus, "FPCIB0 Backplane Support" is available for all TX39/TX49 boards
and PCI66 support is available for all TX49 boards.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Move arch/mips/{jmr3927,tx4927,tx4938} into arch/mips/txx9/ tree.
This will help more code sharing and maintainance.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Declare pci_probe_only, etc. in asm-mips/pci.h file. This will fix
some sparse warnings.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
SGI-IP28 is running in so called slow mode, when kernel is started
from the PROM. PROM calls must be done in slow mode otherwise the
PROM will issue an error. To get better memory performance we now
switch to normal mode, when the PROM is no longer needed.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The third operand to 'ins' must be a constant int, not a register.
[Ralf: The bug was actually intensional. Some versions used to throw an
error under certain circumstances for code like:
static inline void f(unsigned nr, unsigned *p)
{
unsigned short bit = nr & 5;
if (__builtin_constant_p(bit)) {
__asm__ __volatile__ (" foo %0, %1" : "=m" (*p) : "i" (bit));
} else {
/* Do something else. */
}
}
because gcc was not able to figure out that the "i" constraint was possibly
at the early stage when the constraint are getting verified. The solution
was using "ri" instead of "i". The "ri" would keep gcc happy but in the
end for code generation always the "i" constraint would be satisfied. The
problem afair originally appeared in the i386 io.h and also hit it's mips
equivalent. From there the workaround spread to many of the inline
assembler functions.]
Signed-off-by: David Daney <ddaney@avtrex.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Document a few more register bits provided by the MB ASIC used on R4000SC
(KN04) and R4400SC (KN05) CPU daughtercards with the DECstation.
Reverse-engineered and not documented anywhere else to the best of my
knowledge. Bit names appended to the last underscore the same as reported
by the firmware in register dumps.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Never terribly functional or popular, plagued by hard to fix bugs the time
to say goodbye has more than arrived.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch fixes the following sparse warning:
<<<<<<<<
arch/mips/kernel/early_printk.c:35:13: warning: symbol 'setup_early_printk'
was not declared. Should it be static?
<<<<<<<<
The fix is to define a prototype of the setup_early_printk() function and
to include the appropriate header into arch/mips/kernel/early_printk.c.
[Ralf: Sorted includes again]
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
The isa_slot_offset variable and its __ISA_IO_base macro is not used
anywhere anymore. It does not look like a decent interface per today's
standards either. Remove both including all places of initialization.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
It is not used anywhere in tree.
Signed-off-by: David Daney <ddaney@avtrex.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
* 'core/topology' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
cputopology: always define CPU topology information, clean up
cpu topology: always define CPU topology information
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Fix compile error with CONFIG_AS_CFI=n
Documentation: document debugpat commandline option
x86: sanitize Kconfig
x86, suspend, acpi: correct and add comments about Big Real Mode
x86, suspend, acpi: enter Big Real Mode
Fixed trivial conflict in include/asm-x86/dwarf2.h due to just using
different names for "cfi_ignore" (vs "__cfi_ignore") macro.
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (61 commits)
ext4: Documention update for new ordered mode and delayed allocation
ext4: do not set extents feature from the kernel
ext4: Don't allow nonextenst mount option for large filesystem
ext4: Enable delalloc by default.
ext4: delayed allocation i_blocks fix for stat
ext4: fix delalloc i_disksize early update issue
ext4: Handle page without buffers in ext4_*_writepage()
ext4: Add ordered mode support for delalloc
ext4: Invert lock ordering of page_lock and transaction start in delalloc
mm: Add range_cont mode for writeback
ext4: delayed allocation ENOSPC handling
percpu_counter: new function percpu_counter_sum_and_set
ext4: Add delayed allocation support in data=writeback mode
vfs: add hooks for ext4's delayed allocation support
jbd2: Remove data=ordered mode support using jbd buffer heads
ext4: Use new framework for data=ordered mode in JBD2
jbd2: Implement data=ordered mode handling via inodes
vfs: export filemap_fdatawrite_range()
ext4: Fix lock inversion in ext4_ext_truncate()
ext4: Invert the locking order of page_lock and transaction start
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (52 commits)
IB/mlx4: Use kzalloc() for new QPs so flags are initialized to 0
mlx4_core: Use MOD_STAT_CFG command to get minimal page size
RDMA/cma: Simplify locking needed for serialization of callbacks
RDMA/addr: Keep pointer to netdevice in struct rdma_dev_addr
RDMA/cxgb3: Fixes for zero STag
RDMA/core: Add local DMA L_Key support
IB/mthca: Fix check of max_send_sge for special QPs
IB/mthca: Use round_jiffies() for catastrophic error polling timer
IB/mthca: Remove "stop" flag for catastrophic error polling timer
IPoIB: Double default RX/TX ring sizes
IPoIB/cm: Reduce connected mode TX object size
IB/ipath: Use IEEE OUI for vendor_id reported by ibv_query_device()
IPoIB: Use dev_set_mtu() to change mtu
IPoIB: Use rtnl lock/unlock when changing device flags
IPoIB: Get rid of ipoib_mcast_detach() wrapper
IPoIB: Only set Q_Key once: after joining broadcast group
IPoIB: Remove priv->mcast_mutex
IPoIB: Remove unused IPOIB_MCAST_STARTED code
RDMA/cxgb3: Set rkey field for new memory windows in iwch_alloc_mw()
RDMA/nes: Get rid of ring_doorbell parameter of nes_post_cqp_request()
...
Do not automatically "select" SCSI_DH for dm-multipath. If SCSI_DH
doesn't exist,just do not allow hardware handlers to be used.
Handle SCSI_DH being a module also. Make sure it doesn't allow DM_MULTIPATH
to be compiled in when SCSI_DH is a module.
[jejb: added comment for Kconfig syntax]
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This adds reading and using of enable_timeout from the CIS
Signed-off-by: Benzi Zbit <benzi.zbit@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
AS arch/x86/lib/csum-copy_64.o
arch/x86/lib/csum-copy_64.S: Assembler messages:
arch/x86/lib/csum-copy_64.S:48: Error: Macro `ignore' was already defined
make[1]: *** [arch/x86/lib/csum-copy_64.o] Error 1
make: *** [arch/x86/lib] Error 2
It appears that csum-copy_64.S and dwarf2.h both define an ignore macro.
I would expect one of them can be renamed quite easily, unless they
are references elsewhere.
Caused-by-commit: 392a0fc96b
x86: merge dwarf2 headers
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Relax requirements on host controllers and only require that they do not
report a transfer count than is larger than the actual one (i.e. a lower
value is okay). This is how many other parts of the kernel behaves so
upper layers should already be prepared to handle that scenario. This
gives us a performance boost on MMC cards.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
This is a driver for the MMC controller on the AP7000 chips from
Atmel. It should in theory work on AT91 systems too with some
tweaking, but since the DMA interface is quite different, it's not
entirely clear if it's worth merging this with the at91_mci driver.
This driver has been around for a while in BSPs and kernel sources
provided by Atmel, but this particular version uses the generic DMA
Engine framework (with the slave extensions) instead of an
avr32-only DMA controller framework.
This driver can also use PIO transfers when no DMA channels are
available, and for transfers where using DMA may be difficult or
impractical for some reason (e.g. the DMA setup overhead is usually
not worth it for very short transfers, and badly aligned buffers or
lengths are difficult to handle.)
Currently, the driver only support PIO transfers. DMA support has been
split out to a separate patch to hopefully make it easier to review.
The driver has been tested using mmc-block and ext3fs on several SD,
SDHC and MMC+ cards. Reads and writes work fine, with read transfer
rates up to 3.5 MiB/s on fast cards with debugging disabled.
The driver has also been tested using the mmc_test module on the same
cards. All tests except 7, 9, 15 and 17 succeed. The first two are
unsupported by all the cards I have, so I don't know if the driver
handles this correctly. The last two fail because the hardware flags a
Data CRC Error instead of a Data Timeout error. I'm not sure how to deal
with that.
Documentation for this controller can be found in many data sheets from
Atmel, including the AT32AP7000 data sheet which can be found here:
http://www.atmel.com/dyn/products/datasheets.asp?family_id=682
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
This patch fixes sdio_io sparse errors.
This fix changes signature of API functions,
changing
unsigned char -> u8
unsigned short -> u16
unsigned long -> u32 - this was probably a bug in 64 bit platforms
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Ensure that we have physical media present before attempting to
send a request to a card. This ensures that we do not get flooded
by errors from commands that can never be completed timing out.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Support for inverting the sense of the MMC driver's write
protect detection line.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
This patch adds platform data support to the s3mci driver. This allows
flexible board-specific configuration of set_power, card detect and read only
pins.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
This is the latest S3C MMC/SD driver by Thomas Kleffel
with cleanups as suggested by AKPM done by Ben Dooks.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Thomas Kleffel <tk@maintech.de>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
There are a lot of crappy controllers out there that cannot handle
all the request sizes that the MMC/SD/SDIO specifications require.
In case the card driver can pad the data to overcome the problems,
this commit adds a helper that calculates how much that padding
should be.
A corresponding helper is also added for SDIO, but it can also deal
with all the complexities of splitting up a large transfer efficiently.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Remove the DB1200 board-specific functions (card present, read-only,
activity LED methods) and instead add platform data which is passed
to the driver. This also allows for platforms to implement other
carddetect schemes (e.g. dedicated irq) without having to pollute the
driver code. The poll timer (used for pb1200) is kept for compatibility.
With the board-specific stuff gone, the driver's ->probe() code can be
cleaned up considerably.
Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
The at91 mci controller internal state machine seems to often crash. This can
be fixed by resetting the controller after each command for at91rm9200 and by
setting the MCI_BLKR register on at91sam926*.
Signed-off-by: Marc Pignat <marc.pignat@hevs.ch>
Signed-off-by: Hans J Koch <hjk@linutronix.de>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Now get_ro() callback must return 0/1 values for its logical states, and
negative errno values in case of error. If particular host instance doesn't
support RO/WP switch, it should return -ENOSYS.
This patch changes some hosts in two ways:
1. Now functions should be smart to not return negative values in
"RO asserted" case (particularly gpio_ calls could return negative
values for the outermost GPIOs).
Also, board code usually passes get_ro() callbacks that directly return
gpioreg & bit result, so at91_mci, imxmmc, pxamci and mmc_spi's get_ro()
handlers need take special care when returning platform's values to the
mmc core.
2. In case of host instance didn't implement get_ro() callback, it should
really return -ENOSYS and let the mmc core decide what to do about it
(mmc core thinks the same way as the hosts, so it isn't functional
change).
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
This patch adds new platform data variable "caps", so platforms
could pass theirs capabilities into MMC core (for example, platforms
without interrupt on the CD line will most probably want to pass
MMC_CAP_NEEDS_POLL).
New platform get_cd() callback provided to optimize polling.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Some hosts (and boards that use mmc_spi) do not use interrupts on the CD
line, so they can't trigger mmc_detect_change. We want to poll the card
and see if there was a change. 1 second poll interval seems resonable.
This patch also implements .get_cd() host operation, that could be used
by the hosts that are able to report card-detect status without need to
talk MMC.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
This patch removes a CVS tag that wasn't updated for a long time.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
JMicron chips sometimes have two interfaces to work around limitations
in Microsoft's sdhci driver. This patch allows us to use either interface.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Enable: PM_SUSPEND_MEM -> Blackfin Hibernate to SDRAM
This feature requires a special bootloader (u-boot)
supporting return from hibernate.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Add netif_addr_{lock,unlock}{,_bh}() helpers.
Use them to protect operations that operate on or read
the network device unicast and multicast address lists.
Also use them in cases where the code simply wants to
block calls into the driver's ->set_rx_mode() and
->set_multicast_list() methods.
Signed-off-by: David S. Miller <davem@davemloft.net>
This will be used to protect the per-device unicast and multicast
address lists, as well as the callbacks into the drivers which
configure such state such as ->set_rx_mode() and ->set_multicast_list().
Signed-off-by: David S. Miller <davem@davemloft.net>
Keep a pointer to the local (src) netdevice in struct rdma_dev_addr,
and copy it in as part of rdma_copy_addr(). Use rdma_translate_ip()
in cma_new_conn_id() to reduce some code duplication and also make
sure the src_dev member gets set.
In a high-availability configuration the netdevice pointer can be used
by the RDMA CM to align RDMA sessions to use the same links as the IP
stack does under fail-over and route change cases.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- Change the IB_DEVICE_ZERO_STAG flag to the transport-neutral name
IB_DEVICE_LOCAL_DMA_LKEY, which is used by iWARP RNICs to indicate 0
STag support and IB HCAs to indicate reserved L_Key support.
- Add a u32 local_dma_lkey member to struct ib_device. Drivers fill
this in with the appropriate local DMA L_Key (if they support it).
- Fix up the drivers using this flag.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add support for handling the IB_QP_CREATE_MULTICAST_BLOCK_LOOPBACK
flag by using the per-multicast group loopback blocking feature of
mlx4 hardware.
Signed-off-by: Ron Livne <ronli@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch also adds a creation flag for QPs,
IB_QP_CREATE_MULTICAST_BLOCK_LOOPBACK, which when set means that
multicast sends from the QP to a group that the QP is attached to will
not be looped back to the QP's receive queue. This can be used to
save receive resources when a consumer does not want a local copy of
multicast traffic; for example IPoIB must waste CPU time throwing away
such local copies of multicast traffic.
This patch also adds a device capability flag that shows whether a
device supports this feature or not.
Signed-off-by: Ron Livne <ronli@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch adds a sysfs attribute group called "proto_stats" under
/sys/class/infiniband/$device/ and populates this group with protocol
statistics if they exist for a given device. Currently, only iWARP
stats are defined, but the code is designed to allow InfiniBand
protocol stats if they become available. These stats are per-device
and more importantly -not- per port.
Details:
- Add union rdma_protocol_stats in ib_verbs.h. This union allows
defining transport-specific stats. Currently only iwarp stats are
defined.
- Add struct iw_protocol_stats to define the current set of iwarp
protocol stats.
- Add new ib_device method called get_proto_stats() to return protocol
statistics.
- Add logic in core/sysfs.c to create iwarp protocol stats attributes
if the device is an RNIC and has a get_proto_stats() method.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This patch adds support for the IB "base memory management extension"
(BMME) and the equivalent iWARP operations (which the iWARP verbs
mandates all devices must implement). The new operations are:
- Allocate an ib_mr for use in fast register work requests.
- Allocate/free a physical buffer lists for use in fast register work
requests. This allows device drivers to allocate this memory as
needed for use in posting send requests (eg via dma_alloc_coherent).
- New send queue work requests:
* send with remote invalidate
* fast register memory region
* local invalidate memory region
* RDMA read with invalidate local memory region (iWARP only)
Consumer interface details:
- A new device capability flag IB_DEVICE_MEM_MGT_EXTENSIONS is added
to indicate device support for these features.
- New send work request opcodes IB_WR_FAST_REG_MR, IB_WR_LOCAL_INV,
IB_WR_RDMA_READ_WITH_INV are added.
- A new consumer API function, ib_alloc_mr() is added to allocate
fast register memory regions.
- New consumer API functions, ib_alloc_fast_reg_page_list() and
ib_free_fast_reg_page_list() are added to allocate and free
device-specific memory for fast registration page lists.
- A new consumer API function, ib_update_fast_reg_key(), is added to
allow the key portion of the R_Key and L_Key of a fast registration
MR to be updated. Consumers call this if desired before posting
a IB_WR_FAST_REG_MR work request.
Consumers can use this as follows:
- MR is allocated with ib_alloc_mr().
- Page list memory is allocated with ib_alloc_fast_reg_page_list().
- MR R_Key/L_Key "key" field is updated with ib_update_fast_reg_key().
- MR made VALID and bound to a specific page list via
ib_post_send(IB_WR_FAST_REG_MR)
- MR made INVALID via ib_post_send(IB_WR_LOCAL_INV),
ib_post_send(IB_WR_RDMA_READ_WITH_INV) or an incoming send with
invalidate operation.
- MR is deallocated with ib_dereg_mr()
- page lists dealloced via ib_free_fast_reg_page_list().
Applications can allocate a fast register MR once, and then can
repeatedly bind the MR to different physical block lists (PBLs) via
posting work requests to a send queue (SQ). For each outstanding
MR-to-PBL binding in the SQ pipe, a fast_reg_page_list needs to be
allocated (the fast_reg_page_list is owned by the low-level driver
from the consumer posting a work request until the request completes).
Thus pipelining can be achieved while still allowing device-specific
page_list processing.
The 32-bit fast register memory key/STag is composed of a 24-bit index
and an 8-bit key. The application can change the key each time it
fast registers thus allowing more control over the peer's use of the
key/STag (ie it can effectively be changed each time the rkey is
rebound to a page list).
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Remove subversion $Id lines and improve readability by fixing other
coding style problems pointed out by checkpatch.pl.
Signed-off-by: Dotan Barak <dotanba@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The license text for several files references a third software license
that was inadvertently copied in. Update the license to what was
intended. This update was based on a request from HP.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
There are ICMP_XXX_STATS that are not used in the kernel, so I remove
them, not to "just patch" them later. But if there's some sense in
keeping them, kick me - I will remake this set keeping them.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This routine deals with ICMP statistics, but doesn't have a
struct net at hands, so add one.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Store the VLAN tag in the auxillary data/tpacket2_hdr so userspace can
properly deal with hardware VLAN tagging/stripping.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The tpacket_hdr is not 64 bit clean due to use of an unsigned long
and can't be extended because the following struct sockaddr_ll needs
to be at a fixed offset.
Add support for a version 2 tpacket protocol that removes these
limitations.
Userspace can query the header size through a new getsockopt option
and change the protocol version through a setsockopt option. The
changes needed to switch to the new protocol version are:
1. replace struct tpacket_hdr by struct tpacket2_hdr
2. query header len and save
3. set protocol version to 2
- set up ring as usual
4. for getting the sockaddr_ll, use (void *)hdr + TPACKET_ALIGN(hdrlen)
instead of (void *)hdr + TPACKET_ALIGN(sizeof(struct tpacket_hdr))
Steps 2 and 4 can be omitted if the struct sockaddr_ll isn't needed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When VLAN header stripping is used, packets currently bypass packet
sockets (and other network taps) completely. For locally existing
VLANs, they appear directly on the VLAN device, for unknown VLANs
they are silently dropped.
Add a new function netif_nit_deliver() to deliver incoming packets
to all network interface taps and use it in __vlan_hwaccel_rx() to
make VLAN packets visible on the underlying device.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use a real skb member to store the skb to avoid clashes with qdiscs,
which are allowed to use the cb area themselves. As currently only real
devices that consume the skb set the NETIF_F_HW_VLAN_TX flag, no explicit
invalidation is neccessary.
The new member fills a hole on 64 bit, the skb layout changes from:
__u32 mark; /* 172 4 */
sk_buff_data_t transport_header; /* 176 4 */
sk_buff_data_t network_header; /* 180 4 */
sk_buff_data_t mac_header; /* 184 4 */
sk_buff_data_t tail; /* 188 4 */
/* --- cacheline 3 boundary (192 bytes) --- */
sk_buff_data_t end; /* 192 4 */
/* XXX 4 bytes hole, try to pack */
to
__u32 mark; /* 172 4 */
__u16 vlan_tci; /* 176 2 */
/* XXX 2 bytes hole, try to pack */
sk_buff_data_t transport_header; /* 180 4 */
sk_buff_data_t network_header; /* 184 4 */
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
With new userspace libpciaccess we can get a conflicting mapping
on the PCIE GART table in the video RAM. Always try and map it _wc.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Because the pte is now 64-bits the compiler was optimizing the update
to always clear the upper 32-bits of the pte. We need to ensure the
clr mask is treated as an unsigned long long to get the proper behavior.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch eliminates an unneeded parameter when creating a low-level
TIPC port object. Instead of returning both the pointer to the port
structure and the port's reference ID, it now returns only the pointer
since the port structure contains the reference ID as one of its fields.
Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Please see the following thread to get some context on this
http://marc.info/?l=linux-netdev&m=121564433018903&w=2
Basically the issue is that current multi-cast filtering stuff in
the TUN/TAP driver is seriously broken.
Original patch went in without proper review and ACK. It was broken and
confusing to start with and subsequent patches broke it completely.
To give you an idea of what's broken here are some of the issues:
- Very confusing comments throughout the code that imply that the
character device is a network interface in its own right, and that packets
are passed between the two nics. Which is completely wrong.
- Wrong set of ioctls is used for setting up filters. They look like
shortcuts for manipulating state of the tun/tap network interface but
in reality manipulate the state of the TX filter.
- ioctls that were originally used for setting address of the the TX filter
got "fixed" and now set the address of the network interface itself. Which
made filter totaly useless.
- Filtering is done too late. Instead of filtering early on, to avoid
unnecessary wakeups, filtering is done in the read() call.
The list goes on and on :)
So the patch cleans all that up. It introduces simple and clean interface for
setting up TX filters (TUNSETTXFILTER + tun_filter spec) and does filtering
before enqueuing the packets.
TX filtering is useful in the scenarios where TAP is part of a bridge, in
which case it gets all broadcast, multicast and potentially other packets when
the bridge is learning. So for example Ethernet tunnelling app may want to
setup TX filters to avoid tunnelling multicast traffic. QEMU and other
hypervisors can push RX filtering that is currently done in the guest into the
host context therefore saving wakeups and unnecessary data transfer.
Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 70f1bba4 ("x86: use ignore macro instead of hash comment") breaks
the 64-bit x86 build on toolchains that have CONFIG_AS_CFI undefined with:
arch/x86/lib/csum-copy_64.S:48: Error: Macro `ignore' was already defined
because <asm/dwarf2.h> now uses the ignore macro name itself. Fix this
by changing to __cfi_ignore in dwarf2.h.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
giveup_vsx didn't save the FPU and VMX regsiters. Change it to be
like giveup_fpr/altivec which save these registers.
Also update call sites where FPU and VMX are already saved to use the
original giveup_vsx (renamed to __giveup_vsx).
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Background from Maynard Johnson:
As of POWER6, a set of 32 common events is defined that must be
supported on all future POWER processors. The main impetus for this
compat set is the need to support partition migration, especially from
processor P(n) to processor P(n+1), where performance software that's
running in the new partition may not be knowledgeable about processor
P(n+1). If a performance tool determines it does not support the
physical processor, but is told (via the
PPC_FEATURE_PSERIES_PERFMON_COMPAT bit) that the processor supports
the notion of the PMU compat set, then the performance tool can
surface just those events to the user of the tool.
PPC_FEATURE_PSERIES_PERFMON_COMPAT indicates that the PMU supports at
least this basic subset of events which is compatible across POWER
processor lines.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit ef3d3246a0 ("powerpc/mm: Add Strong
Access Ordering support") in the powerpc/{next,master} tree caused the
following in a powerpc allmodconfig build:
usr/include/asm/mman.h requires linux/mm.h, which does not exist in exported headers
We should not use CONFIG_PPC64 in an unprotected (by __KERNEL__)
section of an exported include file and linux/mm.h is not exported. So
protect the whole section that is CONFIG_PPC64 with __KERNEL__ and put
the two introduced includes in there as well.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'for-2.6.27' of git://git.infradead.org/users/dwmw2/firmware-2.6: (64 commits)
firmware: convert sb16_csp driver to use firmware loader exclusively
dsp56k: use request_firmware
edgeport-ti: use request_firmware()
edgeport: use request_firmware()
vicam: use request_firmware()
dabusb: use request_firmware()
cpia2: use request_firmware()
ip2: use request_firmware()
firmware: convert Ambassador ATM driver to request_firmware()
whiteheat: use request_firmware()
ti_usb_3410_5052: use request_firmware()
emi62: use request_firmware()
emi26: use request_firmware()
keyspan_pda: use request_firmware()
keyspan: use request_firmware()
ttusb-budget: use request_firmware()
kaweth: use request_firmware()
smctr: use request_firmware()
firmware: convert ymfpci driver to use firmware loader exclusively
firmware: convert maestro3 driver to use firmware loader exclusively
...
Fix up trivial conflicts with BKL removal in drivers/char/dsp56k.c and
drivers/char/ip2/ip2main.c manually.
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (241 commits)
[ARM] 5171/1: ep93xx: fix compilation of modules using clocks
[ARM] 5133/2: at91sam9g20 defconfig file
[ARM] 5130/4: Support for the at91sam9g20
[ARM] 5160/1: IOP3XX: gpio/gpiolib support
[ARM] at91: Fix NAND FLASH timings for at91sam9x evaluation kits.
[ARM] 5084/1: zylonite: Register AC97 device
[ARM] 5085/2: PXA: Move AC97 over to the new central device declaration model
[ARM] 5120/1: pxa: correct platform driver names for PXA25x and PXA27x UDC drivers
[ARM] 5147/1: pxaficp_ir: drop pxa_gpio_mode calls, as pin setting
[ARM] 5145/1: PXA2xx: provide api to control IrDA pins state
[ARM] 5144/1: pxaficp_ir: cleanup includes
[ARM] pxa: remove pxa_set_cken()
[ARM] pxa: allow clk aliases
[ARM] Feroceon: don't disable BPU on boot
[ARM] Orion: LED support for HP mv2120
[ARM] Orion: add RD88F5181L-FXO support
[ARM] Orion: add RD88F5181L-GE support
[ARM] Orion: add Netgear WNR854T support
[ARM] s3c2410_defconfig: update for current build
[ARM] Acer n30: Minor style and indentation fixes.
...
This includes PXA work up to the SPI changes for the initial merge,
since e172274ccc depends on the SPI
tree being merged.
Conflicts:
arch/arm/configs/em_x270_defconfig
arch/arm/configs/xm_x270_defconfig
* 'core/softirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
softirq: remove irqs_disabled warning from local_bh_enable
softirq: remove initialization of static per-cpu variable
Remove argument from open_softirq which is always NULL
* 'core/printk' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, generic: mark early_printk as asmlinkage
printk: export console_drivers
printk: remember the message level for multi-line output
printk: refactor processing of line severity tokens
printk: don't prefer unsuited consoles on registration
printk: clean up recursion check related static variables
namespacecheck: more kernel/printk.c fixes
namespacecheck: fix kernel printk.c
* 'core/locking' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
lockdep: fix kernel/fork.c warning
lockdep: fix ftrace irq tracing false positive
lockdep: remove duplicate definition of STATIC_LOCKDEP_MAP_INIT
lockdep: add lock_class information to lock_chain and output it
lockdep: add lock_class information to lock_chain and output it
lockdep: output lock_class key instead of address for forward dependency output
__mutex_lock_common: use signal_pending_state()
mutex-debug: check mutex magic before owner
Fixed up conflict in kernel/fork.c manually
* 'sched/new-API-sched_setscheduler' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: add new API sched_setscheduler_nocheck: add a flag to control access checks
* 'tracing/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (228 commits)
ftrace: build fix for ftraced_suspend
ftrace: separate out the function enabled variable
ftrace: add ftrace_kill_atomic
ftrace: use current CPU for function startup
ftrace: start wakeup tracing after setting function tracer
ftrace: check proper config for preempt type
ftrace: trace schedule
ftrace: define function trace nop
ftrace: move sched_switch enable after markers
ftrace: prevent ftrace modifications while being kprobe'd, v2
fix "ftrace: store mcount address in rec->ip"
mmiotrace broken in linux-next (8-bit writes only)
ftrace: avoid modifying kprobe'd records
ftrace: freeze kprobe'd records
kprobes: enable clean usage of get_kprobe
ftrace: store mcount address in rec->ip
ftrace: build fix with gcc 4.3
namespacecheck: fixes
ftrace: fix "notrace" filtering priority
ftrace: fix printout
...
drivers/pci/pci.c needs pm_wakeup.h since it uses device_set_wakup_capable().
The latter also needs to be stubbed out for !CONFIG_PM.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The configfs operations ->make_item() and ->make_group() currently
return a new item/group. A return of NULL signifies an error. Because
of this, -ENOMEM is the only return code bubbled up the stack.
Multiple folks have requested the ability to return specific error codes
when these operations fail. This patch adds that ability by changing the
->make_item/group() ops to return an int.
Also updated are the in-kernel users of configfs.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
* 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (76 commits)
sched_clock: and multiplier for TSC to gtod drift
sched_clock: record TSC after gtod
sched_clock: only update deltas with local reads.
sched_clock: fix calculation of other CPU
sched_clock: stop maximum check on NO HZ
sched_clock: widen the max and min time
sched_clock: record from last tick
sched: fix accounting in task delay accounting & migration
sched: add avg-overlap support to RT tasks
sched: terminate newidle balancing once at least one task has moved over
sched: fix warning
sched: build fix
sched: sched_clock_cpu() based cpu_clock(), lockdep fix
sched: export cpu_clock
sched: make sched_{rt,fair}.c ifdefs more readable
sched: bias effective_load() error towards failing wake_affine().
sched: incremental effective_load()
sched: correct wakeup weight calculations
sched: fix mult overflow
sched: update shares on wakeup
...
Add a mechanism to let new-style i2c drivers optionally autodetect
devices they would support on selected buses and ask i2c-core to
instantiate them. This is a replacement for legacy i2c drivers, much
cleaner.
Where drivers had to implement both a legacy i2c_driver and a
new-style i2c_driver so far, this mechanism makes it possible to get
rid of the legacy i2c_driver and implement both enumerated and
detected device support with just one (new-style) i2c_driver.
Here is a quick conversion guide for these drivers, step by step:
* Delete the legacy driver definition, registration and removal.
Delete the attach_adapter and detach_client methods of the legacy
driver.
* Change the prototype of the legacy detect function from
static int foo_detect(struct i2c_adapter *adapter, int address, int kind);
to
static int foo_detect(struct i2c_client *client, int kind,
struct i2c_board_info *info);
* Set the new-style driver detect callback to this new function, and
set its address_data to &addr_data (addr_data is generally provided
by I2C_CLIENT_INSMOD.)
* Add the appropriate class to the new-style driver. This is
typically the class the legacy attach_adapter method was checking
for. Class checking is now mandatory (done by i2c-core.) See
<linux/i2c.h> for the list of available classes.
* Remove the i2c_client allocation and freeing from the detect
function. A pre-allocated client is now handed to you by i2c-core,
and is freed automatically.
* Make the detect function fill the type field of the i2c_board_info
structure it was passed as a parameter, and return 0, on success. If
the detection fails, return -ENODEV.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Add a new-style driver for most I2C EEPROMs, giving sysfs read/write
access to their data. Tested with various chips and clock rates.
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Export the root of the i2c bus so that PowerPC device tree code can
iterate over devices on the i2c bus.
Signed-off-by: Jon Smirl <jonsmirl@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Implementing detach_client is optional, so there is no point in
an empty implementation.
Likewise, i2c driver IDs are optional, and we don't need one.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Struct members udelay and timeout aren't used anywhere, so drop them.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Eric Brower <ebrower@gmail.com>
Improve lost-arbitration handling of PCF8584. This is necessary for
support of a currently out-of-kernel driver for Sun Microsystems E250
environmental management; perhaps others.
Signed-off-by: Eric Brower <ebrower@gmail.com>
Acked-by: Dan Smolik <marvin@mydatex.cz>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Let general purpose I2C/SMBus bus drivers add SPD to their class. Once
this is done, we will be able to tell the eeprom driver to only probe
for SPD EEPROMs and similar on these buses.
Note that I took a conservative approach here, adding I2C_CLASS_SPD to
many drivers that have no idea whether they can host SPD EEPROMs or not.
This is to make sure that the eeprom driver doesn't stop probing buses
where SPD EEPROMs or equivalent live.
So, bus driver maintainers and users should feel free to remove the SPD
class from drivers those buses never have SPD EEPROMs or they don't
want the eeprom driver to bind to them. Likewise, feel free to add the
SPD class to any bus driver I might have missed.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Let framebuffer drivers set their I2C bus class to DDC. Once this is
done, we will be able to tell the eeprom driver to only probe for
EDID EEPROMs on these buses.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Function i2c_smbus_write_quick has no users left, so we can delete it.
Also update the list of these helper functions which are gone but
could be added back if needed.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
This patch contains the scheduled removal of i2c-i810, i2c-prosavage
and i2c-savage4.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6: (31 commits)
avr32: Fix typo of IFSR in a comment in the PIO header file
avr32: Power Management support ("standby" and "mem" modes)
avr32: Add system device for the internal interrupt controller (intc)
avr32: Add simple SRAM allocator
avr32: Enable SDRAMC clock at startup
rtc-at32ap700x: Enable wakeup
macb: Basic suspend/resume support
atmel_serial: Drain console TX shifter before suspending
atmel_serial: Fix build on avr32 with CONFIG_PM enabled
avr32: Use a quicklist for PTE allocation as well
avr32: Use a quicklist for PGD allocation
avr32: Cover the kernel page tables in the user PGDs
avr32: Store virtual addresses in the PGD
avr32: Remove useless zeroing of swapper_pg_dir at startup
avr32: Clean up and optimize the TLB operations
avr32: Rename at32ap.c -> pdc.c
avr32: Move setup_platform() into chip-specific file
avr32: Kill special exception handler sections
avr32: Kill unneeded #include <asm/pgalloc.h> from asm/mmu_context.h
avr32: Clean up time.c #includes
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (25 commits)
security: remove register_security hook
security: remove dummy module fix
security: remove dummy module
security: remove unused sb_get_mnt_opts hook
LSM/SELinux: show LSM mount options in /proc/mounts
SELinux: allow fstype unknown to policy to use xattrs if present
security: fix return of void-valued expressions
SELinux: use do_each_thread as a proper do/while block
SELinux: remove unused and shadowed addrlen variable
SELinux: more user friendly unknown handling printk
selinux: change handling of invalid classes (Was: Re: 2.6.26-rc5-mm1 selinux whine)
SELinux: drop load_mutex in security_load_policy
SELinux: fix off by 1 reference of class_to_string in context_struct_compute_av
SELinux: open code sidtab lock
SELinux: open code load_mutex
SELinux: open code policy_rwlock
selinux: fix endianness bug in network node address handling
selinux: simplify ioctl checking
SELinux: enable processes with mac_admin to get the raw inode contexts
Security: split proc ptrace checking into read vs. attach
...
* 'for-linus' of git://git.alsa-project.org/alsa-kernel: (179 commits)
ALSA: Release v1.0.17
ALSA: correct kcalloc usage
ALSA: ALSA driver for SGI O2 audio board
ALSA: asoc: kbuild - only show menus for the current ASoC CPU platform.
ALSA: ALSA driver for SGI HAL2 audio device
ALSA: hda - Fix FSC V5505 model
ALSA: hda - Fix missing init for unsol events on micsense model
ALSA: hda - Fix internal mic vref pin setup
ALSA: hda: 92hd71bxx PC Beep
ALSA: HDA - HP dc7600 with pci sub IDs 0x103c/0x3011 belongs to hp-3013 model
ALSA: usb-audio: add some Yamaha USB MIDI quirks
ALSA: usb-audio: fix Yamaha KX quirk
ALSA: ASoC: Au12x0/Au1550 PSC Audio support
ALSA: Add Yamaha KX49 (USB MIDI controller) to usbquirks.h
ALSA: ASoC: pxa2xx-ac97: fix warning due to missing argument in fuction declaration
ALSA: tosa: fix compilation with new DAPM API
ALSA: wavefront - add const
ALSA: remove CONFIG_KMOD from sound
ALSA: Fix a const to non-const assignment in the Digigram VXpocket sound driver
ALSA: Fix a const pointer usage warning in the Digigram VX soundcard driver
...
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: (37 commits)
splice: fix generic_file_splice_read() race with page invalidation
ramfs: enable splice write
drivers/block/pktcdvd.c: avoid useless memset
cdrom: revert commit 22a9189 (cdrom: use kmalloced buffers instead of buffers on stack)
scsi: sr avoids useless buffer allocation
block: blk_rq_map_kern uses the bounce buffers for stack buffers
block: add blk_queue_update_dma_pad
DAC960: push down BKL
pktcdvd: push BKL down into driver
paride: push ioctl down into driver
block: use get_unaligned_* helpers
block: extend queue_flag bitops
block: request_module(): use format string
Add bvec_merge_data to handle stacked devices and ->merge_bvec()
block: integrity flags can't use bit ops on unsigned short
cmdfilter: extend default read filter
sg: fix odd style (extra parenthesis) introduced by cmd filter patch
block: add bounce support to blk_rq_map_user_iov
cfq-iosched: get rid of enable_idle being unused warning
allow userspace to modify scsi command filter on per device basis
...
Add Enclosure Management support to libata and ahci.
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
ATA_TMOUT_INTERNAL which was 30secs were used for all internal
commands which is way too long when something goes wrong. This patch
implements command type based stepped timeouts. Different command
types can use different timeouts and each command type can use
different timeout values after timeouts.
ie. the initial timeout is set to a value which should cover most of
the cases but not too long so that run away cases don't delay things
too much. After the first try times out, the second try can use
longer timeout and if that one times out too, it can go for full 30sec
timeout.
IDENTIFYs use 5s - 10s - 30s timeout and all other commands use 5s -
10s timeouts.
This patch significantly cuts down the needed time to handle failure
cases while still allowing libata to work with nut job devices through
retries.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
EH retries were delayed by 5 seconds to ensure that resets don't occur
back-to-back. However, this 5 second delay is superflous or excessive
in many cases. For example, after IDENTIFY times out, there's no
reason to wait five more seconds before retrying.
This patch adds ehc->last_reset timestamp and record the timestamp for
the last reset trial or success and uses it to space resets by
ATA_EH_RESET_COOL_DOWN which is 5 secs and removes unconditional 5 sec
sleeps.
As this change makes inter-try waits often shorter and they're
redundant in nature, this patch also removes the "retrying..."
messages.
While at it, convert explicit rounding up division to DIV_ROUND_UP().
This change speeds up EH in many cases w/o sacrificing robustness.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
libata has been using mix of jiffies and msecs for time druations.
This is getting confusing. As writing sub HZ values in jiffies is
PITA and msecs_to_jiffies() can't be used as initializer, unify unit
for all time durations to msecs. So, durations are in msecs and
deadlines are in jiffies. ata_deadline() is added to compute deadline
from a start time and duration in msecs.
While at it, drop now superflous _msec suffix from arguments and
rename @timeout to @deadline if it represents a fixed point in time
rather than duration.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Multiple issues:
- there are no "default" values needed
- cw_min/cw_max can be larger than documented
- restructure to decrease size
- use get_unaligned_le16
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch makes mac80211 assign proper sequence numbers to
QoS-data frames. It also removes the old sequence number code
because we noticed that only the driver or hardware can assign
sequence numbers to non-QoS-data and especially management
frames in a race-free manner because beacons aren't passed
through mac80211's TX path.
This patch also adds temporary code to the rt2x00 drivers to
not break them completely, that code will have to be reworked
for proper sequence numbers on beacons.
It also moves sequence number assignment down in the TX path
so no sequence numbers are assigned to frames that are dropped.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
ssb.h implements DMA mapping functions, so it should
include dma-mapping.h. This fixes compile failures on certain architectures.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch changes mac80211's beacon configuration handling
to never pass skbs to the driver directly but rather always
require the driver to use ieee80211_beacon_get(). Additionally,
it introduces "change flags" on the config_interface() call
to enable drivers to figure out what is changing. Finally, it
removes the beacon_update() driver callback in favour of
having IBSS beacon delivered by ieee80211_beacon_get() as well.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch implements the power management routines wireless extensions
for mac80211.
For now we only support switching PS mode between on and off.
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When switching a RFCOMM socket to a TTY, the remote modem status might
be needed later. Currently it is lost since the original configuration
is done via the socket interface. So store the modem status and reply
it when the socket has been converted to a TTY.
Signed-off-by: Denis Kenzior <denis.kenzior@trolltech.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Enable the common timestamp functionality that the network subsystem
provides for L2CAP, RFCOMM and SCO sockets. It is possible to either
use SO_TIMESTAMP or the IOCTLs to retrieve the timestamp of the
current packet.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
With the Simple Pairing support, the authentication requirements are
an explicit setting during the bonding process. Track and enforce the
requirements and allow higher layers like L2CAP and RFCOMM to increase
them if needed.
This patch introduces a new IOCTL that allows to query the current
authentication requirements. It is also possible to detect Simple
Pairing support in the kernel this way.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Bluetooth technology introduces new features on a regular basis
and for some of them it is important that the hardware on both sides
support them. For features like Simple Pairing it is important that
the host stacks on both sides have switched this feature on. To make
valid decisions, a config stage during ACL link establishment has been
introduced that retrieves remote features and if needed also the remote
extended features (known as remote host features) before signalling
this link as connected.
This change introduces full reference counting of incoming and outgoing
ACL links and the Bluetooth core will disconnect both if no owner of it
is present. To better handle interoperability during the pairing phase
the disconnect timeout for incoming connections has been increased to
10 seconds. This is five times more than for outgoing connections.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Simple Pairing process can only be used if both sides have the
support enabled in the host stack. The current Bluetooth specification
has three ways to detect this support.
If an Extended Inquiry Result has been sent during inquiry then it
is safe to assume that Simple Pairing is enabled. It is not allowed
to enable Extended Inquiry without Simple Pairing. During the remote
name request phase a notification with the remote host supported
features will be sent to indicate Simple Pairing support. Also the
second page of the remote extended features can indicate support for
Simple Pairing.
For all three cases the value of remote Simple Pairing mode is stored
in the inquiry cache for later use.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Simple Pairing feature is optional and needs to be enabled by the
host stack first. The Linux kernel relies on the Bluetooth daemon to
either enable or disable it, but at any time it needs to know the
current state of the Simple Pairing mode. So track any changes made
by external entities and store the current mode in the HCI device
structure.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
During the Simple Pairing process the HCI disconnect timer must be
disabled. The way to do this is by holding a reference count of the
HCI connection. The Simple Pairing process on both sides starts with
an IO Capabilities Request and ends with Simple Pairing Complete.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Bluetooth specification supports the default link policy settings
on a per host controller basis. For every new connection the link
manager would then use these settings. It is better to use this instead
of bothering the controller on every connection setup to overwrite the
default settings.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The connection packet type can be changed after the connection has been
established and thus needs to be properly tracked to ensure that the
host stack has always correct and valid information about it.
On incoming connections the Bluetooth core switches the supported packet
types to the configured list for this controller. However the usefulness
of this feature has been questioned a lot. The general consent is that
every Bluetooth host stack should enable as many packet types as the
hardware actually supports and leave the decision to the link manager
software running on the Bluetooth chip.
When running on Bluetooth 2.0 or later hardware, don't change the packet
type for incoming connections anymore. This hardware likely supports
Enhanced Data Rate and thus leave it completely up to the link manager
to pick the best packet type.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Bluetooth specification allows to enable or disable the encryption
of an ACL link at any time by either the peer or the remote device. If
a L2CAP or RFCOMM connection requested an encrypted link, they will now
disconnect that link if the encryption gets disabled. Higher protocols
that don't care about encryption (like SDP) are not affected.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Recent tests with various Bluetooth headsets have shown that some of
them don't enforce authentication and encryption when connecting. All
of them leave it up to the host stack to enforce it. Non of them should
allow unencrypted connections, but that is how it is. So in case the
link mode settings require authentication and/or encryption it will now
also be enforced on outgoing RFCOMM connections. Previously this was
only done for incoming connections.
This support has a small drawback from a protocol level point of view
since the host stack can't really tell with 100% certainty if a remote
side is already authenticated or not. So if both sides are configured
to enforce authentication it will be requested twice. Most Bluetooth
chips are caching this information and thus no extra authentication
procedure has to be triggered over-the-air, but it can happen.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
This patch exports the 'sync_sb_inodes()' which is needed for
UBIFS because it has to force write-back from time to time.
Namely, the UBIFS budgeting subsystem forces write-back when
its pessimistic callculations show that there is no free
space on the media.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Add support for the MPC8536 process and MPC8536DS reference board. The
MPC8536 is an e500v2 based SoC which eTSEC, USB, SATA, PCI, and PCIe.
The USB and SATA IP blocks are similiar to those on the PQ2 Pro SoCs and
thus use the same drivers.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Adds a new scsi_device flag, start_stop_pwr_cond: If enabled, the sd
driver will not send plain START STOP UNIT commands but ones with the
power condition field set to 3 (standby) or 1 (active) respectively.
Some FireWire disk firmwares do not stop the motor if power condition is
zero. Or worse, they become unresponsive after a START STOP UNIT with
power condition = 0 and start = 0.
http://lkml.org/lkml/2008/4/29/704
This patch only adds the necessary code to sd_mod but doesn't activate
it. Follow-up patches to the FireWire drivers will add detection of
affected devices and enable the code for them.
I did not add power condition values to scsi_error.c::scsi_eh_try_stu()
for now. The three firmwares which suffer from above mentioned problems
do not need START STOP UNIT in the error handler, and they are not
adversely affected by START STOP UNIT with power condition = 0 and start
= 1 (like scsi_eh_try_stu() sends it if scsi_device.allow_restart is
enabled).
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: Tino Keitel <tino.keitel@gmx.de>
Don't write conflicting data to EBIU_SDBCTL after the SDRAM is
configured. This can cause data corruption, since we might change SDRAM
row and column addressing modes.
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Most likely it is broken anyway because of the changes in memory
detection. Since we can't test it and there are probably better ways
that using a P390 card, remove support for it.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Move memory detection code to own file and also simplify it.
Also add an interface which can be called at any time to get the
current memory layout. This interface is needed by our kernel
internal system dumper.
Cc: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Cc: Michael Holzheu <holzheu@de.ibm.com>
Cc: Frank Munzert <munzert@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Some macros in pgtable.h access members from struct task_struct.
Currently always works since sched.h seems always to be included
before asm/pgtable.h. Unfortunately that is not anymore true with
Jeremy Fitzhardinge's ptep_modify_prot transaction abstraction patch.
So fix this.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add support for new micro code load of CEX2C and CEX2A adapters,
which uses different IDs. This patch just adds the IDs to the
existing drivers.
Signed-off-by: Ralph Wuerthner <ralph.wuerthner@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Now it is possible to specify additional kernel parameters on the IPL
command line using the IPL PARM option.
If the Linux system is already running, the new reipl sysfs attribute
'parm' can be used to change kernel parameters for the next reboot.
Examples:
IPL C PARM dasd=1234 root=/dev/dasda1
IPL 1234 PARM savesys=mylnxnss
echo "init=/bin/bash" > /sys/firmware/reipl/ccw/parm
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
The idle notifier chain consists of at most one element. So there's
no point in having a notifier chain. Remove it and directly call the
function.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch adds a driver for subchannels of type chsc.
A device /dev/chsc is created which may be used to issue ioctls to:
- obtain information about the machine's I/O configuration
- dynamically change the machine's I/O configuration via
asynchronous chsc commands
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
css_device_id exists, so use it for determining the right driver
(and add a match_flags which is always 1 for valid types).
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
This interface makes it easy for drivers to register usage of different
I/O interruption subclasses without needing to worry about possible
other users of the same isc.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Enhance the adapter interruption API so that device drivers can
register a handler for a specific interruption subclass. This
will allow different device drivers to move to differently
prioritized subclasses in order to avoid congestion.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Replace the numeric values for I/O interruption subclass usage
with abstract definitions and collect them all in asm/isc.h.
This gives us a better overview of which iscs are actually used
and makes it possible to better spread out isc usage in the
future.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Add support for clock synchronization with the server time protocol.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Add the user_regset definitions for normal and compat processes, replace
the dump_regs core dump cruft with the generic CORE_DUMP_USER_REGSET and
replace binfmt_elf32.c with the generic compat_binfmt_elf.c implementation.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Provide functions which can be used to incrementally construct fcx
enabled I/O control blocks.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Provide functions for assembling and starting fcx enabled I/O request
blocks.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Extend the scsw data structure to the format required by fcx. Also
provide helper functions for easier access to fields which are present
in both the traditional as well as the modified format.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Add modalias and subchannel type attributes for all subchannels.
I/O subchannel specific attributes are now created in
io_subchannel_probe(). modalias and subchannel type are also
added to the uevent for the css bus. Also make the css modalias
known.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
This patch adds a new ALSA driver for the audio device found inside
most of the SGI O2 workstation. The hardware uses a SGI custom chip,
which feeds a AD codec chip.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
The register security hook is no longer required, as the capability
module is always registered. LSMs wishing to stack capability as
a secondary module should do so explicitly.
Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
The sb_get_mnt_opts() hook is unused, and is superseded by the
sb_show_options() hook.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: James Morris <jmorris@namei.org>
This patch causes SELinux mount options to show up in /proc/mounts. As
with other code in the area seq_put errors are ignored. Other LSM's
will not have their mount options displayed until they fill in their own
security_sb_show_options() function.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: James Morris <jmorris@namei.org>
Enable security modules to distinguish reading of process state via
proc from full ptrace access by renaming ptrace_may_attach to
ptrace_may_access and adding a mode argument indicating whether only
read access or full attach access is requested. This allows security
modules to permit access to reading process state without granting
full ptrace access. The base DAC/capability checking remains unchanged.
Read access to /proc/pid/mem continues to apply a full ptrace attach
check since check_mem_permission() already requires the current task
to already be ptracing the target. The other ptrace checks within
proc for elements like environ, maps, and fds are changed to pass the
read mode instead of attach.
In the SELinux case, we model such reading of process state as a
reading of a proc file labeled with the target process' label. This
enables SELinux policy to permit such reading of process state without
permitting control or manipulation of the target process, as there are
a number of cases where programs probe for such information via proc
but do not need to be able to control the target (e.g. procps,
lsof, PolicyKit, ConsoleKit). At present we have to choose between
allowing full ptrace in policy (more permissive than required/desired)
or breaking functionality (or in some cases just silencing the denials
via dontaudit rules but this can hide genuine attacks).
This version of the patch incorporates comments from Casey Schaufler
(change/replace existing ptrace_may_attach interface, pass access
mode), and Chris Wright (provide greater consistency in the checking).
Note that like their predecessors __ptrace_may_attach and
ptrace_may_attach, the __ptrace_may_access and ptrace_may_access
interfaces use different return value conventions from each other (0
or -errno vs. 1 or 0). I retained this difference to avoid any
changes to the caller logic but made the difference clearer by
changing the latter interface to return a bool rather than an int and
by adding a comment about it to ptrace.h for any future callers.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: James Morris <jmorris@namei.org>
With the coming of kernel based modesetting and the memory manager stuff,
the everything in one directory approach was getting very ugly and
starting to be unmanageable.
This restructures the drm along the lines of other kernel components.
It creates a drivers/gpu/drm directory and moves the hw drivers into
subdirectores. It moves the includes into an include/drm, and
sets up the unifdef for the userspace headers we should be exporting.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reduce sizeof struct file_lock by 8 on 64 bit builds allowing +1 objects
per slab in the file_lock_cache
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
* Strengthen the return type for the _node_to_cpumask_ptr to be
a const pointer. This adds compiler checking to insure that
node_to_cpumask_map[] is not changed inadvertently.
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: "akpm@linux-foundation.org" <akpm@linux-foundation.org>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Acked-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
RFC 4340, 7.7 specifies up to 6 bytes for the NDP Count option, whereas the code
is currently limited to up to 3 bytes. This seems to be a relict of an earlier
draft version and is brought up to date by the patch.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Now that IRQ2 is never made available to the I/O APIC, there is no need
to special-case it and mask as a workaround for broken systems. Actually,
because of the former, mask_IO_APIC_irq(2) is a no-op already.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
got this on a test-system:
calling numaq_tsc_disable+0x0/0x39
NUMAQ: disabling TSC
initcall numaq_tsc_disable+0x0/0x39 returned 0 after 0 msecs
that's because we should not be using arch_initcall to call numaq_tsc_disable.
need to call it in setup_arch before time_init()/tsc_init()
and call it in init_intel() to make the cpu feature bits right.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix FRV irqs_disabled() to return an int, not an unsigned long to avoid
this warning:
kernel/sched.c: In function '__might_sleep':
kernel/sched.c:8198: warning: format '%d' expects type 'int', but argument 3 has type 'long unsigned int'
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that the original SMC_USE_PXA_DMA specific code will always being
built if CONFIG_ARCH_PXA is defined, so to make this part of the code
to be PXA public, and still prevent it from being built if support of
PXA is not selected.
A SMC91X_USE_DMA flag is added to the platform data to allow platform
to choose its usage of DMA. Note this flag itself is so named to be
generic enough (assuming other platforms can also use DMA).
It keeps backward compatibility to set the SMC91X_USE_DMA flag if
SMC_USE_PXA_DMA is still defined.
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Acked-by: Nicolas Pitre <nico@cam.org>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Now one can use the following code
#define SMC_IO_SHIFT lp->io_shift
to make SMC_IO_SHIFT a variable. This, however, will slightly increase
the CPU overhead and have negative impact on the network performance.
The tradeoff is, this can be specified in the smc91x platform data so
that multiple boards support can be built in a single zImage.
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Acked-by: Nicolas Pitre <nico@cam.org>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
And also favors the usage of SMC91X_NOWAIT over the hardcoded SMC_NOWAIT
by converting "nowait" (module parameter overridable) to platform flag.
There are several possibilities:
1. platform data present - preferred and use as is
2. platform data absent - use "nowait", it can be:
a. SMC_NOWAIT if defined
b. default to 0 if SMC_NOWAIT isn't defined
c. overriden by module parameter
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Acked-by: Nicolas Pitre <nico@cam.org>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
IRQ trigger type can be specified in the IRQ resource definition by
IORESOURCE_IRQ_*, we need only one way to specify this.
This also fixes the following small issue:
To allow dynamic support for multiple platforms, when those relevant
macros are not defined for one specific platform, the default case
will be:
- SMC_DYNAMIC_BUS_CONFIG defined
- and SMC_IRQ_FLAGS = IRQF_TRIGGER_RISING
While if "irq_flags" is missing when defining the smc91x_platdata,
usually as follows:
static struct smc91x_platdata xxxx_smc91x_data = {
.flags = SMC91X_USE_XXBIT,
};
The lp->cfg.irq_flags will always be overriden by the above structure
(due to a memcpy), thus rendering lp->cfg.irq_flags to be "0" always.
(regardless of the default SMC_IRQ_FLAGS or IORESOURCE_IRQ_* flags)
Fixes this by forcing to use IORESOURCE_IRQ_* flags if present, and
make the only user of smc91x_platdata.irq_flags (renesas/migor) to
use IORESOURCE_IRQ_*.
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Acked-by: Nicolas Pitre <nico@cam.org>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move the accessor functions for the scsi_cmnd status from zfcp to the
SCSI include file. Change the interface to the functions to pass the
scsi_cmnd pointer instead of the status pointer.
Signed-off-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Adds support for target reset to SG_SCSI_RESET.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The SCSI Block Protocol uses this 16-bit CRC to verify the integrity
of each data sector. crc_t10dif() is used by sd_dif.c when performing
I/O to or from disks formatted with protection information.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Christoph objected to having sd.h in include/scsi since it is internal
to the sd driver. Move it to drivers/scsi/sd.h.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Let through upto the largest command of 260 defined by the scsi standard.
iscsi core supports this already. Now that the scsi-ml supports it we can
start using large commands.
[jejb:rejections fixed up]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The recv lock was defined so the iscsi layer could block
the recv path from processing IO during recovery. It
turns out iser just set a lock to that pointer which was pointless.
We now disconnect the transport connection before doing recovery
so we do not need the recv lock. For iscsi_tcp we still stop
the recv path incase older tools are being used.
This patch also has iscsi_itt_to_ctask user grab the session lock
and has the caller access the task with the lock or get a ref
to it in case the target is broken and sends a tmf success response
then sends data or a response for the command that was supposed to
be affected bty the tmf.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Drivers expect that the cmds_max value they pass to the iscsi layer
is the max scsi commands + mgmt tasks. This patch implements that
and fixes some checks for nr cmd limits.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This adds two new attrs used for creating initiator ports and
binding sessions to hardware.
The session level initiatorname:
Since bnx2i does a scsi_host per host device, we need to add the
iface initiator port settings on the session, so we can create
multiple initiator ports (each with different inames) per device/scsi_host.
The current iname reflects that qla4xxx can have one iname per hba, and we are
allocating a host per session for software. The iname on the host will
remain so we can export and set the hba level qla4xxx setting.
The ifacename attr:
To bind a session to a some peice of hardware in userspace we maintain
some mappings, but during boot or iscsid restart (iscsid contains the user
space part of the driver) we need to be able to figure out which of those
host mappings abstractions maps to certain sessions. This patch adds
a ifacename attr, which userspace can set to id the host side of the
endpoint across pivot_roots and iscsid restarts.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Add sysfs representation for the endpoint, so userspace can match the
host and session to the endpoint. This will allow us to set the host's
parent correctly at host creation time.
The next patches will convert tcp and iser, and fix iser's dma_mask
bug.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Currently we duplicate the list of sessions, because we were using the
test for if a session was on the host list to indicate if the session
was bound or unbound. We can instead use the target_id and fix up
the class so that drivers like bnx2i do not have to manage the target id
space.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This is the second part of the iscsi task merging, and
all it does it rename iscsi_cmd_task to iscsi_task and
mtask/ctask to just task.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
There is no need to have the mgmt and cmd tasks separate
structs. It used to save a lot of memory when we overprealocated
memory for tasks, but the next patches will set up the
driver so in the future they can use a mempool or some other
common scsi command allocator and common tagging.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Currently to get a ctask from the session cmd array, you have to
know to use the itt modifier. To make this easier on LLDs and
so in the future we can easilly kill the session array and use
the host shared map instead, this patch adds a nice wrapper
to strip the itt into a session->cmds index and return a ctask.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Some drivers want to be able to just pass in the driver data pointers
to the iscsi objects. To enable this we need the iscsi printk macro
to cast the object.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This removes the session and conn data_size fields from the iscsi_transport.
Just pass in the value like with host allocation. This patch also makes
it so the LLD iscsi_conn data is allocated with the iscsi_cls_conn.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This finishes the host/session unbinding, by adding some helpers
to add and remove hosts and the session they manage.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
bnx2i allocates a host per netdevice but will use libiscsi,
so this unbinds the session from the host in that code.
This will also be useful for the iser parent device dma settings
fixes.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This renames the iscsi_host to iscsi_cls_host to match the other
structs, because libiscsi wants to use the iscsi_host name in
the future.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
max_cmd_len and max_conn are not really used. max_cmd_len is
always 16 and can be set by the LLD. max_conn is always one
since we do not support MCS.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
iscsi offload (bnx2i and qla4xx) allocate a scsi host per hba,
so the session creation path needs a shost/host_no argument.
Software iscsi/iser will follow the same behabior as before
where it allcoates a host per session, but in the future iser
will probably look more like bnx2i where the host's parent is
the hardware (rnic for iser and for bnx2i it is the nic), because
it does not use a socket layer like how iscsi_tcp does.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Fix size of LDT entries. On x86-64, ldt_desc is a double-sized descriptor.
Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Filesystems like ext4 needs to start a new transaction in
the writepages for block allocation. This happens with delayed
allocation and there is limit to how many credits we can request
from the journal layer. So we call write_cache_pages multiple
times with wbc->nr_to_write set to the maximum possible value
limitted by the max journal credits available.
Add a new mode to writeback that enables us to handle this
behaviour. In the new mode we update the wbc->range_start
to point to the new offset to be written. Next call to
call to write_cache_pages will start writeout from specified
range_start offset. In the new mode we also limit writing
to the specified wbc->range_end.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Delayed allocation need to check free blocks at every write time.
percpu_counter_read_positive() is not quit accurate. delayed
allocation need a more accurate accounting, but using
percpu_counter_sum_positive() is frequently is quite expensive.
This patch added a new function to update center counter when sum
per-cpu counter, to increase the accurate rate for next
percpu_counter_read() and require less calling expensive
percpu_counter_sum().
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Export mpage_bio_submit() and __mpage_writepage() for the benefit of
ext4's delayed allocation support. Also change __block_write_full_page
so that if buffers that have the BH_Delay flag set it will call
get_block() to get the physical block allocated, just as in the
!BH_Mapped case.
Signed-off-by: Alex Tomas <alex@clusterfs.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This patch adds necessary framework into JBD2 to be able to track inodes
with each transaction and write-out their dirty data during transaction
commit time.
This new ordered mode brings all sorts of advantages such as possibility
to get rid of journal heads and buffer heads for data buffers in ordered
mode, better ordering of writes on transaction commit, simplification of
some JBD code, no more anonymous pages when truncate of data being
committed happens. Also with this new ordered mode, delayed allocation
on ordered mode is much simpler.
Signed-off-by: Jan Kara <jack@suse.cz>
Make filemap_fdatawrite_range() function public, so that it can later
be used in ordered mode rewrite by JBD/JBD2.
Signed-off-by: Jan Kara <jack@suse.cz>
Carlo Wood has demonstrated that it's possible to recover deleted
files from the journal. Something that will make this easier is if we
can put the time of the commit into commit block.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
In dwarf2_32.h, test for CONFIG_AS_CFI instead of
CONFIG_UNWIND_INFO. Turns out that searching for UNWIND_INFO
returns no match in any Kconfig or Makefile, so we're really
just throwing everything away regarding dwarf frames for i386.
The test that generates CONFIG_AS_CFI does not have anything
x86_64-specific, and right now, checking V=1 builds shows me
that the flags is there anyway, although unused.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In dwarf_64.h header, use the "ignore" macro the way
i386 does.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
i spent a fair amount of time chasing a 64-bit bootup crash that manifested
itself as bootup segfaults:
S10network[1825]: segfault at 7f3e2b5d16b8 ip 00000031108748c9 sp 00007fffb9c14c70 error 4 in libc-2.7.so[3110800000+14d000]
eventually causing init to die and panic the system:
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: init Not tainted 2.6.26-rc9-tip #13878
after a maratonic bisection session, the bad commit turned out to be:
| b7675791859075418199c7af86a116ea34eaf5bd is first bad commit
| commit b7675791859075418199c7af86a116ea34eaf5bd
| Author: Jeremy Fitzhardinge <jeremy@goop.org>
| Date: Wed Jun 25 00:19:00 2008 -0400
|
| x86: remove open-coded save/load segment operations
|
| This removes a pile of buggy open-coded implementations of savesegment
| and loadsegment.
after some more bisection of this patch itself, it turns out that what
makes the difference are the savesegment() changes to __switch_to().
Taking a look at this portion of arch/x86/kernel/process_64.o revealed
this crutial difference:
| good: 99c: 8c e0 mov %fs,%eax
| 99e: 89 45 cc mov %eax,-0x34(%rbp)
|
| bad: 99c: 8c 65 cc mov %fs,-0x34(%rbp)
which is due to:
| unsigned fsindex;
| - asm volatile("movl %%fs,%0" : "=r" (fsindex));
| + savesegment(fs, fsindex);
savesegment() is implemented as:
#define savesegment(seg, value) \
asm("mov %%" #seg ",%0":"=rm" (value) : : "memory")
note the "m" modifier - it allows GCC to generate the segment move
into a memory operand as well.
But regarding segment operands there's a subtle detail in the x86
instruction set: the above 16-bit moves are zero-extend, but only
if it goes to a register.
If it goes to a memory operand, -0x34(%rbp) in the above case, there's
no zero-extend to 32-bit and the instruction will only save 16 bits
instead of the intended 32-bit.
The other 16 bits is random data - which can cause problems when that
value is used later on.
The solution is to only allow segment operands to go to registers.
This fix allows my test-system to boot up without crashing.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Working with ftrace I would get large jumps of 11 millisecs or more with
the clock tracer. This killed the latencing timings of ftrace and also
caused the irqoff self tests to fail.
What was happening is with NO_HZ the idle would stop the jiffy counter and
before the jiffy counter was updated the sched_clock would have a bad
delta jiffies to compare with the gtod with the maximum.
The jiffies would stop and the last sched_tick would record the last gtod.
On wakeup, the sched clock update would compare the gtod + delta jiffies
(which would be zero) and compare it to the TSC. The TSC would have
correctly (with a stable TSC) moved forward several jiffies. But because the
jiffies has not been updated yet the clock would be prevented from moving
forward because it would appear that the TSC jumped too far ahead.
The clock would then virtually stop, until the jiffies are updated. Then
the next sched clock update would see that the clock was very much behind
since the delta jiffies is now correct. This would then jump the clock
forward by several jiffies.
This caused ftrace to report several milliseconds of interrupts off
latency at every resume from NO_HZ idle.
This patch adds hooks into the nohz code to disable the checking of the
maximum clock update when nohz is in effect. It resumes the max check
when nohz has updated the jiffies again.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It has been suggested that I add a way to disable the function tracer
on an oops. This code adds a ftrace_kill_atomic. It is not meant to be
used in normal situations. It will disable the ftrace tracer, but will
not perform the nice shutdown that requires scheduling.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add pseudo-feature bits to describe whether the CPU supports sysenter
and/or syscall from ia32-compat userspace. This removes a hardcoded
test in vdso32-setup.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Rename it to sb_start to make sure all users have been converted.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: Neil Brown <neilb@suse.de>
As other IOMMUs do, this puts dummy pci_swiotlb_init() in swiotlb.h
and remove ifdef CONFIG_SWIOTLB in pci-dma.c.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Our way to handle gart_* functions for CONFIG_GART_IOMMU and
!CONFIG_GART_IOMMU cases is inconsistent.
We have some dummy gart_* functions in !CONFIG_GART_IOMMU case and
also use ifdef CONFIG_GART_IOMMU tricks in pci-dma.c to call some
gart_* functions in only CONFIG_GART_IOMMU case.
This patch removes ifdef CONFIG_GART_IOMMU in pci-dma.c and always use
dummy gart_* functions in iommu.h.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
gart.h has only GART-specific stuff. Only GART code needs it. Other
IOMMU stuff should include iommu.h instead of gart.h.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
when more than 4g memory is installed, don't map the big hole below 4g.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Adds netif_napi_del function which is used to remove the napi struct from
the netdev napi_list in cases where CONFIG_NETPOLL was enabled.
The motivation for adding this is to handle the case in which the number of
queues on a device changes due to a configuration change. Previously the
napi structs for each queue would be left in the list until the netdev was
freed.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
tun: Persistent devices can get stuck in xoff state
xfrm: Add a XFRM_STATE_AF_UNSPEC flag to xfrm_usersa_info
ipv6: missed namespace context in ipv6_rthdr_rcv
netlabel: netlink_unicast calls kfree_skb on error path by itself
ipv4: fib_trie: Fix lookup error return
tcp: correct kcalloc usage
ip: sysctl documentation cleanup
Documentation: clarify tcp_{r,w}mem sysctl docs
netfilter: nf_nat_snmp_basic: fix a range check in NAT for SNMP
netfilter: nf_conntrack_tcp: fix endless loop
libertas: fix memory alignment problems on the blackfin
zd1211rw: stop beacons on remove_interface
rt2x00: Disable synchronization during initialization
rc80211_pid: Fix fast_start parameter handling
sctp: Add documentation for sctp sysctl variable
ipv6: fix race between ipv6_del_addr and DAD timer
irda: Fix netlink error path return value
irda: New device ID for nsc-ircc
irda: via-ircc proper dma freeing
sctp: Mark the tsn as received after all allocations finish
...
Add a XFRM_STATE_AF_UNSPEC flag to handle the AF_UNSPEC behavior for
the selector family. Userspace applications can set this flag to leave
the selector family of the xfrm_state unspecified. This can be used
to to handle inter family tunnels if the selector is not set from
userspace.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
NR_IRQS: let VISWS be just a sub-case of the generic code.
This can create a somewhat larger irq_desc[] array if NR_CPUS is high
but that should not worry VisWS which has 4 CPUs at most.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
move the SGIVW definitions from setup_arch.h into its own header file.
preparation for turning VISWS into a generic PC architecture.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
move the include/asm-x86/mach-visws/ VISWS specific hardware
details include files into include/asm-x86/visws, to be used from
generic code.
No code changed.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
update asm-x86/mach-visws/mach_apicdef.h to the generic version.
This should work fine as VISWS has a standard local APIC and thus
its mach_apicdef.h copy is just an ancient version of the generic code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
now that include/asm-x86/mach-visws/smpboot_hooks.h equals
to the default file in ../mach-default/smpboot_hooks.h, simply
include it instead of maintaining a copy.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
update include/asm-x86/mach-visws/smpboot_hooks.h to
include/asm-x86/mach-default/smpboot_hooks.h (the generic version).
this _should_ work, because VISWS sets skip_ioapic_setup, but it
should be tested on a real VISWS to make sure.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Allow the generic smpboot quirks code to be built with
ONFIG_X86_IO_APIC disabled. This way VISWS will be able
to use it as-is.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
now that include/asm-x86/mach-visws/mach_apic.h equals
to include/asm-x86/mach-default/mach_apic.h, simply start
using the generic one.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add early quirks support.
In preparation of enabling the generic architecture to boot on a VISWS.
This will allow us to remove the VISWS subarch and all its complications.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Also add some white space for a little clarity.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Provide a helper to load the file and validate it in one call, to
simplify error handling in the drivers which are going to use it.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Some devices need their firmware as a set of {address, len, data...}
records in some specific order rather than a simple blob.
The normal way of doing this kind of thing is 'ihex', which is a text
format and not entirely suitable for use in the kernel.
This provides a binary representation which is very similar, but much
more compact -- and a helper routine to skip to the next record,
because the alignment constraints mean that everybody will screw it up
for themselves otherwise.
Also a helper function which can verify that a 'struct firmware'
contains a valid set of ihex records, and that following them won't run
off the end of the loaded data.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Some drivers have their own hacks to bypass the kernel's firmware loader
and build their firmware into the kernel; this renders those unnecessary.
Other drivers don't use the firmware loader at all, because they always
want the firmware to be available. This allows them to start using the
firmware loader.
A third set of drivers already use the firmware loader, but can't be
used without help from userspace, which sometimes requires an initrd.
This allows them to work in a static kernel.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
In preparation for supporting firmware files linked into the static
kernel, make fw->data const to ensure that users aren't modifying it (so
that we can pass a pointer to the original in-kernel copy, rather than
having to copy it).
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Allow choosing the bits in UP2OCR_SEOS.
Signed-off-by: Daniel Ribeiro <drwyrm@gmail.com>
Acked-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add the .depth field to pxafb_mode_info and use it to set pixel data format
as 18(RGB666), 19(RGBT666), 24(RGB888) or 25(RGBT888)
Signed-off-by: Daniel Ribeiro <drwyrm@gmail.com>
Acked-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add missing depth definitions to LCCR3.
Signed-off-by: Daniel Ribeiro <drwyrm@gmail.com>
Acked-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
All new crypto interfaces should go into individual files as much
as possible in order to ensure that crypto.h does not collapse under
its own weight.
This patch moves the ahash code into crypto/hash.h and crypto/internal/hash.h
respectively.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the walking helpers for hash algorithms akin to
those of block ciphers. This is a necessary step before we can
reimplement existing hash algorithms using the new ahash interface.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The base field in ahash_tfm appears to have been cut-n-pasted from
ablkcipher. It isn't needed here at all. Similarly, the info field
in ahash_request also appears to have originated from its cipher
counter-part and is vestigial.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
PalmTX is PXA27x based device with wifi, bluetooth,
touchscreen, sdio slot, irda, keypad, nand flash,
pxa framebuffer, serial and usb gadget interface.
Supported by this patch is pxafb, touchscreen, irda,
keypad and sdio slot.
Signed-off-by: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Support for the at91sam9g20 : Atmel 400Mhz ARM 926ej-s SOC.
AT91sam9g20 is an evolution of the at91sam9260 with a faster clock
speed.
We created a new board for this device but based the chip support
directly on 9260 files with little updates.
Here is the chip page on Atmel wabsite:
http://atmel.com/dyn/products/product_card.asp?part_id=4337
Signed-off-by: Sedji Gaouaou <sedji.gaouaou@atmel.com>
Signed-off-by: Justin Waters <justin.waters@timesys.com>
Acked-by: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
mach-default/entry_arch.h is exactly the same file as
mach-visws/entry_arch.h, so include the first from the second,
so that updates to the generic one get picked up by VISWS as well.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix:
arch/x86/kernel/built-in.o: In function `smp_intr_init':
(.init.text+0x49e2): undefined reference to `call_function_single_interrupt'
Caused by include/asm-x86/mach-visws/entry_arch.h getting out of sync
with the include/asm-x86/mach-default/entry_arch.h file it derives from.
Copy the default file over - next step will be to simply include the default
file.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This remove lots of duplications in iommu.h and gart.h.
The end result of this patch is:
- iommu.h is a header file for everyone related with IOMMUs.
- gart.h is the private header file. Only pci-gart_64.c and its friends
include it.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: fujita.tomonori@lab.ntt.co.jp
Signed-off-by: Ingo Molnar <mingo@elte.hu>
A bunch of things in alsa depend on CONFIG_KMOD,
use CONFIG_MODULES instead where the dependency
is needed at all.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
This patch adds several functions for DAI control and config
and replaces the current method of calling function pointers within
the DAI struct.
Signed-off-by: Liam Girdwood <lg@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
This patch series merges struct snd_soc_codec_dai and struct
snd_soc_cpu_dai into struct snd_soc_dai in preparation for further
ASoC v2 patches.
This merger removes duplication in both DAI structures and simplifies
the API for other users.
Signed-off-by: Liam Girdwood <lg@opensource.wolfsonmicro.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
and let 64-bit to fall back to use fixmap too.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix:
arch/x86/kernel/built-in.o: In function `dmi_ignore_irq0_timer_override':
boot.c:(.init.text+0x3ea4): undefined reference to `force_mask_ioapic_irq_2'
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch brings support for gpio/gpiolib framework to Intel IOP3xx
platforms.
Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
And also reserve 32 IRQs for the two GPIO expanders.
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
A pxa3xx_set_nand_info() is also introduced to set the PXA3xx NAND
driver specific platform_data structure pointer.
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Cc: Sergey Podstavin <spodstavin@ru.mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Some boards want to change low power state of pins on-the-fly, this
function helps to facilitate that operation instead of switching
back-n-forth between two configurations with pxa2xx_mfp_config().
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Tested-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Some boards use UART other than FFUART for the console, E.g. Marvell
PXA3xx Form Factor Platform (aka Littleton) uses STUART. This patch
modifies the uncompress.h so that display of the uncompress message
is routed to the STUART.
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The include/asm-arm/arch-pxa/cm-x270.h is not used anymore. Remove it.
Signed-off-by: Mike Rapoport <mike@compulab.co.il>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add a function to dynamically allocate and register pxa2xx-spi platform
devices, to be used by PXA2xx and PXA3xx based systems. Switch pcm027 and
lubbock to use it.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Acked-by: Eric Miao <eric.miao@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As well as moving all the device declarations to a single one in devices.c
this causes all platforms to register the I/O and interrupt resources for
the AC97 controller.
Cc: eric miao <eric.miao@marvell.com>
Cc: Mike Rapoport <mike@compulab.co.il>
Cc: Lennert Buytenhek <buytenh@wantstofly.org>
Cc: Jürgen Schindele <linux@schindele.name>
Cc: Juergen Beisert <jbe@pengutronix.de>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Provide a set of functions to control state of pins dedicated to IrDA.
Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This is some preliminary work to improve TLB management on SW loaded
TLB powerpc platforms. This introduce support for non-atomic PTE
operations in pgtable-ppc32.h and removes write back to the PTE from
the TLB miss handlers. In addition, the DSI interrupt code no longer
tries to fixup write permission, this is left to generic code, and
_PAGE_HWWRITE is gone.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Move the UDP/TCP default timeo/retrans settings for text mounts to
nfs_init_timeout_values(), which was were they were always being
initialised (and sanity checked) for binary mounts.
Document the default timeout values using appropriate #defines.
Ensure that we initialise and sanity check the transport protocols that
may have been specified by the user.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
To make nfs_parse_server_address() more generally useful, allow it to
accept input strings that are not terminated with '\0'.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Currently, if an unstable write completes, we cannot redirty the page in
order to reflect a new change in the page data until after we've sent a
COMMIT request.
This patch allows a page rewrite to proceed without the unnecessary COMMIT
step, putting it immediately back onto the dirty page list, undoing the
VM unstable write accounting, and removing the NFS_PAGE_TAG_COMMIT tag from
the NFS radix tree.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The fs/nfs/iostat.h header has definitions that were designed to be exposed
to user space. Move these definitions under include/linux so user space can
use the definitions in applications that read /proc/self/mountstats.
Also address a handful of coding style issues called out by checkpatch.pl in
fs/nfs/iostat.h.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
All instances are set to nfs_open(), so we should just remove the redundant
indirection. Ditto for the file_release op
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Try to make the comment here a little more clear and concise.
Also, this macro definition seems unnecessary.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The cl_chatty flag alows us to control whether a given rpc client leaves
"server X not responding, timed out"
messages in the syslog. Such messages make sense for ordinary nfs
clients (where an unresponsive server means applications on the
mountpoint are probably hanging), but not for the callback client (which
can fail more commonly, with the only result just of disabling some
optimizations).
Previously cl_chatty was removed, do to lack of users; reinstate it, and
use it for the nfsd's callback client.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
NFSv2 file locking currently fails the Connectathon tests, because the
calls to the VFS locking code do not return an EINVAL error if the
struct file_lock overflows the 32-bit boundaries.
The problem is due to the fact that we occasionally call helpers from
fs/locks.c in order to avoid RPC calls to the server when we know that a
local process holds the lock. These helpers are, of course, always
64-bit enabled, so EINVAL is not returned in cases when it would if
the call had gone to the NLM code.
For consistency, we therefore add support for a bounds-checking helper.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The problems are that, with the ACPI vs timer overring issue _fixed_,
after using the box for some time (between several seconds and 1 hour, at
random) processes get very high CPU loads (once I've got X using 107% of
the CPU, for example) and the system becomes unresponsive, as though there
were interrupts lost or something similar.
Andreas Herrman reproduced similar problems:
> Ok, now I've reproduced the stability problem.
> - Using tip/master,
> - reverting e38502eb8aa82314d5ab0eba45f50e6790dadd88 and
> - applying your patch from this posting
> http://marc.info/?l=linux-kernel&m=121539354224562&w=4
>
> Starting X, firefox, gimp, tuxpaint and doing some drawing in tuxpaint
> results in a slow system. Drawing is almost not possible anymore --
> Selections of new colors, cursors etc. is performed with huge delay
> if it's performed at all.
>
> BTW, the code sets up timer IRQ as Virtual Wire IRQ:
>
> Jul 8 14:57:58 kodscha IO-APIC (apicid-pin) 2-22, 2-23 not connected.
> Jul 8 14:57:58 kodscha ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
> Jul 8 14:57:58 kodscha ...trying to set up timer as Virtual Wire IRQ... works.
>
> and both INT0 and INT2 of IOAPIC are masked:
>
> Jul 8 14:57:58 kodscha NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
> Jul 8 14:57:58 kodscha 00 000 1 0 0 0 0 0 0 00
> Jul 8 14:57:58 kodscha 01 003 0 0 0 0 0 1 1 31
> Jul 8 14:57:58 kodscha 02 003 1 0 0 0 0 0 0 30
>
> I've also seen strange CPU utilization -- with syslog-ng:
>
> top - 15:33:06 up 35 min, 4 users, load average: 1.70, 0.68, 0.37
> Tasks: 64 total, 4 running, 60 sleeping, 0 stopped, 0 zombie
> Cpu0 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
> Cpu1 : 6.4%us, 87.2%sy, 0.0%ni, 5.8%id, 0.0%wa, 0.6%hi, 0.0%si, 0.0%st
> Mem: 895384k total, 283568k used, 611816k free, 35492k buffers
> Swap: 1959920k total, 0k used, 1959920k free, 163044k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 4632 root 20 0 17216 800 580 S 104 0.1 0:34.22 syslog-ng
> 28505 root 20 0 205m 11m 4024 S 6 1.3 0:21.16 X
> 28518 root 20 0 56292 5652 4492 S 1 0.6 0:01.80 fluxbox
> 1 root 20 0 3724 608 508 S 0 0.1 0:00.36 init
>
> So far I have no clue why C1E-idle in conjunction with virtual wire
> mode causes this strange behaviour.
>
> ... and I start to think about the root cause of all this.
>
> I've performed similar tests under X with the IRQ0/INT0 configuration and
> I did not see above symptoms.
So lets fall back to the IRQ0/INT0 configuration on this box.
This basically restores the dont-use-the-lapic-timer exception mechanism
that was unconditional on this box prior commit 8750bf5 ("x86: add C1E
aware idle function").
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Hmm, looks like it would be nice to have more cleanups of iommu.h and
gart.h.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When system have 4g less ram installed, and acpi table sit
near end of ram, make max_pfn cover them too,
so 64bit kernel don't need to mess up fixmap.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: "Suresh Siddha" <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Remove them from the arch-specific file.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86_64 does not need it, but it won't have X86_INTEL_USERCOPY
defined either.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We also carry the unaligned version with us. Only x86_64 uses
it, but there's no problem in defining it.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move both versions, which are highly similar, to uaccess.h.
Note that, for x86_64, X86_WP_WORKS_OK is always defined.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We also check user pointer in x86_64 put_user, the way i386 does.
In a separate patch for bisecting purposes.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For both __put_user_x and __put_user_8 macros, pass the error
variable explicitly.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move __get_user_asm and __get_user_size and __get_user_nocheck
to uaccess.h. This requires us to define a macro at __get_user_size
for the 64-bit access case.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Let the user of the macro specify the desired return.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Move both __put_user_asm and __put_user_size to
uaccess.h. i386 already had a special function for 64-bit access,
so for x86_64, we just define a macro with the same name.
Note that for X86_64, CONFIG_X86_WP_WORKS_OK will always
be defined, so the #else part will never be even compiled in.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Let the user of the macro specify the desired return.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Do it in a separate patch for bisectability.
Goal is to have put_user_size integrated.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Take it out of uaccess_32.h. Since it seems that no users
of the x86_64 exists, we simply pick the i386 version.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Merge versions of getuser from uaccess_32.h and uaccess_64.h into
uaccess.h. There is a part which is 64-bit only (for now), and for
that, we use a __get_user_8 macro.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Common parts of uaccess_32.h and uaccess_64.h
are put in uaccess.h. Bits in uaccess_32.h and
uaccess_64.h that come to this file are equal
except for comments and whitespaces differences.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Using explicit hexa (0xFFFFFFUL) introduces an unnecessary difference
between i386 and x86_64 because of the size of their long. Use -1UL instead.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Do not refer to the processor word-size with int, as it won't
work with x86_64. Use long instead.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Put the likely hint in access_ok. Just for
bisectability.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Our integration efforts broke a build with this function being used
with i386. Reason is "g" can put the operand in an imm32, which according
to The Book (tm), is invalid as the second operand.
This is actually a bug
in x86_64 too, since the x86_64 instruction set reference does not list
it as valid.
We probably didn't trigger this before due to the ammount of
registers available for 64-bit platforms. But that's just my guess.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
For i386, __range_not_ok is a better name than __range_ok, since
it returns 0 when it is in fact okay. Other than that,
both versions does not need the word size specifiers, and we remove them.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
In putuser_32.S and putuser_64.S, replace things like .quad, .long,
and explicit references to [r|e]ax for the apropriate macros
in asm/asm.h.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is consistent with i386 usage.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Instead of clobbering r8, clobber rbx, which is the i386 way.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Follow the pattern, and define a single put_user_x, instead
of defining macros for all available sizes. Exception is
put_user_8, since the "A" constraint does not give us enough
power to specify which register (a or d) to use in the 32-bit
common case.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Clobber it in the inline asm macros, and let the compiler do this for us.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
getuser_32.S and getuser_64.S are merged into getuser.S.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There are situations in which the architecture wants to use the
register that represents its word-size, whatever it is. For those,
introduce __ASM_REG in asm.h, along with the first users _ASM_AX
and _ASM_DX. They have users waiting for it, namely the getuser
functions.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
There's really no reason to clobber r8 or pass the address in rcx.
We can safely use only two registers (which we already have to touch anyway)
to do the job.
Signed-off-by: Glauber Costa <gcosta@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Allow an application to enable Strong Access Ordering on specific pages of
memory on Power 7 hardware. Currently, power has a weaker memory model than
x86. Implementing a stronger memory model allows an emulator to more
efficiently translate x86 code into power code, resulting in faster code
execution.
On Power 7 hardware, storing 0b1110 in the WIMG bits of the hpte enables
strong access ordering mode for the memory page. This patchset allows a
user to specify which pages are thus enabled by passing a new protection
bit through mmap() and mprotect(). I have defined PROT_SAO to be 0x10.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add the CPU feature bit for the new Strong Access Ordering
facility of Power7
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Joel Schopp <jschopp@austin.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch defines:
- PROT_SAO, which is passed into mmap() and mprotect() in the prot field
- VM_SAO in vma->vm_flags, and
- _PAGE_SAO, the combination of WIMG bits in the pte that enables strong
access ordering for the page.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch allows architectures to define functions to deal with
additional protections bits for mmap() and mprotect().
arch_calc_vm_prot_bits() maps additonal protection bits to vm_flags
arch_vm_get_page_prot() maps additional vm_flags to the vma's vm_page_prot
arch_validate_prot() checks for valid values of the protection bits
Note: vm_get_page_prot() is now pretty ugly, but the generated code
should be identical for architectures that don't define additional
protection bits.
Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The task_pt_regs() macro allows access to the pt_regs of a given task.
This macro is not currently defined for the powerpc architecture, but
we need it for some upcoming utrace additions.
Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Move device_to_mask() to dma-mapping.h because we need to use it from
outside dma_64.c in a later patch.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Update powerpc to use the new dma_*map*_attrs() interfaces. In doing so
update struct dma_mapping_ops to accept a struct dma_attrs and propagate
these changes through to all users of the code (generic IOMMU and the
64bit DMA code, and the iseries and ps3 platform code).
The old dma_*map_*() interfaces are reimplemented as calls to the
corresponding new interfaces.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Make iommu_map_sg take a struct iommu_table. It did so before commit
740c3ce667 (iommu sg merging: ppc: make
iommu respect the segment size limits).
This stops the function looking in the archdata.dma_data for the iommu
table because in the future it will be called with a device that has
no table there.
This also has the nice side effect of making iommu_map_sg() match the
other map functions.
Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
As nr_active counter includes also spus waiting for syscalls to return
we need a seperate counter that only counts spus that are currently running
on spu side. This counter shall be used by a cpufreq governor that targets
a frequency dependent from the number of running spus.
Signed-off-by: Christian Krafft <krafft@de.ibm.com>
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
fix:
In file included from arch/x86/kernel/tlb_uv.c:14:
include/asm/uv/uv_mmrs.h:986: error: redefinition of ‘union uvh_rh_gam_cfg_overlay_config_mmr_u’
include/asm/uv/uv_mmrs.h:988: error: redefinition of ‘struct uvh_rh_gam_cfg_overlay_config_mmr_s’
include/asm/uv/uv_mmrs.h:1064: error: redefinition of ‘union uvh_rh_gam_mmioh_overlay_config_mmr_u’
include/asm/uv/uv_mmrs.h:1066: error: redefinition of ‘struct uvh_rh_gam_mmioh_overlay_config_mmr_s’
caused by another duplicate section (cut & paste error) in commit
5d061e397d "x86, uv: update x86 mmr list for SGI uv".
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix:
In file included from arch/x86/kernel/genx2apic_uv_x.c:25:
include/asm/uv/uv_mmrs.h:986: error: redefinition of ‘union uvh_rh_gam_cfg_overlay_config_mmr_u’
include/asm/uv/uv_mmrs.h:988: error: redefinition of ‘struct uvh_rh_gam_cfg_overlay_config_mmr_s’
include/asm/uv/uv_mmrs.h:1064: error: redefinition of ‘union uvh_rh_gam_mmioh_overlay_config_mmr_u’
include/asm/uv/uv_mmrs.h:1066: error: redefinition of ‘struct uvh_rh_gam_mmioh_overlay_config_mmr_s’
caused by duplicate section (cut & paste error) in commit
5d061e397d "x86, uv: update x86 mmr list for SGI uv".
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Accesses are mostly structured such that when there are multiple TX
queues the code transformations will be a little bit simpler.
Signed-off-by: David S. Miller <davem@davemloft.net>
Only plain netif_schedule() remains taking a net_device, mostly as a
compatability item while we transition the rest of these interfaces.
Everything else calls netif_schedule_queue() or __netif_schedule(),
both of which take a netdev_queue pointer.
Signed-off-by: David S. Miller <davem@davemloft.net>
First, we add a qdisc_tx_changing() helper which returns true if the
qdisc attachment is in transition.
Second, we remove an assertion warning which is of limited value and
is hard to express precisely in a multiqueue environment.
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a helper function, currently used by IRDA.
This is being added so that we can contain and isolate as many
explicit ->tx_queue references in the tree as possible.
Signed-off-by: David S. Miller <davem@davemloft.net>
Isolate callers that want to simply reset all the TX qdiscs from the
details of TX queues.
Use this in the ISDN code.
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that our qdisc management is bi-directional, per-queue, and fully
orthogonal, there is no reason to have a special ingress qdisc pointer
in struct net_device.
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the paravirtualized calculate_cpu_khz to calibrate_tsc.
In all cases, we actually calibrate_tsc and use that as the cpu_khz value.
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Signed-off-by: Dan Hecht <dhecht@vmware.com>
Cc: Dan Hecht <dhecht@vmware.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>