Commit Graph

26386 Commits

Author SHA1 Message Date
Paul Mundt
85247285ea sh: Use the common segment definitions for the _64 uaccess routines.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-07-28 18:10:29 +09:00
Paul Mundt
66dfe18114 sh: Add support for 16kB PAGE_SIZE.
16kB is a useful size on nommu, while 64kB still tends to be too big to
be useful. Newer MMUs are likely to support this as well, so plug it
in in anticipation of those, too.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-07-28 18:10:29 +09:00
Paul Mundt
02f7e627f9 sh: Consolidate segment modifiers across mmu/nommu systems.
This moves get_fs/set_fs() and friends in to asm/segment.h. The
mm_segment_t definition is likewise consolidated from the _32/_64 split.

This is prepatory groundwork for using the generic address space limit
and verification routines across mmu/nommu configs.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-07-28 18:10:28 +09:00
Paul Mundt
3bc24a1a54 sh: Initial ELF FDPIC support.
This adds initial support for ELF FDPIC on MMU-less SH, as per version
0.2 of the ABI definition at:

	http://www.codesourcery.com/public/docs/sh-fdpic/sh-fdpic-abi.txt

Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-07-28 18:10:28 +09:00
Khem Raj
82706b8f7b sh: Prevent leaking of CONFIG_SUPERH32 to userspace in asm/unistd.h.
CONFIG_SUPERH32 is currently trickling into userspace unistd.h. Attached
patch uses __SH5__ define in userspace.

Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2008-07-28 18:10:27 +09:00
Barry Naujok
9403540c06 dcache: Add case-insensitive support d_ci_add() routine
This add a dcache entry to the dcache for lookup, but changing the name
that is associated with the entry rather than the one passed in to the
lookup routine.

First, it sees if the case-exact match already exists in the dcache and
uses it if one exists. Otherwise, it allocates a new node with the new
name and splices it into the dcache.

Original code from ntfs_lookup in fs/ntfs/namei.c by Anton Altaparmakov.

Signed-off-by: Barry Naujok <bnaujok@sgi.com>
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Acked-by: Christoph Hellwig <hch@infradead.org>
2008-07-28 16:58:39 +10:00
Nick Piggin
83ac6a1ed4 powerpc/mm: Implement _PAGE_SPECIAL & pte_special() for 64-bit
Implement _PAGE_SPECIAL and pte_special() for 64-bit powerpc. This bit will
be used by the fast get_user_pages() to differenciate PTEs that correspond
to a valid struct page from special mappings that don't such as IO mappings
obtained via io_remap_pfn_ranges().

Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-28 16:30:52 +10:00
Nathan Lynch
e9efed3b80 powerpc: Make core id information available to userspace
Existing Open Firmware practice is to report each processor core as a
separate node in the device tree.  Report the value of the "reg" OF
property corresponding to a logical CPU's device node as the core_id
attribute in /sys/devices/system/cpu/cpu*/topology/core_id.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-28 16:30:52 +10:00
Nathan Lynch
440a0857e3 powerpc: Make core sibling information available to userspace
Implement the notion of "core siblings" for powerpc.  This makes
/sys/devices/system/cpu/cpu*/topology/core_siblings present sensible
values, indicating online CPUs which share an L2 cache.

BenH: Made cpu_to_l2cache() use of_find_node_by_phandle() instead
of IBM-specific open coded search

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-28 16:30:51 +10:00
Roland McGrath
7d6d637dac powerpc: Add TIF_NOTIFY_RESUME support for tracehook
This adds TIF_NOTIFY_RESUME support for powerpc.  When set,
we call tracehook_notify_resume() on the way to user mode.
This overloads do_signal() to do the work, but changes its
arguments to it has the TIF_* bits handy in a register and
drops the useless first argument that was always zero.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-28 16:30:50 +10:00
Roland McGrath
f1ba12856b powerpc: Add asm/syscall.h with the tracehook entry points
Add asm/syscall.h for powerpc with all the required entry points.
This will allow arch-independent tracing code for system calls.

BenH: Fixed up use of regs->trap to properly mask low bit

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-28 16:30:49 +10:00
Kumar Gala
ff8dc7698c powerpc: Fix 8xx build failure
The 'powerpc ioremap_prot' broke 8xx builds:

include2/asm/pgtable-ppc32.h:555: error: '_PAGE_WRITETHRU' undeclared (first use in this function)
include2/asm/pgtable-ppc32.h:555: error: (Each undeclared identifier is reported only once
include2/asm/pgtable-ppc32.h:555: error: for each function it appears in.)

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-28 16:30:48 +10:00
Benjamin Herrenschmidt
d65d830ca0 Merge commit 'gcl/gcl-next' 2008-07-28 16:30:40 +10:00
Rusty Russell
eeec4fad96 stop_machine(): stop_machine_run() changed to use cpu mask
Instead of a "cpu" arg with magic values NR_CPUS (any cpu) and ~0 (all
cpus), pass a cpumask_t.  Allow NULL for the common case (where we
don't care which CPU the function is run on): temporary cpumask_t's
are usually considered bad for stack space.

This deprecates stop_machine_run, to be removed soon when all the
callers are dead.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-28 12:16:30 +10:00
Rusty Russell
ffdb5976c4 Simplify stop_machine
stop_machine creates a kthread which creates kernel threads.  We can
create those threads directly and simplify things a little.  Some care
must be taken with CPU hotunplug, which has special needs, but that code
seems more robust than it was in the past.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
2008-07-28 12:16:29 +10:00
Jason Baron
5c2aed6225 stop_machine: add ALL_CPUS option
-allow stop_mahcine_run() to call a function on all cpus. Calling
 stop_machine_run() with a 'ALL_CPUS' invokes this new behavior.
 stop_machine_run() proceeds as normal until the calling cpu has
 invoked 'fn'. Then, we tell all the other cpus to call 'fn'.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
CC: Adrian Bunk <bunk@stusta.de>
CC: Andi Kleen <andi@firstfloor.org>
CC: Alexey Dobriyan <adobriyan@gmail.com>
CC: Christoph Hellwig <hch@infradead.org>
CC: mingo@elte.hu
CC: akpm@osdl.org
2008-07-28 12:16:28 +10:00
Mauro Carvalho Chehab
c2f90e9536 Merge ../linux-2.6 2008-07-27 22:23:18 -03:00
David S. Miller
5cfc177666 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sam/sparc 2008-07-27 17:09:02 -07:00
Linus Torvalds
3e318b5b55 Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] Fix shared mmap when more than two maps of the same file exist
  [ARM] fix VIPT/VIVT macro optimisations, add comments
  [ARM] 5179/1: Replace obsolete IRQT_* and __IRQT_* values with IRQ_TYPE_*
  [ARM] update defconfig for eseries.
  [ARM] PXA: squash warning in pxafb
  [ARM] pxa: PXA25x UDC - Fix warning during build
  [ARM] fix nwflash.c: 6ee8928d94
  [ARM] fix IOP32x, IOP33x, MXC and Samsung builds
  [ARM] pci: provide dummy pci_get_legacy_ide_irq()
  [ARM] fix fls() for 64-bit arguments
  [ARM] fix mode for board-yl-9200.c
  [ARM] 5176/1: arm/Makefile: fix: ARM946T -> ARM946E
2008-07-27 16:46:08 -07:00
Andrea Righi
940389b8af task IO accounting: move all IO statistics in struct task_io_accounting
Simplify the code of include/linux/task_io_accounting.h.

It is also more reasonable to have all the task i/o-related statistics in a
single struct (task_io_accounting).

Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-27 16:12:28 -07:00
Mauro Carvalho Chehab
eb703027ac Merge ../linux-2.6 2008-07-27 18:11:53 -03:00
Sam Ravnborg
a439fe51a1 sparc, sparc64: use arch/sparc/include
The majority of this patch was created by the following script:

***
ASM=arch/sparc/include/asm
mkdir -p $ASM
git mv include/asm-sparc64/ftrace.h $ASM
git rm include/asm-sparc64/*
git mv include/asm-sparc/* $ASM
sed -ie 's/asm-sparc64/asm/g' $ASM/*
sed -ie 's/asm-sparc/asm/g' $ASM/*
***

The rest was an update of the top-level Makefile to use sparc
for header files when sparc64 is being build.
And a small fixlet to pick up the correct unistd.h from
sparc64 code.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-27 23:00:59 +02:00
Linus Torvalds
837b41b5de Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  firewire: state userland requirements in Kconfig help
  firewire: avoid memleak after phy config transmit failure
  firewire: fw-ohci: TSB43AB22/A dualbuffer workaround
  firewire: queue the right number of data
  firewire: warn on unfinished transactions during card removal
  firewire: small fw_fill_request cleanup
  firewire: fully initialize fw_transaction before marking it pending
  firewire: fix race of bus reset with request transmission
2008-07-27 10:24:06 -07:00
Linus Torvalds
211c8d4942 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (59 commits)
  [SCSI] replace __FUNCTION__ with __func__
  [SCSI] extend the last_sector_bug flag to cover more sectors
  [SCSI] qla2xxx: Update version number to 8.02.01-k6.
  [SCSI] qla2xxx: Additional NPIV corrections.
  [SCSI] qla2xxx: suppress uninitialized-var warning
  [SCSI] qla2xxx: use memory_read_from_buffer()
  [SCSI] qla2xxx: Issue proper ISP callbacks during stop-firmware.
  [SCSI] ch: fix ch_remove oops
  [SCSI] 3w-9xxx: add MSI support and misc fixes
  [SCSI] scsi_lib: use blk_rq_tagged in scsi_request_fn
  [SCSI] ibmvfc: Update driver version to 1.0.1
  [SCSI] ibmvfc: Add ADISC support
  [SCSI] ibmvfc: Miscellaneous fixes
  [SCSI] ibmvfc: Fix hang on module removal
  [SCSI] ibmvfc: Target refcounting fixes
  [SCSI] ibmvfc: Reduce unnecessary log noise
  [SCSI] sym53c8xx: free luntbl in sym_hcb_free
  [SCSI] scsi_scan.c: Release mutex in error handling code
  [SCSI] scsi_eh_prep_cmnd should save scmd->underflow
  [SCSI] sd: Support for SCSI disk (SBC) Data Integrity Field
  ...
2008-07-27 10:04:52 -07:00
Linus Torvalds
7a82323da3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
  avr32: some mmc/sd cleanups
  include/video/atmel_lcdc.h must #include <linux/workqueue.h>
  avr32: allow system timer to share interrupt to make OProfile work
  drivers/misc/atmel-ssc.c: Removed duplicated include
  avr32: Add platform data for AC97C platform device
  avr32: clean up mci platform code
  fix avr32 build errors
2008-07-27 10:03:00 -07:00
Linus Torvalds
b0d8aa081b Merge branch 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
  KVM: ppc: fix invalidation of large guest pages
  KVM: s390: Fix possible host kernel bug on lctl(g) handling
  KVM: s390: Fix instruction naming for lctlg
  KVM: s390: Fix program check on interrupt delivery handling
  KVM: s390: Change guestaddr type in gaccess
  KVM: s390: Fix guest kconfig
  KVM: s390: Advertise KVM_CAP_USER_MEMORY
  KVM: ia64: Fix irq disabling leak in error handling code
  KVM: VMX: Fix undefined beaviour of EPT after reload kvm-intel.ko
  KVM: VMX: Fix bypass_guest_pf enabling when disable EPT in module parameter
  KVM: task switch: translate guest segment limit to virt-extension byte granular field
  KVM: Avoid instruction emulation when event delivery is pending
  KVM: task switch: use seg regs provided by subarch instead of reading from GDT
  KVM: task switch: segment base is linear address
  KVM: SVM: allow enabling/disabling NPT by reloading only the architecture module
2008-07-27 10:00:23 -07:00
Linus Torvalds
6948385cbd Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (25 commits)
  setlocalversion: do not describe if there is nothing to describe
  kconfig: fix typos: "Suport" -> "Support"
  kconfig: make defconfig is no longer chatty
  kconfig: make oldconfig is now less chatty
  kconfig: speed up all*config + randconfig
  kconfig: set all new symbols automatically
  kconfig: add diffconfig utility
  kbuild: remove Module.markers during mrproper
  kbuild: sparse needs CF not CHECKFLAGS
  kernel-doc: handle/strip __init
  vmlinux.lds: move __attribute__((__cold__)) functions back into final .text section
  init: fix URL of "The GNU Accounting Utilities"
  kbuild: add arch/$ARCH/include to search path
  kbuild: asm symlink support for arch/$ARCH/include
  kbuild: support arch/$ARCH/include for tags, cscope
  kbuild: prepare headers_* for arch/$ARCH/include
  kbuild: install all headers when arch is changed
  kbuild: make clean removes *.o.* as well
  kbuild: optimize headers_* targets
  kbuild: only one call for include/ in make headers_*
  ...
2008-07-27 09:59:59 -07:00
Andrea Righi
5995477ab7 task IO accounting: improve code readability
Put all i/o statistics in struct proc_io_accounting and use inline functions to
initialize and increment statistics, removing a lot of single variable
assignments.

This also reduces the kernel size as following (with CONFIG_TASK_XACCT=y and
CONFIG_TASK_IO_ACCOUNTING=y).

    text    data     bss     dec     hex filename
   11651       0       0   11651    2d83 kernel/exit.o.before
   11619       0       0   11619    2d63 kernel/exit.o.after
   10886     132     136   11154    2b92 kernel/fork.o.before
   10758     132     136   11026    2b12 kernel/fork.o.after

 3082029  807968 4818600 8708597  84e1f5 vmlinux.o.before
 3081869  807968 4818600 8708437  84e155 vmlinux.o.after

Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Acked-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-27 09:58:20 -07:00
Al Viro
eeb61f719c missing bits of net-namespace / sysctl
Piss-poor sysctl registration API strikes again, film at 11...

What we really need is _pathname_ required to be present in already
registered table, so that kernel could warn about bad order.  That's the
next target for sysctl stuff (and generally saner and more explicit
order of initialization of ipv[46] internals wouldn't hurt either).

For the time being, here are full fixups required by ..._rotable()
stuff; we make per-net sysctl sets descendents of "ro" one and make sure
that sufficient skeleton is there before we start registering per-net
sysctls.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-27 09:45:34 -07:00
Mauro Carvalho Chehab
50cb993ea6 Merge ../linux-2.6 2008-07-27 12:25:57 -03:00
Mauro Carvalho Chehab
9fa0f6db3a V4L/DVB (8522): videodev2: Fix merge conflict
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-27 12:24:02 -03:00
Alan Jenkins
2b14290078 [SCSI] extend the last_sector_bug flag to cover more sectors
The last_sector_bug flag was added to work around a bug in certain usb
cardreaders, where they would crash if a multiple sector read included the
last sector. The original implementation avoids this by e.g. splitting an 8
sector read which includes the last sector into a 7 sector read, and a single
sector read for the last sector.  The flag is enabled for all USB devices.

This revealed a second bug in other usb cardreaders, which crash when they
get a multiple sector read which stops 1 sector short of the last sector.
Affected hardware includes the Kingston "MobileLite" external USB cardreader
and the internal USB cardreader on the Asus EeePC.

Extend the last_sector_bug workaround to ensure that any access which touches
the last 8 hardware sectors of the device is a single sector long.  Requests
are shrunk as necessary to meet this constraint.

This gives us a safety margin against potential unknown or future bugs
affecting multi-sector access to the end of the device.  The two known bugs
only affect the last 2 sectors.  However, they suggest that these devices
are prone to fencepost errors and that multi-sector access to the end of the
device is not well tested.  Popular OS's use multi-sector accesses, but they
rarely read the last few sectors.  Linux (with udev & vol_id) automatically
reads sectors from the end of the device on insertion.  It is assumed that
single sector accesses are more thoroughly tested during development.

Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Tested-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-27 10:16:13 -04:00
Hans Verkuil
de1e575db2 V4L/DVB (8525): fix a few assorted spelling mistakes.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-27 11:07:13 -03:00
Hans Verkuil
c1d7f4f164 V4L/DVB (8524): videodev: copy the VID_TYPE defines to videodev.h
The VID_TYPE defines are V4L1 specific, so copy them back to videodev.h.
In videodev2.h ensure that they are not used in the kernel (you need
to include videodev.h instead) and mark them are deprecated.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-27 11:07:12 -03:00
Hans Verkuil
0ea6bc8d43 V4L/DVB (8523): v4l2-dev: remove unused type and type2 field from video_device
The type and type2 fields were unused and so could be removed.
Instead add a vfl_type field that contains the type of the video
device.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-27 11:07:10 -03:00
Jean-Francois Moine
1250ac6d4a V4L/DVB (8518): gspca: Remove the remaining frame decoding functions from the subdrivers.
SPCA505 and SPCA508 added in the pixel formats.
Decode functions and associated resources removed in spca505, 506 and 508.
The decode routines are now found in the V4L library.

Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-27 11:06:42 -03:00
Mauro Carvalho Chehab
9993e51c0c V4L/DVB (8502): videodev2.h: CodingStyle cleanups
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-27 11:06:20 -03:00
Haavard Skinnemoen
eda3d8f560 Merge commit 'upstream/master' 2008-07-27 13:54:08 +02:00
Russell King
daf93dd55c [ARM] fix VIPT/VIVT macro optimisations, add comments
cacheflush.h was doing:

... VIVT only stuff
... VIPT only stuff
... VIVT or VIPT stuff

which is clearly bogus - we would only ever use the "VIVT or VIPT" case
when both VIVT and VIPT are not selected.  Fix this.

Add comments to each case, including noting the impossibility of
correctly detecting the cache type of ARM926 and ARMv6 cores from
the cache type register in the "VIVT or VIPT" case.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-07-27 10:10:58 +01:00
Hollis Blanchard
cc04454fa8 KVM: ppc: fix invalidation of large guest pages
When guest invalidates a large tlb map, there may be more than one
corresponding shadow tlb maps that need to be invalidated. Use eaddr and eend
to find these shadow tlb maps.

Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 12:02:05 +03:00
Dmitry Baryshkov
6cab486029 [ARM] 5179/1: Replace obsolete IRQT_* and __IRQT_* values with IRQ_TYPE_*
IRQT_* and __IRQT_* were obsoleted long ago by patch [3692/1].
Remove them completely. Sed script for the reference:

s/__IRQT_RISEDGE/IRQ_TYPE_EDGE_RISING/g
s/__IRQT_FALEDGE/IRQ_TYPE_EDGE_FALLING/g
s/__IRQT_LOWLVL/IRQ_TYPE_LEVEL_LOW/g
s/__IRQT_HIGHLVL/IRQ_TYPE_LEVEL_HIGH/g
s/IRQT_RISING/IRQ_TYPE_EDGE_RISING/g
s/IRQT_FALLING/IRQ_TYPE_EDGE_FALLING/g
s/IRQT_BOTHEDGE/IRQ_TYPE_EDGE_BOTH/g
s/IRQT_LOW/IRQ_TYPE_LEVEL_LOW/g
s/IRQT_HIGH/IRQ_TYPE_LEVEL_HIGH/g
s/IRQT_PROBE/IRQ_TYPE_PROBE/g
s/IRQT_NOEDGE/IRQ_TYPE_NONE/g

Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-07-27 09:46:18 +01:00
Christian Borntraeger
f5e10b09a5 KVM: s390: Fix instruction naming for lctlg
Lets fix the name for the lctlg instruction...

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:36:12 +03:00
Martin Schwidefsky
0096369daa KVM: s390: Change guestaddr type in gaccess
All registers are unsigned long types. This patch changes all occurences
of guestaddr in gaccess from u64 to unsigned long.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:35:57 +03:00
Joerg Roedel
5f4cb662a0 KVM: SVM: allow enabling/disabling NPT by reloading only the architecture module
If NPT is enabled after loading both KVM modules on AMD and it should be
disabled, both KVM modules must be reloaded. If only the architecture module is
reloaded the behavior is undefined. With this patch it is possible to disable
NPT only by reloading the kvm_amd module.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27 11:34:09 +03:00
Linus Torvalds
8be1a6d6c7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  mlx4: Update/add Mellanox Technologies copyright lines to mlx4 driver files
  mlx4_core: Add VLAN tag field to WQE control segment struct
  RDMA/nes: CM connection setup/teardown rework
  IPoIB: Correct help text for INFINIBAND_IPOIB_DEBUG
  IPoIB/cm: Connected mode is no longer EXPERIMENTAL
  RDMA/ucm: BKL is not needed for ib_ucm_open()
  RDMA/ucma: BKL is not needed for ucma_open()
2008-07-26 20:40:36 -07:00
Linus Torvalds
9ee08c2df4 Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6: (57 commits)
  [MTD] [NAND] subpage read feature as a way to increase performance. 
  CPUFREQ: S3C24XX NAND driver frequency scaling support.
  [MTD][NAND] au1550nd: remove unused variable
  [MTD] jedec_probe: Fix SST 16-bit chip detection
  [MTD][MTDPART] Fix a division by zero bug
  [MTD][MTDPART] Cleanup and document the erase region handling
  [MTD][MTDPART] Handle most checkpatch findings
  [MTD][MTDPART] Seperate main loop from per-partition code in add_mtd_partition
  [MTD] physmap: resume already suspended chips on failure to suspend
  [MTD] physmap: Fix suspend/resume/shutdown bugs.
  [MTD] [NOR] Fix -ETIMEO errors in CFI driver
  [MTD] [NAND] fsl_elbc_nand: fix section mismatch with CONFIG_MTD_OF_PARTS=y
  [JFFS2] Use .unlocked_ioctl
  [MTD] Fix const assignment in the MTD command line partitioning driver
  [MTD] [NOR] gen_probe: No debug message when debugging is disabled
  [MTD] [NAND] remove __PPC__ hardcoded address from DiskOnChip drivers
  [MTD] [MAPS] Remove the bast-flash driver.
  [MTD] [NAND] fsl_elbc_nand: ecclayout cleanups
  [MTD] [NAND] fsl_elbc_nand: implement support for flash-based BBT
  [MTD] [NAND] fsl_elbc_nand: fix OOB workability for large page NAND chips
  ...
2008-07-26 20:30:56 -07:00
Linus Torvalds
eaf0ba5ef6 Merge branch 'tracehook' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace
* 'tracehook' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace:
  tracehook: comment fixes
2008-07-26 20:29:39 -07:00
Linus Torvalds
bdee6ac7d1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
  atmel-mci: debugfs support
  mmc: Add per-card debugfs support
  mmc: Export internal host state through debugfs
  imxmmc: fix crash when no platform data is provided
  imxmmc: fix platform resources
  imxmmc: remove DEBUG definition
  mmc_spi: put signals to low power off fix
2008-07-26 20:27:31 -07:00
Linus Torvalds
4836e30078 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (39 commits)
  [PATCH] fix RLIM_NOFILE handling
  [PATCH] get rid of corner case in dup3() entirely
  [PATCH] remove remaining namei_{32,64}.h crap
  [PATCH] get rid of indirect users of namei.h
  [PATCH] get rid of __user_path_lookup_open
  [PATCH] f_count may wrap around
  [PATCH] dup3 fix
  [PATCH] don't pass nameidata to __ncp_lookup_validate()
  [PATCH] don't pass nameidata to gfs2_lookupi()
  [PATCH] new (local) helper: user_path_parent()
  [PATCH] sanitize __user_walk_fd() et.al.
  [PATCH] preparation to __user_walk_fd cleanup
  [PATCH] kill nameidata passing to permission(), rename to inode_permission()
  [PATCH] take noexec checks to very few callers that care
  Re: [PATCH 3/6] vfs: open_exec cleanup
  [patch 4/4] vfs: immutable inode checking cleanup
  [patch 3/4] fat: dont call notify_change
  [patch 2/4] vfs: utimes cleanup
  [patch 1/4] vfs: utimes: move owner check into inode_change_ok()
  [PATCH] vfs: use kstrdup() and check failing allocation
  ...
2008-07-26 20:23:44 -07:00
Linus Torvalds
5c7c204aec Merge git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/kkeil/ISDN-2.6:
  Add layer1 over IP support
  Add mISDN HFC multiport driver
  Add mISDN HFC PCI driver
  Add mISDN DSP
  Add mISDN core files
  Define AF_ISDN and PF_ISDN
  Add mISDN driver
2008-07-26 20:19:41 -07:00
Linus Torvalds
2284284281 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  netns: fix ip_rt_frag_needed rt_is_expired
  netfilter: nf_conntrack_extend: avoid unnecessary "ct->ext" dereferences
  netfilter: fix double-free and use-after free
  netfilter: arptables in netns for real
  netfilter: ip{,6}tables_security: fix future section mismatch
  selinux: use nf_register_hooks()
  netfilter: ebtables: use nf_register_hooks()
  Revert "pkt_sched: sch_sfq: dump a real number of flows"
  qeth: use dev->ml_priv instead of dev->priv
  syncookies: Make sure ECN is disabled
  net: drop unused BUG_TRAP()
  net: convert BUG_TRAP to generic WARN_ON
  drivers/net: convert BUG_TRAP to generic WARN_ON
2008-07-26 20:17:56 -07:00
Andrea Righi
510a35d4a4 hugetlb: remove unused variable warning
Remove the following warning when CONFIG_HUGETLB_PAGE is not set:

	ipc/shm.c: In function `shm_get_stat':
	ipc/shm.c:565: warning: unused variable `h'

[akpm@linux-foundation.org: use tabs, not spaces]
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 20:16:47 -07:00
Michael Buesch
6a9436d0c3 gpiolib: fix typo in comment
This fixes an off-by-one error in a comment.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 20:16:47 -07:00
Al Viro
4cc38a1b38 [PATCH] remove remaining namei_{32,64}.h crap
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:43 -04:00
Al Viro
3f8206d496 [PATCH] get rid of indirect users of namei.h
fs.h needs path.h, not namei.h; nfs_fs.h doesn't need it at all.
Several places in the tree needed direct include.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:42 -04:00
Al Viro
964bd18362 [PATCH] get rid of __user_path_lookup_open
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:41 -04:00
Al Viro
516e0cc564 [PATCH] f_count may wrap around
make it atomic_long_t; while we are at it, get rid of useless checks in affs,
hfs and hpfs - ->open() always has it equal to 1, ->release() - to 0.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:40 -04:00
Al Viro
2d8f30380a [PATCH] sanitize __user_walk_fd() et.al.
* do not pass nameidata; struct path is all the callers want.
* switch to new helpers:
	user_path_at(dfd, pathname, flags, &path)
	user_path(pathname, &path)
	user_lpath(pathname, &path)
	user_path_dir(pathname, &path)  (fail if not a directory)
  The last 3 are trivial macro wrappers for the first one.
* remove nameidata in callers.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:34 -04:00
Al Viro
f419a2e3b6 [PATCH] kill nameidata passing to permission(), rename to inode_permission()
Incidentally, the name that gives hundreds of false positives on grep
is not a good idea...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:31 -04:00
Miklos Szeredi
9767d74957 [patch 1/4] vfs: utimes: move owner check into inode_change_ok()
Add a new ia_valid flag: ATTR_TIMES_SET, to handle the
UTIMES_OMIT/UTIMES_NOW and UTIMES_NOW/UTIMES_OMIT cases.  In these
cases neither ATTR_MTIME_SET nor ATTR_ATIME_SET is in the flags, yet
the POSIX draft specifies that permission checking is performed the
same way as if one or both of the times was explicitly set to a
timestamp.

See the path "vfs: utimensat(): fix error checking for
{UTIME_NOW,UTIME_OMIT} case" by Michael Kerrisk for the patch
introducing this behavior.

This is a cleanup, as well as allowing filesystems (NFS/fuse/...) to
perform their own permission checking instead of the default.

CC: Ulrich Drepper <drepper@redhat.com>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:25 -04:00
Li Zefan
88b387824f [PATCH] vfs: use kstrdup() and check failing allocation
- use kstrdup() instead of kmalloc() + memcpy()
- return NULL if allocating ->mnt_devname failed
- mnt_devname should be const

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:24 -04:00
Al Viro
b77b0646ef [PATCH] pass MAY_OPEN to vfs_permission() explicitly
... and get rid of the last "let's deduce mask from nameidata->flags"
bit.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:22 -04:00
Al Viro
a110343f0d [PATCH] fix MAY_CHDIR/MAY_ACCESS/LOOKUP_ACCESS mess
* MAY_CHDIR is redundant - it's an equivalent of MAY_ACCESS
* MAY_ACCESS on fuse should affect only the last step of pathname resolution
* fchdir() and chroot() should pass MAY_ACCESS, for the same reason why
  chdir() needs that.
* now that we pass MAY_ACCESS explicitly in all cases, LOOKUP_ACCESS can be
  removed; it has no business being in nameidata.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:21 -04:00
Al Viro
7f2da1e7d0 [PATCH] kill altroot
long overdue...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:20 -04:00
Al Viro
8bb79224b8 [PATCH] permission checks for chdir need special treatment only on the last step
... so we ought to pass MAY_CHDIR to vfs_permission() instead of having
it triggered on every step of preceding pathname resolution.  LOOKUP_CHDIR
is killed by that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:19 -04:00
Miklos Szeredi
db2e747b14 [patch 5/5] vfs: remove mode parameter from vfs_symlink()
Remove the unused mode parameter from vfs_symlink and callers.

Thanks to Tetsuo Handa for noticing.

CC: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2008-07-26 20:53:18 -04:00
Miklos Szeredi
2f1936b877 [patch 3/5] vfs: change remove_suid() to file_remove_suid()
All calls to remove_suid() are made with a file pointer, because
(similarly to file_update_time) it is called when the file is written.

Clean up callers by passing in a file instead of a dentry.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
2008-07-26 20:53:16 -04:00
Al Viro
e6305c43ed [PATCH] sanitize ->permission() prototype
* kill nameidata * argument; map the 3 bits in ->flags anybody cares
  about to new MAY_... ones and pass with the mask.
* kill redundant gfs2_iop_permission()
* sanitize ecryptfs_permission()
* fix remaining places where ->permission() instances might barf on new
  MAY_... found in mask.

The obvious next target in that direction is permission(9)

folded fix for nfs_permission() breakage from Miklos Szeredi <mszeredi@suse.cz>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:14 -04:00
Al Viro
9043476f72 [PATCH] sanitize proc_sysctl
* keep references to ctl_table_head and ctl_table in /proc/sys inodes
* grab the former during operations, use the latter for access to
  entry if that succeeds
* have ->d_compare() check if table should be seen for one who does lookup;
  that allows us to avoid flipping inodes - if we have the same name resolve
  to different things, we'll just keep several dentries and ->d_compare()
  will reject the wrong ones.
* have ->lookup() and ->readdir() scan the table of our inode first, then
  walk all ctl_table_header and scan ->attached_by for those that are
  attached to our directory.
* implement ->getattr().
* get rid of insane amounts of tree-walking
* get rid of the need to know dentry in ->permission() and of the contortions
  induced by that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:12 -04:00
Al Viro
ae7edecc9b [PATCH] sysctl: keep track of tree relationships
In a sense, that's the heart of the series.  It's based on the following
property of the trees we are actually asked to add: they can be split into
stem that is already covered by registered trees and crown that is entirely
new.  IOW, if a/b and a/c/d are introduced by our tree, then a/c is also
introduced by it.

That allows to associate tree and table entry with each node in the union;
while directory nodes might be covered by many trees, only one will cover
the node by its crown.  And that will allow much saner logics for /proc/sys
in the next patches.  This patch introduces the data structures needed to
keep track of that.

When adding a sysctl table, we find a "parent" one.  Which is to say,
find the deepest node on its stem that already is present in one of the
tables from our table set or its ancestor sets.  That table will be our
parent and that node in it - attachment point.  Add our table to list
anchored in parent, have it refer the parent and contents of attachment
point.  Also remember where its crown lives.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:11 -04:00
Al Viro
bd7b1533cd [PATCH] sysctl: make sure that /proc/sys/net/ipv4 appears before per-ns ones
Massage ipv4 initialization - make sure that net.ipv4 appears as
non-per-net-namespace before it shows up in per-net-namespace sysctls.
That's the only change outside of sysctl.c needed to get sane ordering
rules and data structures for sysctls (esp. for procfs side of that
mess).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:10 -04:00
Al Viro
f7e6ced406 [PATCH] allow delayed freeing of ctl_table_header
Refcount the sucker; instead of freeing it by the end of unregistration
just drop the refcount and free only when it hits zero.  Make sure that
we _always_ make ->unregistering non-NULL in start_unregistering().

That allows anybody to get a reference to such puppy, preventing its
freeing and reuse.  It does *not* block unregistration.  Anybody who
holds such a reference can
	* try to grab a "use" reference (ctl_head_grab()); that will
succeeds if and only if it hadn't entered unregistration yet.  If it
succeeds, we can use it in all normal ways until we release the "use"
reference (with ctl_head_finish()).  Note that this relies on having
->unregistering become non-NULL in all cases when one starts to unregister
the sucker.
	* keep pointers to ctl_table entries; they *can* be freed if
the entire thing is unregistered.  However, if ctl_head_grab() succeeds,
we know that unregistration had not happened (and will not happen until
ctl_head_finish()) and such pointers can be used safely.

IOW, now we can have inodes under /proc/sys keep references to ctl_table
entries, protecting them with references to ctl_table_header and
grabbing the latter for the duration of operations that require access
to ctl_table.  That won't cause deadlocks, since unregistration will not
be stopped by mere keeping a reference to ctl_table_header.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:09 -04:00
Al Viro
734550921e [PATCH] beginning of sysctl cleanup - ctl_table_set
New object: set of sysctls [currently - root and per-net-ns].
Contains: pointer to parent set, list of tables and "should I see this set?"
method (->is_seen(set)).
Current lists of tables are subsumed by that; net-ns contains such a beast.
->lookup() for ctl_table_root returns pointer to ctl_table_set instead of
that to ->list of that ctl_table_set.

[folded compile fixes by rdd for configs without sysctl]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:08 -04:00
Denys Vlasenko
d2d9648ec6 [PATCH] reuse xxx_fifo_fops for xxx_pipe_fops
Merge fifo and pipe file_operations.

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-26 20:53:06 -04:00
Pekka Enberg
93bc4e89c2 netfilter: fix double-free and use-after free
As suggested by Patrick McHardy, introduce a __krealloc() that doesn't
free the original buffer to fix a double-free and use-after-free bug
introduced by me in netfilter that uses RCU.

Reported-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Tested-by: Dieter Ries <clip2@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-26 17:49:33 -07:00
Karsten Keil
af69fb3a8f Add mISDN HFC multiport driver
Enable support for cards with Cologne Chip AG's HFC multiport
chip.

Signed-off-by: Karsten Keil <kkeil@suse.de>
2008-07-27 02:00:43 +02:00
Karsten Keil
960366cf8d Add mISDN DSP
Enable support for digital audio processing capability.
This module may be used for special applications that require
cross connecting of bchannels, conferencing, dtmf decoding
echo cancelation, tone generation, and Blowfish encryption and
decryption.
It may use hardware features if available.

Signed-off-by: Karsten Keil <kkeil@suse.de>
2008-07-27 01:56:38 +02:00
Karsten Keil
1b2b03f8e5 Add mISDN core files
Add mISDN core files

Signed-off-by: Karsten Keil <kkeil@suse.de>
2008-07-27 01:54:58 +02:00
Karsten Keil
04578dd330 Define AF_ISDN and PF_ISDN
Define the address and protocol family value for mISDN.

Signed-off-by: Karsten Keil <kkeil@suse.de>
2008-07-27 01:47:00 +02:00
Haavard Skinnemoen
f4b7f927b5 mmc: Add per-card debugfs support
For each card successfully added to the bus, create a subdirectory under
the host's debugfs root with information about the card.

At the moment, only a single file is added to the card directory for
all cards: "state". It reflects the "state" field in struct mmc_card,
indicating whether the card is present, readonly, etc.

For MMC and SD cards (not SDIO), another file is added: "status".
Reading this file will ask the card about its current status and
return it. This can be useful if the card just refuses to respond to
any commands, which might indicate that the card state is not what the
MMC core thinks it is (due to a missing stop command, for example.)

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-27 01:26:17 +02:00
Haavard Skinnemoen
6edd8ee60a mmc: Export internal host state through debugfs
When CONFIG_DEBUG_FS is set, create a few files under /sys/kernel/debug
containing information about an mmc host's internal state. Currently,
just a single file is created, "ios", which contains information about
the current operating parameters for the bus (clock speed, bus width,
etc.)

Host drivers can add additional files and directories under the host's
root directory by passing the debugfs_root field in struct mmc_host as
the 'parent' parameter to debugfs_create_*.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-27 01:26:16 +02:00
Russell King
d9ecdb282c Merge branch 'for_rmk_13' of git://git.mnementh.co.uk/linux-2.6-im 2008-07-26 23:04:59 +01:00
Roland McGrath
a9906a1919 tracehook: comment fixes
This fixes some typos and errors in <linux/tracehook.h> comments.
No code changes.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-26 14:41:26 -07:00
Roland McGrath
59e52130f0 x86: tracehook: TIF_NOTIFY_RESUME
This adds TIF_NOTIFY_RESUME support for x86, both 64-bit and 32-bit.
When set, we call tracehook_notify_resume() on the way to user mode.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-26 14:38:05 -07:00
Roland McGrath
68bd0f4ef7 x86: tracehook: asm/syscall.h
Add asm/syscall.h for x86 with all the required entry points.
This will allow arch-independent tracing code for system calls.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-26 14:38:01 -07:00
Ian Molton
aafe0ad81d [ARM] pxa: PXA25x UDC - Fix warning during build
Fixes an unterminated ' warning building PXA25X UDC.

Signed-off-by: Ian Molton <spyro@f2s.com>
2008-07-26 22:25:19 +01:00
Linus Torvalds
fb3b806144 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, AMD IOMMU: include amd_iommu_last_bdf in device initialization
  x86: fix IBM Summit based systems' phys_cpu_present_map on 32-bit kernels
  x86, RDC321x: remove gpio.h complications
  x86, RDC321x: add to mach-default
  crashdump: fix undefined reference to `elfcorehdr_addr'
  flag parameters: fix compile error of sys_epoll_create1
2008-07-26 13:25:05 -07:00
Linus Torvalds
7f268a2ba7 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6: (30 commits)
  Blackfin arch: If we double fault, rather than hang forever, reset
  Blackfin arch: When icache is off, make sure people know it
  Blackfin arch: Fix bug - skip single step in high priority interrupt handler instead of disabling all interrupts in single step debugging.
  Blackfin arch: cache the values of vco/sclk/cclk as the overhead of doing so (~24 bytes) is worth avoiding the software mult/div routines
  Blackfin arch: fix bug - IMDMA is not type struct dma_register
  Blackfin arch: check the EXTBANKS field of the DDRCTL1 register to see if we are using both memory banks
  Blackfin arch: Apply Bluetechnix CM-BF527 board support patch
  Blackfin arch: Add unwinding for stack info, and a little more detail on trace buffer
  Blackfin arch: Add ISP1760 board resources to BF548-EZKIT
  Blackfin arch: fix bug - detect 0.1 silicon revision BF527-EZKIT as 0.0 version
  Blackfin arch: add missing IORESOURCE_MEM flags to UART3
  Blackfin arch: Add return value check in bfin_sir_probe(), remove SSYNC().
  Blackfin arch:  Extend sram malloc to handle L2 SRAM.
  Blackfin arch: Remove useless config option.
  Blackfin arch:  change L1 malloc to base on slab cache and lists.
  Blackfin arch: use local labels and ENDPROC() markings
  Blackfin arch: Do not need this dualcore test module in kernel.
  Blackfin arch: Allow ptrace to peek and poke application data in L1 data SRAM.
  Blackfin arch: Add ANOMALY_05000368 workaround
  Blackfin arch: Functional power management support
  ...
2008-07-26 13:23:17 -07:00
Alan Stern
12265709ac [SCSI] scsi_eh_prep_cmnd should save scmd->underflow
This patch (as1116) fixes a bug in scsi_eh_prep_cmnd() and
scsi_eh_restore_cmnd().  These routines are supposed to save any
values they change and restore them later, but someone forgot to
save & restore scmd->underflow.

This fixes part of the problem reported in Bugzilla #9638.

[jejb: fix up rejections around DIF/DIX]
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:56 -04:00
Martin K. Petersen
7027ad72a6 [SCSI] Support devices with protection information
Implement support for DMA of protection information for devices that
are data integrity capable.

 - Add support for mapping an extra scatter-gather list containing
   the protection information.

 - Allocate protection scsi_data_buffer if host is DIX (integrity DMA)
   capable.

 - Accessor function for checking whether a device has protection
   enabled.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:55 -04:00
Martin K. Petersen
db007fc5e2 [SCSI] Command protection operation
Controllers that support DMA of protection information must be told
explicitly how to handle the I/O.  The controller has no knowledge of
the protection capabilities of the target device so this information
must be passed in the scsi_cmnd.

 - The protection operation tells the HBA whether to generate, strip or
   verify protection information.

 - The protection type tells the HBA which layout the target is
   formatted with.  This is necessary because the controller must be
   able to correctly interpret the included protection information in
   order to verify it.

 - When a scsi_cmnd is reused for error handling the protection
   operation must be cleared and saved while error handling is in
   progress.

 - prot_op and prot_type are placed in an existing hole in scsi_cmnd
   and don't cause the structure to grow.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:54 -04:00
Martin K. Petersen
4469f98780 [SCSI] Host protection capabilities
Controllers that support protection information must indicate this to
the SCSI midlayer so that the ULD can prepare scsi_cmnds accordingly.

This patch implements a host mask and various types of protection:

 - DIF Type 1-3 (between HBA and disk)
 - DIX Type 0-3 (between OS and HBA)

The patch also allows the HBA to set the guard type to something
different than the T10-mandated CRC.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:54 -04:00
Hannes Reinecke
ae11b1b36d [SCSI] scsi_dh: attach to hardware handler from dm-mpath
multipath keeps a separate device table which may be
more current than the built-in one.
So we should make sure to always call ->attach whenever
a multipath map with hardware handler is instantiated.
And we should call ->detach on removal, too.

[sekharan: update as per comments from agk]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:53 -04:00
Hannes Reinecke
057ea7c968 [SCSI] scsi_dh: add generic SPC-3 alua handler
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:52 -04:00
Hannes Reinecke
b6ff1b14cd [SCSI] scsi_dh: Update EMC handler
This patch converts the EMC device handler to use a proper
state machine. We now also parse the extended INQUIRY
information to determine if long trespass commands are
supported. And we're now using the long trespass command
correctly. And finally there's now an check at init time
to refuse to attach to devices not supporting EMC-specific
VPD pages.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:51 -04:00
Hannes Reinecke
765cbc6dad [SCSI] scsi_dh: Implement common device table handling
Instead of having each and every driver implement its own
device table scanning code we should rather implement a common
routine and scan the device tables there.
This allows us also to implement a general notifier chain
callback for all device handler instead for one per handler.

[sekharan: Fix rejections caused by conflicting bug fix]
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:51 -04:00
Matthew Wilcox
6d49f63b41 [SCSI] Make host_no an unsigned int
Daniel Debonzi reports that he has managed to wrap host_no.  Increasing
the number of host numbers available to 32-bit from 16-bit allows the
problem to be evaded for another hundred years.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-26 15:14:50 -04:00
Adrian Bunk
9580d85f9c drivers/char/rtc.c: make 2 functions static
The following functions can now become static:
 - rtc_interrupt()
 - rtc_get_rtc_time()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Bernhard Walle <bwalle@suse.de>
Acked-by: Paul Gortmaker <p_gortmaker@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:12 -07:00
Adrian Bunk
7c363b8c65 mm/swapfile.c: make code static
This patch makes the following needlessly global code static:
 - swap_lock
 - nr_swapfiles
 - struct swap_list

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:12 -07:00
Adrian Bunk
15f59adae0 make mm/memory.c:print_bad_pte() static
This patch makes the needlessly global print_bad_pte() static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:12 -07:00
Adrian Bunk
9d8fddfb17 mm/allocpercpu.c: make 4 functions static
This patch makes the following needlessly global functions static:
 - percpu_depopulate()
 - __percpu_depopulate_mask()
 - percpu_populate()
 - __percpu_populate_mask()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:12 -07:00
Roland McGrath
bbc698636e task_current_syscall
This adds the new function task_current_syscall() on machines where the
asm/syscall.h interface is supported (CONFIG_HAVE_ARCH_TRACEHOOK).  It's
exported for modules to use in the future.  This function safely samples
the state of a blocked thread to collect what system call it is blocked
in, and the six system call argument registers.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:10 -07:00
Roland McGrath
85ba2d862e tracehook: wait_task_inactive
This extends wait_task_inactive() with a new argument so it can be used in
a "soft" mode where it will check for the task changing state unexpectedly
and back off.  There is no change to existing callers.  This lays the
groundwork to allow robust, noninvasive tracing that can try to sample a
blocked thread but back off safely if it wakes up.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:09 -07:00
Roland McGrath
828c365cc8 tracehook: asm/syscall.h
This adds asm-generic/syscall.h, which documents what a real
asm-ARCH/syscall.h file should define.  This is not used yet, but will
provide all the machine-dependent details of examining a user system call
about to begin, in progress, or just ended.

Each arch should add an asm-ARCH/syscall.h that defines all the entry
points documented in asm-generic/syscall.h, as short inlines if possible.
This lets us write new tracing code that understands user system call
registers, without any new arch-specific work.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:09 -07:00
Roland McGrath
64b1208d5b tracehook: TIF_NOTIFY_RESUME
This adds tracehook.h inlines to enable a new arch feature in support of
user debugging/tracing.  This is not used yet, but it lays the groundwork
for a debugger to be able to wrangle a task that's possibly running,
without interrupting its syscalls in progress.

Each arch should define TIF_NOTIFY_RESUME, and in their entry.S code treat
it much like TIF_SIGPENDING.  That is, it causes you to take the slow path
when returning to user mode, where you get the full user-mode state
accessible as for signal handling or ptrace.  The arch code should check
TIF_NOTIFY_RESUME after handling TIF_SIGPENDING.  When it's set, clear it
and then call tracehook_notify_resume().

In future, tracing code will call set_notify_resume() when it wants to get
a callback in tracehook_notify_resume().

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:09 -07:00
Roland McGrath
b787f7ba67 tracehook: force signal_pending()
This defines a new hook tracehook_force_sigpending() that lets tracing
code decide to force TIF_SIGPENDING on in recalc_sigpending().

This is not used yet, so it compiles away to nothing for now.  It lays the
groundwork for new tracing code that can interrupt a task synthetically
without actually sending a signal.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:09 -07:00
Roland McGrath
2b2a1ff64a tracehook: death
This moves the ptrace logic in task death (exit_notify) into tracehook.h
inlines.  Some code is rearranged slightly to make things nicer.  There is
no change, only cleanup.

There is one hook called with the tasklist_lock write-locked, as ptrace
needs.  There is also a new hook called after exit_state changes and
without locks.  This is a better place for tracing work to be in the
future, since it doesn't delay the whole system with locking.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:09 -07:00
Roland McGrath
fa00b80b3c tracehook: job control
This defines the tracehook_notify_jctl() hook to formalize the ptrace
effects on the job control notifications.  There is no change, only
cleanup.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:09 -07:00
Roland McGrath
7bcf6a2ca5 tracehook: get_signal_to_deliver
This defines the tracehook_get_signal() hook to allow tracing code to slip
in before normal signal dequeuing.  This lays the groundwork for new
tracing features that can inject synthetic signals outside the normal
queue or control the disposition of delivered signals.  The calling
convention lets tracehook_get_signal() decide both exactly what will
happen and what signal number to report in the handler/exit.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:09 -07:00
Roland McGrath
283d7559e7 tracehook: syscall
This adds standard tracehook.h inlines for arch code to call when
TIF_SYSCALL_TRACE has been set.  This replaces having each arch implement
the ptrace guts for its syscall tracing support.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:09 -07:00
Roland McGrath
445a91d2fe tracehook: tracehook_consider_fatal_signal
This defines tracehook_consider_fatal_signal() has a fine-grained hook for
deciding to skip the special cases for a fatal signal, as ptrace does.
There is no change, only cleanup.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:09 -07:00
Roland McGrath
35de254dc6 tracehook: tracehook_consider_ignored_signal
This defines tracehook_consider_ignored_signal() has a fine-grained hook
for deciding to prevent the normal short-circuit of sending an ignored
signal, as ptrace does.  There is no change, only cleanup.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:09 -07:00
Roland McGrath
c45aea2761 tracehook: tracehook_signal_handler
This defines tracehook_signal_handler() as a hook for the arch signal
handling code to call.  It gives ptrace the opportunity to stop for a
pseudo-single-step trap immediately after signal handler setup is done.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:09 -07:00
Roland McGrath
fa8e26ccd4 tracehook: tracehook_expect_breakpoints
This adds tracehook_expect_breakpoints() as a formal hook for the nommu
code to use for its, "Is text-poking likely?" check at mmap time.  This
names the actual semantics the code means to test, and documents it.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:09 -07:00
Roland McGrath
0d094efeb1 tracehook: tracehook_tracer_task
This adds the tracehook_tracer_task() hook to consolidate all forms of
"Who is using ptrace on me?" logic.  This is used for "TracerPid:" in
/proc and for permission checks.  We also clean up the selinux code the
called an identical accessor.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:08 -07:00
Roland McGrath
dae33574dc tracehook: release_task
This moves the ptrace-related logic from release_task into tracehook.h and
ptrace.h inlines.  It provides clean hooks both before and after locking
tasklist_lock, for future tracing logic to do more cleanup without the
lock.

This also changes release_task() itself in the rare "zap_leader" case to
set the leader to EXIT_DEAD before iterating.  This maintains the
invariant that release_task() only ever handles a task in EXIT_DEAD.  This
is a common-sense invariant that is already always true except in this one
arcane case of zombie leader whose parent ignores SIGCHLD.

This change is harmless and only costs one store in this one rare case.
It keeps the expected state more consisently sane, which is nicer when
debugging weirdness in release_task().  It also lets some future code in
the tracehook entry points rely on this invariant for bookkeeping.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:08 -07:00
Roland McGrath
daded34be9 tracehook: vfork-done
This moves the PTRACE_EVENT_VFORK_DONE tracing into a tracehook.h inline,
tracehook_report_vfork_done().  The change has no effect, just clean-up.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:08 -07:00
Roland McGrath
09a05394fe tracehook: clone
This moves all the ptrace initialization and tracing logic for task
creation into tracehook.h and ptrace.h inlines.  It reorganizes the code
slightly, but should not change any behavior.

There are four tracehook entry points, at each important stage of task
creation.  This keeps the interface from the core fork.c code fairly
clean, while supporting the complex setup required for ptrace or something
like it.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:08 -07:00
Roland McGrath
30199f5a46 tracehook: exit
This moves the PTRACE_EVENT_EXIT tracing into a tracehook.h inline,
tracehook_report_exec().  The change has no effect, just clean-up.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:08 -07:00
Roland McGrath
6341c393fc tracehook: exec
This moves all the ptrace hooks related to exec into tracehook.h inlines.

This also lifts the calls for tracing out of the binfmt load_binary hooks
into search_binary_handler() after it calls into the binfmt module.  This
change has no effect, since all the binfmt modules' load_binary functions
did the call at the end on success, and now search_binary_handler() does
it immediately after return if successful.  We consolidate the repeated
code, and binfmt modules no longer need to import ptrace_notify().

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:08 -07:00
Roland McGrath
88ac2921a7 tracehook: add linux/tracehook.h
This patch series introduces the "tracehook" interface layer of inlines in
<linux/tracehook.h>.  There are more details in the log entry for patch
01/23 and in the header file comments inside that patch.  Most of these
changes move code around with little or no change, and they should not
break anything or change any behavior.

This sets a new standard for uniform arch support to enable clean
arch-independent implementations of new debugging and tracing stuff,
denoted by CONFIG_HAVE_ARCH_TRACEHOOK.  Patch 20/23 adds that symbol to
arch/Kconfig, with comments listing everything an arch has to do before
setting "select HAVE_ARCH_TRACEHOOK".  These are elaborted a bit at:

	http://sourceware.org/systemtap/wiki/utrace/arch/HowTo

The new inlines that arch code must define or call have detailed kerneldoc
comments in the generic header files that say what is required.

No arch is obligated to do any work, and no arch's build should be broken
by these changes.  There are several steps that each arch should take so
it can set HAVE_ARCH_TRACEHOOK.  Most of these are simple.  Providing this
support will let new things people add for doing debugging and tracing of
user-level threads "just work" for your arch in the future.  For an arch
that does not provide HAVE_ARCH_TRACEHOOK, some new options for such
features will not be available for config.

I have done some arch work and will submit this to the arch maintainers
after the generic tracehook series settles in.  For now, that work is
available in my GIT repositories, and in patch and mbox-of-patches form at
http://people.redhat.com/roland/utrace/2.6-current/

This paves the way for my "utrace" work, to be submitted later.  But it is
not innately tied to that.  I hope that the tracehook series can go in
soon regardless of what eventually does or doesn't go on top of it.  For
anyone implementing any kind of new tracing/debugging plan, or just
understanding all the context of the existing ptrace implementation,
having tracehook.h makes things much easier to find and understand.

This patch:

This adds the new kernel-internal header file <linux/tracehook.h>.  This
is not yet used at all.  The comments in the header introduce what the
following series of patches is about.

The aim is to formalize and consolidate all the places that the core
kernel code and the arch code now ties into the ptrace implementation.

These patches mostly don't cause any functional change.  They just move
the details of ptrace logic out of core code into tracehook.h inlines,
where they are mostly compiled away to the same as before.  All that
changes is that everything is thoroughly documented and any future
reworking of ptrace, or addition of something new, would not have to touch
core code all over, just change the tracehook.h inlines.

The new linux/ptrace.h inlines are used by the following patches in the
new tracehook_*() inlines.  Using these helpers for the ptrace event stops
makes it simple to change or disable the old ptrace implementation of
these stops conditionally later.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:08 -07:00
Alexey Dobriyan
51cc50685a SL*B: drop kmem cache argument from constructor
Kmem cache passed to constructor is only needed for constructors that are
themselves multiplexeres.  Nobody uses this "feature", nor does anybody uses
passed kmem cache in non-trivial way, so pass only pointer to object.

Non-trivial places are:
	arch/powerpc/mm/init_64.c
	arch/powerpc/mm/hugetlbpage.c

This is flag day, yes.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Jon Tollefson <kniht@linux.vnet.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Matt Mackall <mpm@selenic.com>
[akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
[akpm@linux-foundation.org: fix mm/slab.c]
[akpm@linux-foundation.org: fix ubifs]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:07 -07:00
Nick Piggin
19fd623127 mm: spinlock tree_lock
mapping->tree_lock has no read lockers.  convert the lock from an rwlock
to a spinlock.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:06 -07:00
Nick Piggin
e286781d5f mm: speculative page references
If we can be sure that elevating the page_count on a pagecache page will
pin it, we can speculatively run this operation, and subsequently check to
see if we hit the right page rather than relying on holding a lock or
otherwise pinning a reference to the page.

This can be done if get_page/put_page behaves consistently throughout the
whole tree (ie.  if we "get" the page after it has been used for something
else, we must be able to free it with a put_page).

Actually, there is a period where the count behaves differently: when the
page is free or if it is a constituent page of a compound page.  We need
an atomic_inc_not_zero operation to ensure we don't try to grab the page
in either case.

This patch introduces the core locking protocol to the pagecache (ie.
adds page_cache_get_speculative, and tweaks some update-side code to make
it work).

Thanks to Hugh for pointing out an improvement to the algorithm setting
page_count to zero when we have control of all references, in order to
hold off speculative getters.

[kamezawa.hiroyu@jp.fujitsu.com: fix migration_entry_wait()]
[hugh@veritas.com: fix add_to_page_cache]
[akpm@linux-foundation.org: repair a comment]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Jeff Garzik <jeff@garzik.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:06 -07:00
Nick Piggin
47feff2c8e radix-tree: add gang_lookup_slot, gang_lookup_slot_tag
Introduce gang_lookup_slot() and gang_lookup_slot_tag() functions, which
are used by lockless pagecache.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:06 -07:00
Nick Piggin
8174c430e4 x86: lockless get_user_pages_fast()
Implement get_user_pages_fast without locking in the fastpath on x86.

Do an optimistic lockless pagetable walk, without taking mmap_sem or any
page table locks or even mmap_sem.  Page table existence is guaranteed by
turning interrupts off (combined with the fact that we're always looking
up the current mm, means we can do the lockless page table walk within the
constraints of the TLB shootdown design).  Basically we can do this
lockless pagetable walk in a similar manner to the way the CPU's pagetable
walker does not have to take any locks to find present ptes.

This patch (combined with the subsequent ones to convert direct IO to use
it) was found to give about 10% performance improvement on a 2 socket 8
core Intel Xeon system running an OLTP workload on DB2 v9.5

 "To test the effects of the patch, an OLTP workload was run on an IBM
  x3850 M2 server with 2 processors (quad-core Intel Xeon processors at
  2.93 GHz) using IBM DB2 v9.5 running Linux 2.6.24rc7 kernel.  Comparing
  runs with and without the patch resulted in an overall performance
  benefit of ~9.8%.  Correspondingly, oprofiles showed that samples from
  __up_read and __down_read routines that is seen during thread contention
  for system resources was reduced from 2.8% down to .05%.  Monitoring the
  /proc/vmstat output from the patched run showed that the counter for
  fast_gup contained a very high number while the fast_gup_slow value was
  zero."

(fast_gup is the old name for get_user_pages_fast, fast_gup_slow is a
counter we had for the number of times the slowpath was invoked).

The main reason for the improvement is that DB2 has multiple threads each
issuing direct-IO.  Direct-IO uses get_user_pages, and thus the threads
contend the mmap_sem cacheline, and can also contend on page table locks.

I would anticipate larger performance gains on larger systems, however I
think DB2 uses an adaptive mix of threads and processes, so it could be
that thread contention remains pretty constant as machine size increases.
In which case, we stuck with "only" a 10% gain.

The downside of using get_user_pages_fast is that if there is not a pte
with the correct permissions for the access, we end up falling back to
get_user_pages and so the get_user_pages_fast is a bit of extra work.
However this should not be the common case in most performance critical
code.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: Kconfig fix]
[akpm@linux-foundation.org: Makefile fix/cleanup]
[akpm@linux-foundation.org: warning fix]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:06 -07:00
Nick Piggin
21cc199baa mm: introduce get_user_pages_fast
Introduce a new get_user_pages_fast mm API, which is basically a
get_user_pages with a less general API (but still tends to be suited to
the common case):

- task and mm are always current and current->mm
- force is always 0
- pages is always non-NULL
- don't pass back vmas

This restricted API can be implemented in a much more scalable way on many
architectures when the ptes are present, by walking the page tables
locklessly (no mmap_sem or page table locks).  When the ptes are not
populated, get_user_pages_fast() could be slower.

This is implemented locklessly on x86, and used in some key direct IO call
sites, in later patches, which provides nearly 10% performance improvement
on a threaded database workload.

Lots of other code could use this too, depending on use cases (eg.  grep
drivers/).  And it might inspire some new and clever ways to use it.

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:05 -07:00
Nick Piggin
a0a8f5364a x86: implement pte_special
Implement the pte_special bit for x86.  This is required to support
lockless get_user_pages, because we need to know whether or not we can
refcount a particular page given only its pte (and no vma).

[hugh@veritas.com: fix a BUG]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:05 -07:00
Huang Weiyi
080ccd4573 include/linux/aio.h: removed duplicated include
Removed duplicated include <linux/uio.h> in include/linux/aio.h

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:04 -07:00
Eduard - Gabriel Munteanu
20d8b67c06 relay: add buffer-only channels; useful for early logging
Allows one to create and use a channel with no associated files.  Files
can be initialized later.  This is useful in scenarios such as logging in
early code, before VFS is up.  Therefore, such channels can be created and
used as soon as kmem_cache_init() completed.

This is needed by kmemtrace to do tracing in early kernel code.

[kosaki.motohiro@jp.fujitsu.com: build fix]
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:04 -07:00
Eduard - Gabriel Munteanu
7babe8db99 Full conversion to early_initcall() interface, remove old interface
A previous patch added the early_initcall(), to allow a cleaner hooking of
pre-SMP initcalls.  Now we remove the older interface, converting all
existing users to the new one.

[akpm@linux-foundation.org: cleanups]
[akpm@linux-foundation.org: build fix]
[kosaki.motohiro@jp.fujitsu.com: warning fix]
[kosaki.motohiro@jp.fujitsu.com: warning fix]
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:04 -07:00
Eduard - Gabriel Munteanu
c2147a5092 Better interface for hooking early initcalls
Added early initcall (pre-SMP) support, using an identical interface to
that of regular initcalls.  Functions called from do_pre_smp_initcalls()
could be converted to use this cleaner interface.

This is required by CPU hotplug, because early users have to register
notifiers before going SMP.  One such CPU hotplug user is the relay
interface with buffer-only channels, which needs to register such a
notifier, to be usable in early code.  This in turn is used by kmemtrace.

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:04 -07:00
Huang Ying
89081d17f7 kexec jump: save/restore device state
This patch implements devices state save/restore before after kexec.

This patch together with features in kexec_jump patch can be used for
following:

- A simple hibernation implementation without ACPI support.  You can kexec a
  hibernating kernel, save the memory image of original system and shutdown
  the system.  When resuming, you restore the memory image of original system
  via ordinary kexec load then jump back.

- Kernel/system debug through making system snapshot.  You can make system
  snapshot, jump back, do some thing and make another system snapshot.

- Cooperative multi-kernel/system.  With kexec jump, you can switch between
  several kernels/systems quickly without boot process except the first time.
  This appears like swap a whole kernel/system out/in.

- A general method to call program in physical mode (paging turning
  off). This can be used to invoke BIOS code under Linux.

The following user-space tools can be used with kexec jump:

- kexec-tools needs to be patched to support kexec jump. The patches
  and the precompiled kexec can be download from the following URL:
       source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
       patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
       binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10

- makedumpfile with patches are used as memory image saving tool, it
  can exclude free pages from original kernel memory image file. The
  patches and the precompiled makedumpfile can be download from the
  following URL:
       source: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-src_cvs_kh10.tar.bz2
       patches: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile-patches_cvs_kh10.tar.bz2
       binary: http://khibernation.sourceforge.net/download/release_v10/makedumpfile/makedumpfile_cvs_kh10

- An initramfs image can be used as the root file system of kexeced
  kernel. An initramfs image built with "BuildRoot" can be downloaded
  from the following URL:
       initramfs image: http://khibernation.sourceforge.net/download/release_v10/initramfs/rootfs_cvs_kh10.gz
  All user space tools above are included in the initramfs image.

Usage example of simple hibernation:

1. Compile and install patched kernel with following options selected:

CONFIG_X86_32=y
CONFIG_RELOCATABLE=y
CONFIG_KEXEC=y
CONFIG_CRASH_DUMP=y
CONFIG_PM=y
CONFIG_HIBERNATION=y
CONFIG_KEXEC_JUMP=y

2. Build an initramfs image contains kexec-tool and makedumpfile, or
   download the pre-built initramfs image, called rootfs.gz in
   following text.

3. Prepare a partition to save memory image of original kernel, called
   hibernating partition in following text.

4. Boot kernel compiled in step 1 (kernel A).

5. In the kernel A, load kernel compiled in step 1 (kernel B) with
   /sbin/kexec. The shell command line can be as follow:

   /sbin/kexec --load-preserve-context /boot/bzImage --mem-min=0x100000
     --mem-max=0xffffff --initrd=rootfs.gz

6. Boot the kernel B with following shell command line:

   /sbin/kexec -e

7. The kernel B will boot as normal kexec. In kernel B the memory
   image of kernel A can be saved into hibernating partition as
   follow:

   jump_back_entry=`cat /proc/cmdline | tr ' ' '\n' | grep kexec_jump_back_entry | cut -d '='`
   echo $jump_back_entry > kexec_jump_back_entry
   cp /proc/vmcore dump.elf

   Then you can shutdown the machine as normal.

8. Boot kernel compiled in step 1 (kernel C). Use the rootfs.gz as
   root file system.

9. In kernel C, load the memory image of kernel A as follow:

   /sbin/kexec -l --args-none --entry=`cat kexec_jump_back_entry` dump.elf

10. Jump back to the kernel A as follow:

   /sbin/kexec -e

   Then, kernel A is resumed.

Implementation point:

To support jumping between two kernels, before jumping to (executing)
the new kernel and jumping back to the original kernel, the devices
are put into quiescent state, and the state of devices and CPU is
saved. After jumping back from kexeced kernel and jumping to the new
kernel, the state of devices and CPU are restored accordingly. The
devices/CPU state save/restore code of software suspend is called to
implement corresponding function.

Known issues:

- Because the segment number supported by sys_kexec_load is limited,
  hibernation image with many segments may not be load. This is
  planned to be eliminated by adding a new flag to sys_kexec_load to
  make a image can be loaded with multiple sys_kexec_load invoking.

Now, only the i386 architecture is supported.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:04 -07:00
Huang Ying
3ab8352137 kexec jump
This patch provides an enhancement to kexec/kdump.  It implements the
following features:

- Backup/restore memory used by the original kernel before/after
  kexec.

- Save/restore CPU state before/after kexec.

The features of this patch can be used as a general method to call program in
physical mode (paging turning off).  This can be used to call BIOS code under
Linux.

kexec-tools needs to be patched to support kexec jump. The patches and
the precompiled kexec can be download from the following URL:

       source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
       patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
       binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10

Usage example of calling some physical mode code and return:

1. Compile and install patched kernel with following options selected:

CONFIG_X86_32=y
CONFIG_KEXEC=y
CONFIG_PM=y
CONFIG_KEXEC_JUMP=y

2. Build patched kexec-tool or download the pre-built one.

3. Build some physical mode executable named such as "phy_mode"

4. Boot kernel compiled in step 1.

5. Load physical mode executable with /sbin/kexec. The shell command
   line can be as follow:

   /sbin/kexec --load-preserve-context --args-none phy_mode

6. Call physical mode executable with following shell command line:

   /sbin/kexec -e

Implementation point:

To support jumping without reserving memory.  One shadow backup page (source
page) is allocated for each page used by kexeced code image (destination
page).  When do kexec_load, the image of kexeced code is loaded into source
pages, and before executing, the destination pages and the source pages are
swapped, so the contents of destination pages are backupped.  Before jumping
to the kexeced code image and after jumping back to the original kernel, the
destination pages and the source pages are swapped too.

C ABI (calling convention) is used as communication protocol between
kernel and called code.

A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to
indicate that the loaded kernel image is used for jumping back.

Now, only the i386 architecture is supported.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:04 -07:00
Alex Dubov
17017d8d2c memstick: add "start" and "stop" methods to memstick device
In some cases it may be desirable to ensure that associated driver is not
going to access the media in some period of time.  "start" and "stop"
methods are provided therefore to allow it.

Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:04 -07:00
Alex Dubov
b77899985b memstick: allow "set_param" method to return an error code
Some controllers (Jmicron, for instance) can report temporal failure
condition during power-on.  It is desirable to account for this using a
return value of "set_param" device method.  The return value can also be
handy to distinguish between supported and unsupported device parameters
in run time.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alex Dubov <oakad@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:04 -07:00
Adrian Bunk
929dfb24fb parport/share.c: proper externs
This patch adds proper externs for parport_default_timeslice and
parport_default_spintime in include/linux/parport.h

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:03 -07:00
Alexis Bruemmer
1956a96de4 x86 calgary: fix handling of devices that aren't behind the Calgary
The calgary code can give drivers addresses above 4GB which is very bad
for hardware that is only 32bit DMA addressable.

With this patch, the calgary code sets the global dma_ops to swiotlb or
nommu properly, and the dma_ops of devices behind the Calgary/CalIOC2
to calgary_dma_ops.  So the calgary code can handle devices safely that
aren't behind the Calgary/CalIOC2.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alexis Bruemmer <alexisb@us.ibm.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:03 -07:00
FUJITA Tomonori
8d8bb39b9e dma-mapping: add the device argument to dma_mapping_error()
Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER
architecture does:

This enables us to cleanly fix the Calgary IOMMU issue that some devices
are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423).

I think that per-device dma_mapping_ops support would be also helpful for
KVM people to support PCI passthrough but Andi thinks that this makes it
difficult to support the PCI passthrough (see the above thread).  So I
CC'ed this to KVM camp.  Comments are appreciated.

A pointer to dma_mapping_ops to struct dev_archdata is added.  If the
pointer is non NULL, DMA operations in asm/dma-mapping.h use it.  If it's
NULL, the system-wide dma_ops pointer is used as before.

If it's useful for KVM people, I plan to implement a mechanism to register
a hook called when a new pci (or dma capable) device is created (it works
with hot plugging).  It enables IOMMUs to set up an appropriate
dma_mapping_ops per device.

The major obstacle is that dma_mapping_error doesn't take a pointer to the
device unlike other DMA operations.  So x86 can't have dma_mapping_ops per
device.  Note all the POWER IOMMUs use the same dma_mapping_error function
so this is not a problem for POWER but x86 IOMMUs use different
dma_mapping_error functions.

The first patch adds the device argument to dma_mapping_error.  The patch
is trivial but large since it touches lots of drivers and dma-mapping.h in
all the architecture.

This patch:

dma_mapping_error() doesn't take a pointer to the device unlike other DMA
operations.  So we can't have dma_mapping_ops per device.

Note that POWER already has dma_mapping_ops per device but all the POWER
IOMMUs use the same dma_mapping_error function.  x86 IOMMUs use device
argument.

[akpm@linux-foundation.org: fix sge]
[akpm@linux-foundation.org: fix svc_rdma]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix bnx2x]
[akpm@linux-foundation.org: fix s2io]
[akpm@linux-foundation.org: fix pasemi_mac]
[akpm@linux-foundation.org: fix sdhci]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix sparc]
[akpm@linux-foundation.org: fix ibmvscsi]
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:03 -07:00
Adrian Bunk
44ccac13c7 include/video/atmel_lcdc.h must #include <linux/workqueue.h>
This patch fixes the following compile error caused by commit
d22579b837 ("atmel_lcdfb: FIFO underflow
management"):

  In file included from arch/avr32/boards/atstk1000/atstk1004.c:21:
  include/video/atmel_lcdc.h:40: error: field 'task' has incomplete type

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:01 -07:00
Andrew Morton
16d69265b9 uninline arch_pick_mmap_layout()
Fix this, on avr32:

  include/linux/utsname.h:35,
                   from init/main.c:20:
  include/linux/sched.h: In function 'arch_pick_mmap_layout':
  include/linux/sched.h:2149: error: implicit declaration of function 'PAGE_ALIGN'

Reported-by: Adrian Bunk <bunk@kernel.org>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-26 12:00:01 -07:00
Kumar Gala
4c920de37d powerpc: Fix 8xx build failure
The 'powerpc ioremap_prot' broke 8xx builds:

include2/asm/pgtable-ppc32.h:555: error: '_PAGE_WRITETHRU' undeclared (first use in this function)
include2/asm/pgtable-ppc32.h:555: error: (Each undeclared identifier is reported only once
include2/asm/pgtable-ppc32.h:555: error: for each function it appears in.)

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-07-26 12:55:09 -05:00
Adrian Bunk
87a9f70465 include/video/atmel_lcdc.h must #include <linux/workqueue.h>
This patch fixes the following compile error caused by
commit d22579b837
(atmel_lcdfb: FIFO underflow management):

<--  snip  -->

...
  CC      arch/avr32/boards/atstk1000/atstk1004.o
In file included from /home/bunk/linux/kernel-2.6/git/linux-2.6/arch/avr32/boards/atstk1000/atstk1004.c:21:
/home/bunk/linux/kernel-2.6/git/linux-2.6/include/video/atmel_lcdc.h:40: error: field 'task' has incomplete type
make[2]: *** [arch/avr32/boards/atstk1000/atstk1004.o] Error 1

<--  snip  -->

Reported-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-07-26 18:40:05 +02:00
Mauro Carvalho Chehab
fdd2a7e2da V4L/DVB (8500a): videotext.h: whitespace cleanup
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-26 13:25:25 -03:00
Hans Verkuil
f894dfd735 V4L/DVB (8488): videodev: remove some CONFIG_VIDEO_V4L1_COMPAT code from v4l2-dev.h
The video_device_create_file and video_device_remove_file functions can be
removed from v4l2-dev.h, removing the dependency on videodev.h in v4l2-dev.h.

Also removed a few more videodev.h includes that should have been videodev2.h.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-26 13:18:11 -03:00
Hans Verkuil
9c39d7eafa V4L/DVB (8483): Remove obsolete owner field from video_device struct.
According to an old comment this should have been removed in 2.6.15.
Better late than never...

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-26 12:55:07 -03:00
Hans Verkuil
a399810ca6 V4L/DVB (8482): videodev: move all ioctl callbacks to a new v4l2_ioctl_ops struct
All ioctl callbacks are now stored in a new v4l2_ioctl_ops struct. Drivers fill in
a const struct v4l2_ioctl_ops and video_device just contains a const pointer to it.

This ensures a clean separation between the const ops struct and the non-const
video_device struct.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-26 12:54:58 -03:00
Hans Verkuil
b654fcdc0e V4L/DVB (8479): tveeprom/ivtv: fix usage of has_ir field
has_ir was set to and compared to -1 in several cases, even though it is
an u32. ivtv also contained a FIXME for an old kernel that could be
removed.

Thanks to Roel Kluin for creating an initial patch for this. Although
I chose a different solution here it did help in pointing out the problem.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-26 12:54:42 -03:00
Hans Verkuil
38f9d30859 V4L/DVB (8477): v4l: remove obsolete audiochip.h
Converted the last users of audiochip.h to the v4l2-chip-ident.h header
and remove the now unused audiochip.h header.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-26 12:54:40 -03:00
Mike Travis
b8d317d10c cpumask: make cpumask_of_cpu_map generic
If an arch doesn't define cpumask_of_cpu_map, create a generic
statically-initialized one for them.  This allows removal of the buggy
cpumask_of_cpu() macro (&cpumask_of_cpu() gives address of
out-of-scope var).

An arch with NR_CPUS of 4096 probably wants to allocate this itself
based on the actual number of CPUs, since otherwise they're using 2MB
of rodata (1024 cpus means 128k).  That's what
CONFIG_HAVE_CPUMASK_OF_CPU_MAP is for (only x86/64 does so at the
moment).

In future as we support more CPUs, we'll need to resort to a
get_cpu_map()/put_cpu_map() allocation scheme.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-26 16:40:32 +02:00
Suresh Siddha
6ffac1e90a x64, fpu: fix possible FPU leakage in error conditions
On Thu, Jul 24, 2008 at 03:43:44PM -0700, Linus Torvalds wrote:
> So how about this patch as a starting point? This is the RightThing(tm) to
> do regardless, and if it then makes it easier to do some other cleanups,
> we should do it first. What do you think?

restore_fpu_checking() calls init_fpu() in error conditions.

While this is wrong(as our main intention is to clear the fpu state of
the thread), this was benign before commit 92d140e21f ("x86: fix taking
DNA during 64bit sigreturn").

Post commit 92d140e21f, live FPU registers may not belong to this
process at this error scenario.

In the error condition for restore_fpu_checking() (especially during the
64bit signal return), we are doing init_fpu(), which saves the live FPU
register state (possibly belonging to some other process context) into
the thread struct (through unlazy_fpu() in init_fpu()). This is wrong
and can leak the FPU data.

For the signal handler restore error condition in restore_i387(), clear
the fpu state present in the thread struct(before ultimately sending a
SIGSEGV for badframe).

For the paranoid error condition check in math_state_restore(), send a
SIGSEGV, if we fail to restore the state.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: <stable@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-26 16:37:04 +02:00
Russell King
dd438e77f0 [ARM] pci: provide dummy pci_get_legacy_ide_irq()
This fixes footbridge_defconfig:

drivers/pnp/resource.c: In function 'pci_dev_uses_irq':
drivers/pnp/resource.c:317: error: implicit declaration of function 'pci_get_legacy_ide_irq'

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-07-26 15:23:26 +01:00
Andrew Morton
0c65f459ce [ARM] fix fls() for 64-bit arguments
arm's fls() is implemented as a macro, causing it to misbehave when passed
64-bit arguments.  Fix.

Cc: Nickolay Vinogradov <nickolay@protei.ru>
Tested-by: Krzysztof Halasa <khc@pm.waw.pl>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-07-26 15:23:25 +01:00
Joerg Roedel
3bc9f79ee1 iommu: add iommu_num_pages helper function
Calculating the number of pages from given address and length numbers is a task
required in multiple IOMMU implementations. So implement this as a generic
function into the IOMMU helper code.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: iommu@lists.linux-foundation.org
Cc: bhavna.sarathy@amd.com
Cc: robert.richter@amd.com
Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-26 15:43:05 +02:00
Ingo Molnar
bda307ed7b Merge branch 'linus' into x86/cleanups 2008-07-26 15:38:48 +02:00
Ingo Molnar
1503af6619 Merge branch 'linus' into x86/header-guards
Conflicts:

	include/asm-x86/gpio.h
	include/asm-x86/ide.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-26 15:30:40 +02:00
Chris McDermott
1ca9fda4b2 x86: fix IBM Summit based systems' phys_cpu_present_map on 32-bit kernels
x86 kernels on IBM Summit based systems will only online 1 CPU because the
phys_cpu_present_map is not set up correctly. Patch below applied to
2.6.26-git10.

Signed-off-by: Chris McDermott <lcm@linux.vnet.ibm.com>
Tested-by: Tim Pepper <lnxninga@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-26 14:58:39 +02:00
Ingo Molnar
071375bc76 x86, RDC321x: remove gpio.h complications
Remove the include/asm-x86/gpio.h specials, just use the generic
version.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-26 14:50:51 +02:00
Ingo Molnar
36ac26171a crashdump: fix undefined reference to `elfcorehdr_addr'
fix build bug introduced by 95b68dec0d "calgary iommu: use the first
kernels TCE tables in kdump":

arch/x86/kernel/built-in.o: In function `calgary_iommu_init':
(.init.text+0x8399): undefined reference to `elfcorehdr_addr'
arch/x86/kernel/built-in.o: In function `calgary_iommu_init':
(.init.text+0x856c): undefined reference to `elfcorehdr_addr'
arch/x86/kernel/built-in.o: In function `detect_calgary':
(.init.text+0x8c68): undefined reference to `elfcorehdr_addr'
arch/x86/kernel/built-in.o: In function `detect_calgary':
(.init.text+0x8d0c): undefined reference to `elfcorehdr_addr'

make elfcorehdr_addr a generally available symbol.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-26 11:26:23 +02:00
Ilpo Järvinen
ec34c702ca net: drop unused BUG_TRAP()
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-25 21:45:49 -07:00
Ilpo Järvinen
547b792cac net: convert BUG_TRAP to generic WARN_ON
Removes legacy reinvent-the-wheel type thing. The generic
machinery integrates much better to automated debugging aids
such as kerneloops.org (and others), and is unambiguous due to
better naming. Non-intuively BUG_TRAP() is actually equal to
WARN_ON() rather than BUG_ON() though some might actually be
promoted to BUG_ON() but I left that to future.

I could make at least one BUILD_BUG_ON conversion.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-25 21:43:18 -07:00
Grant Likely
284b018973 spi: Add OF binding support for SPI busses
This patch adds support for populating an SPI bus based on data in the
OF device tree.  This is useful for powerpc platforms which use the
device tree instead of discrete code for describing platform layout.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2008-07-25 22:34:40 -04:00
Grant Likely
dc87c98e8f spi: split up spi_new_device() to allow two stage registration.
spi_new_device() allocates and registers an spi device all in one swoop.
If the driver needs to add extra data to the spi_device before it is
registered, then this causes problems.  This is needed for OF device
tree support so that the SPI device tree helper can add a pointer to
the device node after the device is allocated, but before the device
is registered.  OF aware SPI devices can then retrieve data out of the
device node to populate a platform data structure.

This patch splits the allocation and registration portions of code out
of spi_new_device() and creates two new functions; spi_alloc_device()
and spi_register_device().  spi_new_device() is modified to use the new
functions for allocation and registration.  None of the existing users
of spi_new_device() should be affected by this change.

Drivers using the new API can forego the use of spi_board_info
structure to describe the device layout and populate data into the
spi_device structure directly.

This change is in preparation for adding an OF device tree parser to
generate spi_devices based on data in the device tree.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
2008-07-25 22:34:29 -04:00
Grant Likely
3f07af494d of: adapt of_find_i2c_driver() to be usable by SPI also
SPI has a similar problem as I2C in that it needs to determine an
appropriate modalias value for each device node.  This patch adapts
the of_i2c of_find_i2c_driver() function to be usable by of_spi also.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2008-07-25 22:25:13 -04:00
Linus Torvalds
1ff8419871 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  ipsec: ipcomp - Decompress into frags if necessary
  ipsec: ipcomp - Merge IPComp implementations
  pkt_sched: Fix locking in shutdown_scheduler_queue()
2008-07-25 17:40:16 -07:00
Linus Torvalds
7b35fa86e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc: Wire up new system calls.
2008-07-25 17:33:34 -07:00
Linus Torvalds
29ca069cc6 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] Wire up new system calls
2008-07-25 17:29:03 -07:00
Harvey Harrison
b4615e69b6 sys_paccept definition missing __user annotation
Introduced by commit aaca0bdca5 ("flag
parameters: paccept"):

  net/socket.c:1515:17: error: symbol 'sys_paccept' redeclared with different type (originally declared at include/linux/syscalls.h:413) - incompatible argument 4 (different address spaces)

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 17:28:49 -07:00
David S. Miller
f1373da87b sparc: Wire up new system calls.
This wires up the recently added Wire up signalfd4, eventfd2,
epoll_create1, dup3, pipe2, and inotify_init1 system calls.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-25 15:18:31 -07:00
Jan Beulich
fb5e2b3797 vmlinux.lds: move __attribute__((__cold__)) functions back into final .text section
Due to the addition of __attribute__((__cold__)) to a few symbols
without adjusting the linker scripts, those symbols currently may end
up outside the [_stext,_etext) range, as they get placed in
.text.unlikely by (at least) gcc 4.3.0. This may confuse code not only
outside of the kernel, symbol_put_addr()'s BUG() could also trigger.
Hence we need to add .text.unlikely (and for future uses of
__attribute__((__hot__)) also .text.hot) to the TEXT_TEXT() macro.

Issue observed by Lukas Lipavsky.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Tested-by: Lukas Lipavsky <llipavsky@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-25 22:12:37 +02:00
Sam Ravnborg
88181ec30f kbuild: only one call for include/ in make headers_*
Move it to the top-level file to decide if we install/check
the generic headers or the arch specific headers.

This revealed a long standing bug where "make headers_check_all"
relied on the files in asm/ for the current architecture.
So make headers_check_all is now broken by this commit.

In addition:

o add a simpler way to detect if an arch support
  exporting header files.

o add 'set -e;' so we error out early if
  make headers_check_all fails.

o add sparc64 and cris to arch we do not process
  in make headers_*_all because:

    sparc64 - use sparc to export headers
    cris    - is know seriously broken

Includes suggestions from: David Woodhouse
<dwmw2@infradead.org>.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: David Woodhouse <dwmw2@infradead.org>
2008-07-25 22:11:44 +02:00
Linus Torvalds
ff5d48a6d1 Merge git://git.infradead.org/embedded-2.6
* git://git.infradead.org/embedded-2.6:
  Make console charset translation optional
2008-07-25 12:02:08 -07:00
Linus Torvalds
762b8291be Merge git://git.infradead.org/~dwmw2/random-2.6
* git://git.infradead.org/~dwmw2/random-2.6:
  remove dummy asm/kvm.h files
  firmware: create firmware binaries during 'make modules'.
2008-07-25 12:01:37 -07:00
Johannes Weiner
c6af5e9f8a bootmem: Move node allocation macros back to !HAVE_ARCH_BOOTMEM_NODE
These got unintentionally moved, put them back as x86 provides its own
versions.

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 11:36:44 -07:00
Adrian Bunk
7dcf2a9fce remove dummy asm/kvm.h files
This patch removes the dummy asm/kvm.h files on architectures not (yet)
supporting KVM and uses the same conditional headers installation as
already used for a.out.h .

Also removed are superfluous install rules in the s390 and x86 Kbuild
files (they are already in Kbuild.asm).

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-07-25 14:35:50 -04:00
Linus Torvalds
5047887caf Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (34 commits)
  powerpc: Wireup new syscalls
  Move update_mmu_cache() declaration from tlbflush.h to pgtable.h
  powerpc/pseries: Remove kmalloc call in handling writes to lparcfg
  powerpc/pseries: Update arch vector to indicate support for CMO
  ibmvfc: Add support for collaborative memory overcommit
  ibmvscsi: driver enablement for CMO
  ibmveth: enable driver for CMO
  ibmveth: Automatically enable larger rx buffer pools for larger mtu
  powerpc/pseries: Verify CMO memory entitlement updates with virtual I/O
  powerpc/pseries: vio bus support for CMO
  powerpc/pseries: iommu enablement for CMO
  powerpc/pseries: Add CMO paging statistics
  powerpc/pseries: Add collaborative memory manager
  powerpc/pseries: Utilities to set firmware page state
  powerpc/pseries: Enable CMO feature during platform setup
  powerpc/pseries: Split retrieval of processor entitlement data into a helper routine
  powerpc/pseries: Add memory entitlement capabilities to /proc/ppc64/lparcfg
  powerpc/pseries: Split processor entitlement retrieval and gathering to helper routines
  powerpc/pseries: Remove extraneous error reporting for hcall failures in lparcfg
  powerpc: Fix compile error with binutils 2.15
  ...

Fixed up conflict in arch/powerpc/platforms/52xx/Kconfig manually.
2008-07-25 11:08:17 -07:00
Linus Torvalds
996abf053e Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubi-2.6
* 'linux-next' of git://git.infradead.org/~dedekind/ubi-2.6: (22 commits)
  UBI: always start the background thread
  UBI: fix gcc warning
  UBI: remove pre-sqnum images support
  UBI: fix kernel-doc errors and warnings
  UBI: fix checkpatch.pl errors and warnings
  UBI: bugfix - do not torture PEB needlessly
  UBI: rework scrubbing messages
  UBI: implement multiple volumes rename
  UBI: fix and re-work debugging stuff
  UBI: amend commentaries
  UBI: fix error message
  UBI: improve mkvol request validation
  UBI: add ubi_sync() interface
  UBI: fix 64-bit calculations
  UBI: fix LEB locking
  UBI: fix memory leak on error path
  UBI: do not forget to free internal volumes
  UBI: fix memory leak
  UBI: avoid unnecessary division operations
  UBI: fix buffer padding
  ...
2008-07-25 11:02:17 -07:00
Arthur Jones
8f421c595a edac: i5100 new intel chipset driver
Preliminary support for the Intel 5100 MCH.  CE and UE errors are reported
along with the current DIMM label information and other memory parameters.

Reasons why this is preliminary:

1) This chip has 2 independent memory controllers which, for best
   perforance, use interleaved accesses to the DDR2 memory.  This
   architecture does not map very well to the current edac data structures
   which depend on symmetric channel access to the interleaved data.
   Without core changes, the best I could do for now is to map both memory
   controllers to different csrows (first all ranks of controller 0, then
   all ranks of controller 1).  Someone much more familiar with the edac
   core than I will probably need to come up with a more general data
   structure to handle the interleaving and de-interleaving of the two
   memory controllers.

2) I have not yet tackled the de-interleaving of the rank/controller
   address space into the physical address space of the CPU.  There is
   nothing fundamentally missing, it is just ending up to be a lot of
   code, and I'd rather keep it separate for now, esp since it doesn't
   work yet...

3) The code depends on a particular i5100 chip select to DIMM mainboard
   chip select mapping.  This mapping seems obvious to me in order to
   support dual and single ranked memory, but it is not unique and DIMM
   labels could be wrong on other mainboards.  There is no way to query
   this mapping that I know of.

4) The code requires that the i5100 is in 32GB mode.  Only 4 ranks per
   controller, 2 ranks per DIMM are supported.  I do not have hardware
   (nor do I expect to have hardware anytime soon) for the 48GB (6 ranks
   per controller) mode.

5) The serial presence detect code should be broken out into a "real"
   i2c driver so that decode-dimms.pl can work.

Signed-off-by: Arthur Jones <ajones@riverbed.com>
Signed-off-by: Doug Thompson <dougthompson@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:48 -07:00
Miklos Szeredi
33670fa296 fuse: nfs export special lookups
Implement the get_parent export operation by sending a LOOKUP request with
".." as the name.

Implement looking up an inode by node ID after it has been evicted from
the cache.  This is done by seding a LOOKUP request with "." as the name
(for all file types, not just directories).

The filesystem can set the FUSE_EXPORT_SUPPORT flag in the INIT reply, to
indicate that it supports these special lookups.

Thanks to John Muir for the original implementation of this feature.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: David Teigland <teigland@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:48 -07:00
Miklos Szeredi
bde74e4bc6 locks: add special return value for asynchronous locks
Use a special error value FILE_LOCK_DEFERRED to mean that a locking
operation returned asynchronously.  This is returned by

  posix_lock_file() for sleeping locks to mean that the lock has been
  queued on the block list, and will be woken up when it might become
  available and needs to be retried (either fl_lmops->fl_notify() is
  called or fl_wait is woken up).

  f_op->lock() to mean either the above, or that the filesystem will
  call back with fl_lmops->fl_grant() when the result of the locking
  operation is known.  The filesystem can do this for sleeping as well
  as non-sleeping locks.

This is to make sure, that return values of -EAGAIN and -EINPROGRESS by
filesystems are not mistaken to mean an asynchronous locking.

This also makes error handling in fs/locks.c and lockd/svclock.c slightly
cleaner.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: David Teigland <teigland@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:47 -07:00
Keika Kobayashi
016ae219b9 per-task-delay-accounting: update taskstats for memory reclaim delay
Add members for memory reclaim delay to taskstats, and accumulate them in
__delayacct_add_tsk() .

Signed-off-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp>
Cc: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:47 -07:00
Keika Kobayashi
873b477177 per-task-delay-accounting: add memory reclaim delay
Sometimes, application responses become bad under heavy memory load.
Applications take a bit time to reclaim memory.  The statistics, how long
memory reclaim takes, will be useful to measure memory usage.

This patch adds accounting memory reclaim to per-task-delay-accounting for
accounting the time of do_try_to_free_pages().

<i.e>

- When System is under low memory load,
  memory reclaim may not occur.

$ free
             total       used       free     shared    buffers     cached
Mem:       8197800    1577300    6620500          0       4808    1516724
-/+ buffers/cache:      55768    8142032
Swap:     16386292          0   16386292

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0      0 5069748  10612 3014060    0    0     0     0    3   26  0  0 100  0
 0  0      0 5069748  10612 3014060    0    0     0     0    4   22  0  0 100  0
 0  0      0 5069748  10612 3014060    0    0     0     0    3   18  0  0 100  0

Measure the time of tar command.

$ ls -s test.dat
1501472 test.dat

$ time tar cvf test.tar test.dat
real    0m13.388s
user    0m0.116s
sys     0m5.304s

$ ./delayget -d -p <pid>
CPU             count     real total  virtual total    delay total
                  428     5528345500     5477116080       62749891
IO              count    delay total
                  338     8078977189
SWAP            count    delay total
                    0              0
RECLAIM         count    delay total
                    0              0

- When system is under heavy memory load
  memory reclaim may occur.

$ vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0 7159032  49724   1812   3012    0    0     0     0    3   24  0  0 100  0
 0  0 7159032  49724   1812   3012    0    0     0     0    4   24  0  0 100  0
 0  0 7159032  49848   1812   3012    0    0     0     0    3   22  0  0 100  0

In this case, one process uses more 8G memory
by execution of malloc() and memset().

$ time tar cvf test.tar test.dat
real    1m38.563s        <-  increased by 85 sec
user    0m0.140s
sys     0m7.060s

$ ./delayget -d -p <pid>
CPU             count     real total  virtual total    delay total
                 9021     7140446250     7315277975      923201824
IO              count    delay total
                 8965    90466349669
SWAP            count    delay total
                    3       21036367
RECLAIM         count    delay total
                  740    61011951153

In the later case, the value of RECLAIM is increasing.
So, taskstats can show how much memory reclaim influences TAT.

Signed-off-by: Keika Kobayashi <kobayashi.kk@ncos.nec.co.jp>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujistu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:47 -07:00
Andrea Righi
297c5d9263 task IO accounting: provide distinct tgid/tid I/O statistics
Report per-thread I/O statistics in /proc/pid/task/tid/io and aggregate
parent I/O statistics in /proc/pid/io.  This approach follows the same
model used to account per-process and per-thread CPU times.

As a practial application, this allows for example to quickly find the top
I/O consumer when a process spawns many child threads that perform the
actual I/O work, because the aggregated I/O statistics can always be found
in /proc/pid/io.

[ Oleg Nesterov points out that we should check that the task is still
  alive before we iterate over the threads, but also says that we can do
  that fixup on top of this later.  - Linus ]

Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: Matt Heaton <matt@hostmonster.com>
Cc: Shailabh Nagar <nagar@watson.ibm.com>
Acked-by-with-comments: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:47 -07:00
Pavel Emelyanov
0b6b030fc3 bsdacct: switch from global bsd_acct_struct instance to per-pidns one
Allocate the structure on the first call to sys_acct().  After this each
namespace, that ordered the accounting, will live with this structure till
its own death.

Two notes
- routines, that close the accounting on fs umount time use
  the init_pid_ns's acct by now;
- accounting routine accounts to dying task's namespace
  (also by now).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:47 -07:00
Pavel Emelyanov
20fad13ac6 pidns: add the struct bsd_acct_struct pointer on pid_namespace struct
All the bsdacct-related info will be stored in the area, pointer by this
one.

It will be NULL automatically for all new namespaces.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:46 -07:00
Jonathan Lim
49b5cf3472 accounting: account for user time when updating memory integrals
Adapt acct_update_integrals() to include user time when calculating the time
difference.  The units of acct_rss_mem1 and acct_vm_mem1 are also changed from
pages-jiffies to pages-usecs to avoid calling jiffies_to_usecs() in
xacct_add_tsk() which might overflow.

Signed-off-by: Jonathan Lim <jlim@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:46 -07:00
Pavel Emelyanov
dbda0de526 pidns: remove find_task_by_pid, unused for a long time
It seems to me that it was a mistake marking this function as deprecated
and scheduling it for removal, rather than resolutely removing it after
the last caller's death.

Anyway - better late, then never.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:45 -07:00
Pavel Emelyanov
e49859e71e pidns: remove now unused find_pid function.
This one had the only users so far - the kill_proc, which is removed, so
drop this (invalid in namespaced world) call too.

And of course - erase all references on it from comments.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:45 -07:00
Pavel Emelyanov
19b0cfcca4 pidns: remove now unused kill_proc function
This function operated on a pid_t to kill a task, which is no longer valid
in a containerized system.

It has finally lost all its users and we can safely remove it from the
tree.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:45 -07:00
Richard Kennedy
33166b1ffc shrink struct pid by removing padding on 64 bit builds
When struct pid is built on a 64 bit platform gcc has to insert padding to
maintain the correct alignment, by simply reordering its members the
memory usage shrinks from 88 bytes to 80.

I've successfully run with this patch on my desktop AMD64 machine.

There are no significant kernel size changes to a default config.X86_64
on the latest git v2.6.26-rc1

   text    data     bss     dec     hex filename
5404828  976760  734280 7115868  6c945c vmlinux
5404811  976760  734280 7115851  6c944b vmlinux.pid-patch

Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:45 -07:00
Adrian Bunk
3ae4eed34b proper pid{hash,map}_init() prototypes
This patch adds proper prototypes for pid{hash,map}_init() in
include/linux/pid_namespace.h

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:45 -07:00
Alexey Dobriyan
881adb8535 proc: always do ->release
Current two-stage scheme of removing PDE emphasizes one bug in proc:

		open
				rmmod
				remove_proc_entry
		close

->release won't be called because ->proc_fops were cleared.  In simple
cases it's small memory leak.

For every ->open, ->release has to be done.  List of openers is introduced
which is traversed at remove_proc_entry() if neeeded.

Discussions with Al long ago (sigh).

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:44 -07:00
Adrian Bunk
6e644c3126 move proc_kmsg_operations to fs/proc/internal.h
This patch moves the extern of struct proc_kmsg_operations to
fs/proc/internal.h and adds an #include "internal.h" to fs/proc/kmsg.c
so that the latter sees the former.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:44 -07:00
Abdel Benamrouche
d805dda412 fs/partition/check.c: fix return value warning
fs/partitions/check.c:381: warning: ignoring return value of ___device_add___,
  declared with attribute warn_unused_result

[akpm@linux-foundation.org: multiple-return-statements-per-function are evil]
Signed-off-by: Abdel Benamrouche <draconux@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:44 -07:00
Edgar E. Iglesias
79885b2277 elf: use ELF_CORE_EFLAGS for kcore ELF header flags
ELF_CORE_EFLAGS is already used by the binfmt_elf coredumper to set correct
arch specific ELF header flags on coredumps.  Use it for kcore dumps as well.
At the moment, this affects the CRIS and the H8300 arch.

Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:42 -07:00
Nadia Derbey
9eefe520c8 ipc: do not use a negative value to re-enable msgmni automatic recomputing
This patch proposes an alternative to the "magical
positive-versus-negative number trick" Andrew complained about last week
in http://lkml.org/lkml/2008/6/24/418.

This had been introduced with the patches that scale msgmni to the amount
of lowmem.  With these patches, msgmni has a registered notification
routine that recomputes msgmni value upon memory add/remove or ipc
namespace creation/ removal.

When msgmni is changed from user space (i.e.  value written to the proc
file), that notification routine is unregistered, and the way to make it
registered back is to write a negative value into the proc file.  This is
the "magical positive-versus-negative number trick".

To fix this, a new proc file is introduced: /proc/sys/kernel/auto_msgmni.
This file acts as ON/OFF for msgmni automatic recomputing.

With this patch, the process is the following:
1) kernel boots in "automatic recomputing mode"
   /proc/sys/kernel/msgmni contains the value that has been computed (depends
                           on lowmem)
   /proc/sys/kernel/automatic_msgmni contains "1"

2) echo <val> > /proc/sys/kernel/msgmni
   . sets msg_ctlmni to <val>
   . de-activates automatic recomputing (i.e. if, say, some memory is added
     msgmni won't be recomputed anymore)
   . /proc/sys/kernel/automatic_msgmni now contains "0"

3) echo "0" > /proc/sys/kernel/automatic_msgmni
   . de-activates msgmni automatic recomputing
     this has the same effect as 2) except that msg_ctlmni's value stays
     blocked at its current value)

3) echo "1" > /proc/sys/kernel/automatic_msgmni
   . recomputes msgmni's value based on the current available memory size
     and number of ipc namespaces
   . re-activates automatic recomputing for msgmni.

Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: Solofo Ramangalahy <Solofo.Ramangalahy@bull.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:42 -07:00
Manfred Spraul
380af1b33b ipc/sem.c: rewrite undo list locking
The attached patch:
- reverses the locking order of ulp->lock and sem_lock:
  Previously, it was first ulp->lock, then inside sem_lock.
  Now it's the other way around.
- converts the undo structure to rcu.

Benefits:
- With the old locking order, IPC_RMID could not kfree the undo structures.
  The stale entries remained in the linked lists and were released later.
- The patch fixes a a race in semtimedop(): if both IPC_RMID and a semget() that
  recreates exactly the same id happen between find_alloc_undo() and sem_lock,
  then semtimedop() would access already kfree'd memory.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reviewed-by: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:42 -07:00
Manfred Spraul
a1193f8ec0 ipc/sem.c: convert sem_array.sem_pending to struct list_head
sem_array.sem_pending is a double linked list, the attached patch converts
it to struct list_head.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reviewed-by: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:42 -07:00
Manfred Spraul
2c0c29d414 ipc/sem.c: remove unused entries from struct sem_queue
sem_queue.sma and sem_queue.id were never used, the attached patch removes
them.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reviewed-by: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:42 -07:00
Manfred Spraul
4daa28f6d8 ipc/sem.c: convert undo structures to struct list_head
The undo structures contain two linked lists, the attached patch replaces
them with generic struct list_head lists.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:42 -07:00
Nadia Derbey
f9c46d6ea5 idr: make idr_find rcu-safe
Make idr_find rcu-safe: it can now be called inside an rcu_read critical
section.

Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
Reviewed-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Jim Houston <jim.houston@comcast.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:42 -07:00
Nadia Derbey
944ca05c7b idr: error checking factorization
Do some code factorization in the return code analysis.

Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Jim Houston <jim.houston@comcast.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:41 -07:00
Nadia Derbey
2027d1abc2 idr: change the idr structure
After scalability problems have been detected when using the sysV ipcs, I have
proposed to use an RCU based implementation of the IDR api instead (see
threads http://lkml.org/lkml/2008/4/11/212 and
http://lkml.org/lkml/2008/4/29/295).

This resulted in many people asking to convert the idr API and make it rcu
safe (because most of the code was duplicated and thus unmaintanable and
unreviewable).

So here is a first attempt.

The important change wrt to the idr API itself is during idr removes: idr
layers are freed after a grace period, instead of being moved to the free
list.

The important change wrt to ipcs, is that idr_find() can now be called
locklessly inside a rcu read critical section.

Here are the results I've got for the pmsg test sent by Manfred:

   2.6.25-rc3-mm1   2.6.25-rc3-mm1+   2.6.25-mm1   Patched 2.6.25-mm1
1         1168441           1064021       876000               947488
2         1094264            921059      1549592              1730685
3         2082520           1738165      1694370              2324880
4         2079929           1695521       404553              2400408
5         2898758            406566       391283              3246580
6         2921417            261275       263249              3752148
7         3308761            126056       191742              4243142
8         3329456            100129       141722              4275780

1st column: stock 2.6.25-rc3-mm1
2nd column: 2.6.25-rc3-mm1 + ipc patches (store ipcs into idrs)
3nd column: stock 2.6.25-mm1
4th column: 2.6.25-mm1 + this pacth series.

This patch:

Add an rcu_head to the idr_layer structure in order to free it after a grace
period.

Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
Reviewed-by: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Jim Houston <jim.houston@comcast.net>
Cc: Pierre Peiffer <peifferp@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:41 -07:00
Chandru
95b68dec0d calgary iommu: use the first kernels TCE tables in kdump
kdump kernel fails to boot with calgary iommu and aacraid driver on a x366
box.  The ongoing dma's of aacraid from the first kernel continue to exist
until the driver is loaded in the kdump kernel.  Calgary is initialized
prior to aacraid and creation of new tce tables causes wrong dma's to
occur.  Here we try to get the tce tables of the first kernel in kdump
kernel and use them.  While in the kdump kernel we do not allocate new tce
tables but instead read the base address register contents of calgary
iommu and use the tables that the registers point to.  With these changes
the kdump kernel and hence aacraid now boots normally.

Signed-off-by: Chandru Siddalingappa <chandru@in.ibm.com>
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:41 -07:00
Oleg Nesterov
3da1c84c00 workqueues: make get_online_cpus() useable for work->func()
workqueue_cpu_callback(CPU_DEAD) flushes cwq->thread under
cpu_maps_update_begin().  This means that the multithreaded workqueues
can't use get_online_cpus() due to the possible deadlock, very bad and
very old problem.

Introduce the new state, CPU_POST_DEAD, which is called after
cpu_hotplug_done() but before cpu_maps_update_done().

Change workqueue_cpu_callback() to use CPU_POST_DEAD instead of CPU_DEAD.
This means that create/destroy functions can't rely on get_online_cpus()
any longer and should take cpu_add_remove_lock instead.

[akpm@linux-foundation.org: fix CONFIG_SMP=n]
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Gautham R Shenoy <ego@in.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Paul Menage <menage@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:40 -07:00
Oleg Nesterov
db70089722 workqueues: implement flush_work()
Most of users of flush_workqueue() can be changed to use cancel_work_sync(),
but sometimes we really need to wait for the completion and cancelling is not
an option. schedule_on_each_cpu() is good example.

Add the new helper, flush_work(work), which waits for the completion of the
specific work_struct. More precisely, it "flushes" the result of of the last
queue_work() which is visible to the caller.

For example, this code

	queue_work(wq, work);
	/* WINDOW */
	queue_work(wq, work);

	flush_work(work);

doesn't necessary work "as expected". What can happen in the WINDOW above is

	- wq starts the execution of work->func()

	- the caller migrates to another CPU

now, after the 2nd queue_work() this work is active on the previous CPU, and
at the same time it is queued on another. In this case flush_work(work) may
return before the first work->func() completes.

It is trivial to add another helper

	int flush_work_sync(struct work_struct *work)
	{
		return flush_work(work) || wait_on_work(work);
	}

which works "more correctly", but it has to iterate over all CPUs and thus
it much slower than flush_work().

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Max Krasnyansky <maxk@qualcomm.com>
Acked-by: Jarek Poplawski <jarkao2@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:40 -07:00
Oleg Nesterov
a94e2d408e coredump: kill mm->core_done
Now that we have core_state->dumper list we can use it to wake up the
sub-threads waiting for the coredump completion.

This uglifies the code and .text grows by 47 bytes, but otoh mm_struct
lessens by sizeof(struct completion).  Also, with this change we can
decouple exit_mm() from the coredumping code.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:40 -07:00
Oleg Nesterov
b564daf806 coredump: construct the list of coredumping threads at startup time
binfmt->core_dump() has to iterate over the all threads in system in order
to find the coredumping threads and construct the list using the
GFP_ATOMIC allocations.

With this patch each thread allocates the list node on exit_mm()'s stack and
adds itself to the list.

This allows us to do further changes:

	- simplify ->core_dump()

	- change exit_mm() to clear ->mm first, then wait for ->core_done.
	  this makes the coredumping process visible to oom_kill

	- kill mm->core_done

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:40 -07:00
Oleg Nesterov
c5f1cc8c18 coredump: turn core_state->nr_threads into atomic_t
Turn core_state->nr_threads into atomic_t and kill now unneeded
down_write(&mm->mmap_sem) in exit_mm().

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:39 -07:00
Oleg Nesterov
999d9fc167 coredump: move mm->core_waiters into struct core_state
Move mm->core_waiters into "struct core_state" allocated on stack.  This
shrinks mm_struct a little bit and allows further changes.

This patch mostly does s/core_waiters/core_state.  The only essential
change is that coredump_wait() must clear mm->core_state before return.

The coredump_wait()'s path is uglified and .text grows by 30 bytes, this
is fixed by the next patch.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:39 -07:00
Oleg Nesterov
32ecb1f26d coredump: turn mm->core_startup_done into the pointer to struct core_state
mm->core_startup_done points to "struct completion startup_done" allocated
on the coredump_wait()'s stack.  Introduce the new structure, core_state,
which holds this "struct completion".  This way we can add more info
visible to the threads participating in coredump without enlarging
mm_struct.

No changes in affected .o files.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:39 -07:00
Oleg Nesterov
246bb0b1de kill PF_BORROWED_MM in favour of PF_KTHREAD
Kill PF_BORROWED_MM.  Change use_mm/unuse_mm to not play with ->flags, and
do s/PF_BORROWED_MM/PF_KTHREAD/ for a couple of other users.

No functional changes yet.  But this allows us to do further
fixes/cleanups.

oom_kill/ptrace/etc often check "p->mm != NULL" to filter out the
kthreads, this is wrong because of use_mm().  The problem with
PF_BORROWED_MM is that we need task_lock() to avoid races.  With this
patch we can check PF_KTHREAD directly, or use a simple lockless helper:

	/* The result must not be dereferenced !!! */
	struct mm_struct *__get_task_mm(struct task_struct *tsk)
	{
		if (tsk->flags & PF_KTHREAD)
			return NULL;
		return tsk->mm;
	}

Note also ecard_task().  It runs with ->mm != NULL, but it's the kernel
thread without PF_BORROWED_MM.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:39 -07:00
Oleg Nesterov
7b34e4283c introduce PF_KTHREAD flag
Introduce the new PF_KTHREAD flag to mark the kernel threads.  It is set
by INIT_TASK() and copied to the forked childs (we could set it in
kthreadd() along with PF_NOFREEZE instead).

daemonize() was changed as well.  In that case testing of PF_KTHREAD is
racy, but daemonize() is hopeless anyway.

This flag is cleared in do_execve(), before search_binary_handler().
Probably not the best place, we can do this in exec_mmap() or in
start_thread(), or clear it along with PF_FORKNOEXEC.  But I think this
doesn't matter in practice, and if do_execve() fails kthread should die
soon.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:39 -07:00
Oleg Nesterov
364d3c13c1 ptrace: give more respect to SIGKILL
ptrace_stop() has some complicated checks to prevent the scheduling in the
TASK_TRACED state with the pending SIGKILL, but these checks are racy, and
they depend on arch_ptrace_stop_needed().

This patch assumes that the traced task should die asap if it was killed by
SIGKILL, in that case schedule()->signal_pending_state() has no reason to
ignore the TASK_WAKEKILL part of TASK_TRACED, and we can kill this nasty
special case.

Note: do_exit()->ptrace_notify() is special, the killed task can already
dequeue SIGKILL at this point. Another indication that fatal_signal_pending()
is not exactly right.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:39 -07:00
Adrian Bunk
f22ab814a2 include/asm/ptrace.h userspace headers cleanup
This patch contains the following cleanups for the asm/ptrace.h
userspace headers:

- include/asm-generic/Kbuild.asm already lists ptrace.h, remove
  the superfluous listings in the Kbuild files of the following
  architectures:
  - cris
  - frv
  - powerpc
  - x86
- don't expose function prototypes and macros to userspace:
  - arm
  - blackfin
  - cris
  - mn10300
  - parisc
- remove #ifdef CONFIG_'s around #define's:
  - blackfin
  - m68knommu
- sh: AFAIK __SH5__ should work in both kernel and userspace,
      no need to leak CONFIG_SUPERH64 to userspace
- xtensa: cosmetical change to remove empty
            #ifndef __ASSEMBLY__ #else #endif
          from the userspace headers

Not changed by this patch is the fact that the following architectures
have a different struct pt_regs depending on CONFIG_ variables:
- h8300
- m68knommu
- mips

This does not work in userspace.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: <linux-arch@vger.kernel.org>
Cc: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com>
Acked-by: Chris Zankel <chris@zankel.net>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:39 -07:00
KAMEZAWA Hiroyuki
12b9804419 res_counter: limit change support ebusy
Add an interface to set limit.  This is necessary to memory resource
controller because it shrinks usage at set limit.

Other controllers may not need this interface to shrink usage because
shrinking is not necessary or impossible.

Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:37 -07:00
KAMEZAWA Hiroyuki
c9b0ed5148 memcg: helper function for relcaim from shmem.
A new call, mem_cgroup_shrink_usage() is added for shmem handling and
relacing non-standard usage of mem_cgroup_charge/uncharge.

Now, shmem calls mem_cgroup_charge() just for reclaim some pages from
mem_cgroup.  In general, shmem is used by some process group and not for
global resource (like file caches).  So, it's reasonable to reclaim pages
from mem_cgroup where shmem is mainly used.

[hugh@veritas.com: shmem_getpage release page sooner]
[hugh@veritas.com: mem_cgroup_shrink_usage css_put]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: Paul Menage <menage@google.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:37 -07:00
KAMEZAWA Hiroyuki
69029cd550 memcg: remove refcnt from page_cgroup
memcg: performance improvements

Patch Description
 1/5 ... remove refcnt fron page_cgroup patch (shmem handling is fixed)
 2/5 ... swapcache handling patch
 3/5 ... add helper function for shmem's memory reclaim patch
 4/5 ... optimize by likely/unlikely ppatch
 5/5 ... remove redundunt check patch (shmem handling is fixed.)

Unix bench result.

== 2.6.26-rc2-mm1 + memory resource controller
Execl Throughput                           2915.4 lps   (29.6 secs, 3 samples)
C Compiler Throughput                      1019.3 lpm   (60.0 secs, 3 samples)
Shell Scripts (1 concurrent)               5796.0 lpm   (60.0 secs, 3 samples)
Shell Scripts (8 concurrent)               1097.7 lpm   (60.0 secs, 3 samples)
Shell Scripts (16 concurrent)               565.3 lpm   (60.0 secs, 3 samples)
File Read 1024 bufsize 2000 maxblocks    1022128.0 KBps  (30.0 secs, 3 samples)
File Write 1024 bufsize 2000 maxblocks   544057.0 KBps  (30.0 secs, 3 samples)
File Copy 1024 bufsize 2000 maxblocks    346481.0 KBps  (30.0 secs, 3 samples)
File Read 256 bufsize 500 maxblocks      319325.0 KBps  (30.0 secs, 3 samples)
File Write 256 bufsize 500 maxblocks     148788.0 KBps  (30.0 secs, 3 samples)
File Copy 256 bufsize 500 maxblocks       99051.0 KBps  (30.0 secs, 3 samples)
File Read 4096 bufsize 8000 maxblocks    2058917.0 KBps  (30.0 secs, 3 samples)
File Write 4096 bufsize 8000 maxblocks   1606109.0 KBps  (30.0 secs, 3 samples)
File Copy 4096 bufsize 8000 maxblocks    854789.0 KBps  (30.0 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places         126145.2 lpm   (30.0 secs, 3 samples)

                     INDEX VALUES
TEST                                        BASELINE     RESULT      INDEX

Execl Throughput                                43.0     2915.4      678.0
File Copy 1024 bufsize 2000 maxblocks         3960.0   346481.0      875.0
File Copy 256 bufsize 500 maxblocks           1655.0    99051.0      598.5
File Copy 4096 bufsize 8000 maxblocks         5800.0   854789.0     1473.8
Shell Scripts (8 concurrent)                     6.0     1097.7     1829.5
                                                                 =========
     FINAL SCORE                                                     991.3

== 2.6.26-rc2-mm1 + this set ==
Execl Throughput                           3012.9 lps   (29.9 secs, 3 samples)
C Compiler Throughput                       981.0 lpm   (60.0 secs, 3 samples)
Shell Scripts (1 concurrent)               5872.0 lpm   (60.0 secs, 3 samples)
Shell Scripts (8 concurrent)               1120.3 lpm   (60.0 secs, 3 samples)
Shell Scripts (16 concurrent)               578.0 lpm   (60.0 secs, 3 samples)
File Read 1024 bufsize 2000 maxblocks    1003993.0 KBps  (30.0 secs, 3 samples)
File Write 1024 bufsize 2000 maxblocks   550452.0 KBps  (30.0 secs, 3 samples)
File Copy 1024 bufsize 2000 maxblocks    347159.0 KBps  (30.0 secs, 3 samples)
File Read 256 bufsize 500 maxblocks      314644.0 KBps  (30.0 secs, 3 samples)
File Write 256 bufsize 500 maxblocks     151852.0 KBps  (30.0 secs, 3 samples)
File Copy 256 bufsize 500 maxblocks      101000.0 KBps  (30.0 secs, 3 samples)
File Read 4096 bufsize 8000 maxblocks    2033256.0 KBps  (30.0 secs, 3 samples)
File Write 4096 bufsize 8000 maxblocks   1611814.0 KBps  (30.0 secs, 3 samples)
File Copy 4096 bufsize 8000 maxblocks    847979.0 KBps  (30.0 secs, 3 samples)
Dc: sqrt(2) to 99 decimal places         128148.7 lpm   (30.0 secs, 3 samples)

                     INDEX VALUES
TEST                                        BASELINE     RESULT      INDEX

Execl Throughput                                43.0     3012.9      700.7
File Copy 1024 bufsize 2000 maxblocks         3960.0   347159.0      876.7
File Copy 256 bufsize 500 maxblocks           1655.0   101000.0      610.3
File Copy 4096 bufsize 8000 maxblocks         5800.0   847979.0     1462.0
Shell Scripts (8 concurrent)                     6.0     1120.3     1867.2
                                                                 =========
     FINAL SCORE                                                    1004.6

This patch:

Remove refcnt from page_cgroup().

After this,

 * A page is charged only when !page_mapped() && no page_cgroup is assigned.
	* Anon page is newly mapped.
	* File page is added to mapping->tree.

 * A page is uncharged only when
	* Anon page is fully unmapped.
	* File page is removed from LRU.

There is no change in behavior from user's view.

This patch also removes unnecessary calls in rmap.c which was used only for
refcnt mangement.

[akpm@linux-foundation.org: fix warning]
[hugh@veritas.com: fix shmem_unuse_inode charging]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: Paul Menage <menage@google.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:37 -07:00
KAMEZAWA Hiroyuki
e8589cc189 memcg: better migration handling
This patch changes page migration under memory controller to use a
different algorithm.  (thanks to Christoph for new idea.)

Before:
 - page_cgroup is migrated from an old page to a new page.
After:
 - a new page is accounted , no reuse of page_cgroup.

Pros:

 - We can avoid compliated lock depndencies and races in migration.

Cons:

 - new param to mem_cgroup_charge_common().

 - mem_cgroup_getref() is added for handling ref_cnt ping-pong.

This version simplifies complicated lock dependency in page migraiton
under memory resource controller.

  new refcnt sequence is following.

a mapped page:
  prepage_migration() ..... +1 to NEW page
  try_to_unmap()      ..... all refs to OLD page is gone.
  move_pages()        ..... +1 to NEW page if page cache.
  remap...            ..... all refs from *map* is added to NEW one.
  end_migration()     ..... -1 to New page.

  page's mapcount + (page_is_cache) refs are added to NEW one.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:37 -07:00
Serge E. Hallyn
e885dcde75 cgroup_clone: use pid of newly created task for new cgroup
cgroup_clone creates a new cgroup with the pid of the task.  This works
correctly for unshare, but for clone cgroup_clone is called from
copy_namespaces inside copy_process, which happens before the new pid is
created.  As a result, the new cgroup was created with current's pid.
This patch:

	1. Moves the call inside copy_process to after the new pid
	   is created
	2. Passes the struct pid into ns_cgroup_clone (as it is not
	   yet attached to the task)
	3. Passes a name from ns_cgroup_clone() into cgroup_clone()
	   so as to keep cgroup_clone() itself simpler
	4. Uses pid_vnr() to get the process id value, so that the
	   pid used to name the new cgroup is always the pid as it
	   would be known to the task which did the cloning or
	   unsharing.  I think that is the most intuitive thing to
	   do.  This way, task t1 does clone(CLONE_NEWPID) to get
	   t2, which does clone(CLONE_NEWPID) to get t3, then the
	   cgroup for t3 will be named for the pid by which t2 knows
	   t3.

(Thanks to Dan Smith for finding the main bug)

Changelog:
	June 11: Incorporate Paul Menage's feedback:  don't pass
	         NULL to ns_cgroup_clone from unshare, and reduce
		 patch size by using 'nodename' in cgroup_clone.
	June 10: Original version

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge Hallyn <serge@us.ibm.com>
Acked-by: Paul Menage <menage@google.com>
Tested-by: Dan Smith <danms@us.ibm.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:37 -07:00
Paul Menage
856c13aa1f cgroup files: convert res_counter_write() to be a cgroups write_string() handler
Currently res_counter_write() is a raw file handler even though it's
ultimately taking a number, since in some cases it wants to
pre-process the string when converting it to a number.

This patch converts res_counter_write() from a raw file handler to a
write_string() handler; this allows some of the boilerplate
copying/locking/checking to be removed, and simplies the cleanup path,
since these functions are now performed by the cgroups framework.

[lizf@cn.fujitsu.com: build fix]
Signed-off-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:36 -07:00
Paul Menage
84eea84288 cgroups: misc cleanups to write_string patchset
This patch contains cleanups suggested by reviewers for the recent
write_string() patchset:

- pair cgroup_lock_live_group() with cgroup_unlock() in cgroup.c for
  clarity, rather than directly unlocking cgroup_mutex.

- make the return type of cgroup_lock_live_group() a bool

- use a #define'd constant for the local buffer size in read/write functions

Signed-off-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:35 -07:00
Paul Menage
e788e066c6 cgroup files: move the release_agent file to use typed handlers
Adds cgroup_release_agent_write() and cgroup_release_agent_show()
methods to handle writing/reading the path to a cgroup hierarchy's
release agent. As a result, cgroup_common_file_read() is now unnecessary.

As part of the change, a previously-tolerated race in
cgroup_release_agent() is avoided by copying the current
release_agent_path prior to calling call_usermode_helper().

Signed-off-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:35 -07:00
Paul Menage
db3b14978a cgroup files: add write_string cgroup control file method
This patch adds a write_string() method for cgroups control files. The
semantics are that a buffer is copied from userspace to kernelspace
and the handler function invoked on that buffer.  The buffer is
guaranteed to be nul-terminated, and no longer than max_write_len
(defaulting to 64 bytes if unspecified). Later patches will convert
existing raw file write handlers in control group subsystems to use
this method.

Signed-off-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Balbir Singh <balbir@in.ibm.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:35 -07:00
Paul Menage
ce16b49d37 cgroup files: clean up whitespace in struct cftype
This patch removes some extraneous spaces from method declarations in
struct cftype, to fit in with conventional kernel style.

Signed-off-by: Paul Menage <menage@google.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:35 -07:00
Pavel Emelyanov
f2992db2a4 Mark res_counter_charge(_locked) with __must_check
Ignoring their return values may result in counter underflow in the future -
when the value charged will be uncharged (or in "leaks" - when the value is
not uncharged).

This also prevents from using charging routines to decrement the
counter value (i.e. uncharge it) ;)

(Current code works OK with res_counter, however :) )

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Paul Menage <menage@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:35 -07:00
Jan Kara
657d3bfa98 quota: implement sending information via netlink about user below quota
Sometimes it may be useful for userspace to know (e.g.  for some hosting
guys) that some user stopped exceeding his hardlimit or softlimit in
quotas.  Implement sending of such events to userspace via quota netlink
protocol so that they don't have to poll for such events.  Based on idea
and initial implementation by Vladislav Bogdanov.

Cc: Vladislav Bogdanov <slava@nsys.by>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:35 -07:00
Jan Kara
03b063436c quota: convert macros to inline functions
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:35 -07:00
Jan Kara
74abb9890d quota: move function-macros from quota.h to quotaops.h
Move declarations of some macros, which should be in fact functions to
quotaops.h.  This way they can be later converted to inline functions
because we can now use declarations from quota.h.  Also add necessary
includes of quotaops.h to a few files.

[akpm@linux-foundation.org: fix JFS build]
[akpm@linux-foundation.org: fix UFS build]
[vegard.nossum@gmail.com: fix QUOTA=n build]
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Arjen Pool <arjenpool@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:35 -07:00
Jan Kara
02a55ca871 quota: cleanup loop in sync_dquots()
Make loop in sync_dquots() checking whether there's something to write
more readable, remove useless variable and macro info_any_dirty() which
is used only in this place.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: "Vegard Nossum" <vegard.nossum@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:35 -07:00
Jan Kara
b85f4b87a5 quota: rename quota functions from upper case, make bigger ones non-inline
Cleanup quotaops.h: Rename functions from uppercase to lowercase (and
define backward compatibility macros), move larger functions to dquot.c
and make them non-inline.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:35 -07:00
Joe Peterson
b271e067c8 fatfs: add UTC timestamp option
Provide a new mount option ("tz=UTC") for DOS (vfat/msdos) filesystems,
allowing timestamps to be in coordinated universal time (UTC) rather than
local time in applications where doing this is advantageous.

In particular, portable devices that use fat/vfat (such as digital
cameras) can benefit from using UTC in their internal clocks, thus
avoiding daylight saving time errors and general time ambiguity issues.
The user of the device does not have to worry about changing the time when
moving from place or when daylight saving changes.

The new mount option, when set, disables the counter-adjustment that Linux
currently makes to FAT timestamp info in anticipation of the normal
userspace time zone correction.  When used in this new mode, all daylight
saving time and time zone handling is done in userspace as is normal for
many other filesystems (like ext3).  The default mode, which remains
unchanged, is still appropriate when mounting volumes written in Windows
(because of its use of local time).

I originally based this patch on one submitted last year by Paul Collins,
but I updated it to work with current source and changed variable/option
naming.  Ogawa Hirofumi (who maintains these filesystems) and I discussed
this patch at length on lkml, and he suggested using the option name in
the attached version of the patch.  Barry Bouwsma pointed out a good
addition to the patch as well.

Signed-off-by: Joe Peterson <joe@skyrush.com>
Signed-off-by: Paul Collins <paul@ondioline.org>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Barry Bouwsma <free_beer_for_all@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:34 -07:00
Adrian Bunk
e8938a62a8 remove unused #include <linux/dirent.h>'s
Remove some unused #include <linux/dirent.h>'s.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:34 -07:00
Adrian Bunk
cf6ae8b50e remove the in-kernel struct dirent{,64}
The kernel struct dirent{,64} were different from the ones in
userspace.

Even worse, we exported the kernel ones to userspace.

But after the fat usages are fixed we can remove the conflicting
kernel versions.

Reviewed-by: H. Peter Anvin <hpa@kernel.org>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:34 -07:00
Rene Scharfe
7557bc66be msdos fs: remove unsettable atari option
It has been impossible to set the option 'atari' of the MSDOS filesystem
for several years.  Since nobody seems to have missed it, let's remove its
remains.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:34 -07:00
OGAWA Hirofumi
4596c8aaf9 fat: fix VFAT_IOCTL_READDIR_xxx and cleanup for userland
"struct dirent" is a kernel type here, but is a **different type** in
userspace!  This means both the structure and the IOCTL number is wrong!

So, this adds new "struct __fat_dirent" to generate correct IOCTL number.
And kernel stuff moves to under __KERNEL__.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:34 -07:00
Jeff Mahoney
90415deac7 reiserfs: convert j_commit_lock to mutex
j_commit_lock is a semaphore but uses it as if it were a mutex.  This patch
converts it to a mutex.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Edward Shishkin <edward.shishkin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:33 -07:00
Jeff Mahoney
afe7025907 reiserfs: convert j_flush_sem to mutex
j_flush_sem is a semaphore but uses it as if it were a mutex.  This patch
converts it to a mutex.

[akpm@linux-foundation.org: fix mutex_trylock retval treatment]
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Edward Shishkin <edward.shishkin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:33 -07:00
Jeff Mahoney
f68215c464 reiserfs: convert j_lock to mutex
j_lock is a semaphore but uses it as if it were a mutex.  This patch converts
it to a mutex.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Edward Shishkin <edward.shishkin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:33 -07:00
Adrian Bunk
de0ca06a99 coda: remove CODA_FS_OLD_API
While fixing CONFIG_ leakages to the userspace kernel headers I ran into
CODA_FS_OLD_API.

After five years, are there still people using the old API left?
Especially considering that you have to choose at compile time which API
to support in the kernel (and distributions tend to offer the new API for
some time).

Jan: "The old API can definitely go.  Around the time the new
      interface went in there were some non-Coda userspace file system
      implementations that took a while longer to convert to the new API,
      but by now they all switched to the new interface or in some cases
      to a FUSE-based solution."

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:33 -07:00
Duane Griffin
ae76dd9a6b ext3: handle corrupted orphan list at mount
If the orphan node list includes valid, untruncatable nodes with nlink > 0
the ext3_orphan_cleanup loop which attempts to delete them will not do so,
causing it to loop forever. Fix by checking for such nodes in the
ext3_orphan_get function.

This patch fixes the second case (image hdb.20000009.softlockup.gz)
reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: printk warning fix]
Signed-off-by: Duane Griffin <duaneg@dghda.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:32 -07:00
Samuel Thibault
50c33a84db ext2: fix typo in Hurd part of include/linux/ext2_fs.h
Fix typo in Hurd part of include/linux/ext2_fs.h

The ';' here is redundant or can even pose problem.  This is actually not
used by the Linux kernel, but it is exposed in GNU/Hurd.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:31 -07:00
Eric Miao
bbcd6d543d gpio: max732x driver
This adds a driver supporting a family of I2C port expanders from Maxim,
which includes the MAX7319 and MAX7320-7327 chips.

[dbrownell@users.sourceforge.net: minor fixes]
Signed-off-by: Jack Ren <jack.ren@marvell.com>
Signed-off-by: Eric Miao <eric.miao@marvell.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:30 -07:00
Michael Buesch
7444a72eff gpiolib: allow user-selection
This patch adds functionality to the gpio-lib subsystem to make it
possible to enable the gpio-lib code even if the architecture code didn't
request to get it built in.

The archtitecture code does still need to implement the gpiolib accessor
functions in its asm/gpio.h file.  This patch adds the implementations for
x86 and PPC.

With these changes it is possible to run generic GPIO expansion cards on
every architecture that implements the trivial wrapper functions.  Support
for more architectures can easily be added.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: David Brownell <david-b@pacbell.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: Samuel Ortiz <sameo@openedhand.com>
Cc: Kumar Gala <galak@gate.crashing.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:30 -07:00
David Brownell
8f1cc3b10e gpio: mcp23s08 handles multiple chips per chipselect
Teach the mcp23s08 driver about a curious feature of these chips: up to
four of them can share the same chipselect, with the SPI signals wired in
parallel, by matching two bits in the first protocol byte against two
address lines on the chip.

This is handled by three software changes:

  * Platform data now holds an array of per-chip structs, not
    just one chip's address and pullup configuration.

  * Probe() and remove() now use another level of structure,
    wrapping an instance of the original structure for each
    mcp23s08 chip sharing that chipselect.

  * The HAEN bit is set, so that the hardware address bits can no
    longer be ignored (boot firmware may not have enabled them).

The "one struct per chip" preserves the guts of the current code,
but platform_data will need minor changes.

    OLD:
	/* incorrect "slave" ID may not have mattered */
	.slave = 3,
	.pullups = BIT(3) | BIT(1) | BIT(0),

    NEW:
	/* slave address _must_ match chip's wiring */
	.chip[3] = {
		.is_present = true,
		.pullups = BIT(3) | BIT(1) | BIT(0),
	},

There's no change in how things _behave_ for spi_device nodes with a
single mcp23s08 chip.  New multi-chip configurations assign GPIOs in
sequence, without holes.  The spi_device just resembles a bigger
controller, but internally it has multiple gpio_chip instances.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:30 -07:00
David Brownell
d8f388d8dc gpio: sysfs interface
This adds a simple sysfs interface for GPIOs.

    /sys/class/gpio
    	/export ... asks the kernel to export a GPIO to userspace
    	/unexport ... to return a GPIO to the kernel
        /gpioN ... for each exported GPIO #N
	    /value ... always readable, writes fail for input GPIOs
	    /direction ... r/w as: in, out (default low); write high, low
	/gpiochipN ... for each gpiochip; #N is its first GPIO
	    /base ... (r/o) same as N
	    /label ... (r/o) descriptive, not necessarily unique
	    /ngpio ... (r/o) number of GPIOs; numbered N .. N+(ngpio - 1)

GPIOs claimed by kernel code may be exported by its owner using a new
gpio_export() call, which should be most useful for driver debugging.
Such exports may optionally be done without a "direction" attribute.

Userspace may ask to take over a GPIO by writing to a sysfs control file,
helping to cope with incomplete board support or other "one-off"
requirements that don't merit full kernel support:

  echo 23 > /sys/class/gpio/export
	... will gpio_request(23, "sysfs") and gpio_export(23);
	use /sys/class/gpio/gpio-23/direction to (re)configure it,
	when that GPIO can be used as both input and output.
  echo 23 > /sys/class/gpio/unexport
	... will gpio_free(23), when it was exported as above

The extra D-space footprint is a few hundred bytes, except for the sysfs
resources associated with each exported GPIO.  The additional I-space
footprint is about two thirds of the current size of gpiolib (!).  Since
no /dev node creation is involved, no "udev" support is needed.

Related changes:

  * This adds a device pointer to "struct gpio_chip".  When GPIO
    providers initialize that, sysfs gpio class devices become children of
    that device instead of being "virtual" devices.

  * The (few) gpio_chip providers which have such a device node have
    been updated.

  * Some gpio_chip drivers also needed to update their module "owner"
    field ...  for which missing kerneldoc was added.

  * Some gpio_chips don't support input GPIOs.  Those GPIOs are now
    flagged appropriately when the chip is registered.

Based on previous patches, and discussion both on and off LKML.

A Documentation/ABI/testing/sysfs-gpio update is ready to submit once this
merges to mainline.

[akpm@linux-foundation.org: a few maintenance build fixes]
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Cc: Greg KH <greg@kroah.com>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:30 -07:00
Srinivasa D S
ef53d9c5e4 kprobes: improve kretprobe scalability with hashed locking
Currently list of kretprobe instances are stored in kretprobe object (as
used_instances,free_instances) and in kretprobe hash table.  We have one
global kretprobe lock to serialise the access to these lists.  This causes
only one kretprobe handler to execute at a time.  Hence affects system
performance, particularly on SMP systems and when return probe is set on
lot of functions (like on all systemcalls).

Solution proposed here gives fine-grain locks that performs better on SMP
system compared to present kretprobe implementation.

Solution:

 1) Instead of having one global lock to protect kretprobe instances
    present in kretprobe object and kretprobe hash table.  We will have
    two locks, one lock for protecting kretprobe hash table and another
    lock for kretporbe object.

 2) We hold lock present in kretprobe object while we modify kretprobe
    instance in kretprobe object and we hold per-hash-list lock while
    modifying kretprobe instances present in that hash list.  To prevent
    deadlock, we never grab a per-hash-list lock while holding a kretprobe
    lock.

 3) We can remove used_instances from struct kretprobe, as we can
    track used instances of kretprobe instances using kretprobe hash
    table.

Time duration for kernel compilation ("make -j 8") on a 8-way ppc64 system
with return probes set on all systemcalls looks like this.

cacheline              non-cacheline             Un-patched kernel
aligned patch 	       aligned patch
===============================================================================
real    9m46.784s       9m54.412s                  10m2.450s
user    40m5.715s       40m7.142s                  40m4.273s
sys     2m57.754s       2m58.583s                  3m17.430s
===========================================================

Time duration for kernel compilation ("make -j 8) on the same system, when
kernel is not probed.
=========================
real    9m26.389s
user    40m8.775s
sys     2m7.283s
=========================

Signed-off-by: Srinivasa DS <srinivasa@in.ibm.com>
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:30 -07:00
Ben Dooks
42cd2366fb sm501: gpio I2C support
Add support for adding the GPIO based I2C resources.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <apatard@mandriva.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:30 -07:00
Arnaud Patard
60e540d617 sm501: gpio dynamic registration for PCI devices
The SM501 PCI card requires a dyanmic gpio allocation as the number of
cards is not known at compile time.  Fixup the platform data and
registration to deal with this.

Acked-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Arnaud Patard <apatard@mandriva.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:30 -07:00
Ben Dooks
f61be273d3 sm501: add gpiolib support
Add support for exporting the GPIOs on the SM501 via gpiolib.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <apatard@mandriva.com>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:29 -07:00
Ben Dooks
472dba7d11 sm501: add power control callback
Add callback to get or set the power control if the device has the sleep
connected to some form of GPIO.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Arnaud Patard <apatard@mandriva.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:29 -07:00
Dave Young
717115e1a5 printk ratelimiting rewrite
All ratelimit user use same jiffies and burst params, so some messages
(callbacks) will be lost.

For example:
a call printk_ratelimit(5 * HZ, 1)
b call printk_ratelimit(5 * HZ, 1) before the 5*HZ timeout of a, then b will
will be supressed.

- rewrite __ratelimit, and use a ratelimit_state as parameter.  Thanks for
  hints from andrew.

- Add WARN_ON_RATELIMIT, update rcupreempt.h

- remove __printk_ratelimit

- use __ratelimit in net_ratelimit

Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Dave Young <hidave.darkstar@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:29 -07:00
Vegard Nossum
2711b793eb kallsyms: unify 32- and 64-bit code
Use the %p format string which already accounts for the padding you need
with a pointer type on a particular architecture.

Also replace the macro with a static inline function to match the rest of
the file.

Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:29 -07:00
Arjan van de Ven
a8f18b909c Add a WARN() macro; this is WARN_ON() + printk arguments
Add a WARN() macro that acts like WARN_ON(), with the added feature that it
takes a printk like argument that is printed as part of the warning message.

[akpm@linux-foundation.org: fix printk arguments]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Greg KH <greg@kroah.com>
Cc: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:29 -07:00
Arjan van de Ven
b6c6393700 Rename WARN() to WARNING() to clear the namespace
We want to use WARN() as a variant of WARN_ON(), however a few drivers are
using WARN() internally.  This patch renames these to WARNING() to avoid the
namespace clash.  A few cases were defining but not using the thing, for those
cases I just deleted the definition.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Greg KH <greg@kroah.com>
Cc: Karsten Keil <kkeil@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:29 -07:00
Robert P. J. Day
4500d067ee init.h: remove obsolete content
Remove apparently obsolete content from init.h referring to gcc 2.9x
and to "no_module_init".

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:28 -07:00
FUJITA Tomonori
b69c49b784 clean up duplicated alloc/free_thread_info
We duplicate alloc/free_thread_info defines on many platforms (the
majority uses __get_free_pages/free_pages).  This patch defines common
defines and removes these duplicated defines.
__HAVE_ARCH_THREAD_INFO_ALLOCATOR is introduced for platforms that do
something different.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:28 -07:00
KOSAKI Motohiro
ac331d158e call_usermodehelper(): increase reliability
Presently call_usermodehelper_setup() uses GFP_ATOMIC.  but it can return
NULL _very_ easily.

GFP_ATOMIC is needed only when we can't sleep.  and, GFP_KERNEL is robust
and better.

thus, I add gfp_mask argument to call_usermodehelper_setup().

So, its callers pass the gfp_t as below:

call_usermodehelper() and call_usermodehelper_keys():
	depend on 'wait' argument.
call_usermodehelper_pipe():
	always GFP_KERNEL because always run under process context.
orderly_poweroff():
	pass to GFP_ATOMIC because may run under interrupt context.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: "Paul Menage" <menage@google.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:28 -07:00
Adrian Bunk
f16695f4ac asm-generic/int-ll64.h: always provide __{s,u}64
Several compilers offer "long long" without claiming to support C99.

Considering how frequent __s64/__u64 are used our userspace headers are
anyway unusable without __s64/__u64 available.

Always offer __s64/__u64 to non-gcc non-C99 compilers - if they provide
"long long" that makes the headers compiling and if they don't they are
anyway screwed.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:27 -07:00
Andrew Morton
cebbd3fb80 build-kernel-profileo-only-when-requested-cleanups
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:27 -07:00
Adrian Bunk
b03f6489f9 build kernel/profile.o only when requested
Build kernel/profile.o only if CONFIG_PROFILING is enabled.

This makes CONFIG_PROFILING=n kernels smaller.

As a bonus, some profile_tick() calls and one branch from schedule() are
now eliminated with CONFIG_PROFILING=n (but I doubt these are
measurable effects).

This patch changes the effects of CONFIG_PROFILING=n, but I don't think
having more than two choices would be the better choice.

This patch also adds the name of the first parameter to the prototypes
of profile_{hits,tick}() since I anyway had to add them for the dummy
functions.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:27 -07:00
Robert P. J. Day
e0ce0da9fe lists: remove a redundant conditional definition of list_add()
Remove the conditional surrounding the definition of list_add() from list.h
since, if you define CONFIG_DEBUG_LIST, the definition you will subsequently
pick up from lib/list_debug.c will be absolutely identical, at which point you
can remove that redundant definition from list_debug.c as well.

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:27 -07:00
Robert P. J. Day
b39c08cb69 Remove apparently unused fd1772.h header file.
This header file has been unused for quite some time, and the
corresponding source files appear to have been removed back in commit
99eb8a550d ("Remove the arm26 port")

Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: Ian Molton <spyro@f2s.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:27 -07:00
Harvey Harrison
8b5ac31e27 include: use get/put_unaligned_* helpers
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:26 -07:00
Steven Rostedt
3f307891ce locking: add typecheck on irqsave and friends for correct flags
There haave been several areas in the kernel where an int has been used for
flags in local_irq_save() and friends instead of a long.  This can cause some
hard to debug problems on some architectures.

This patch adds a typecheck inside the irqsave and restore functions to flag
these cases.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:26 -07:00
Andrew Morton
e0deaff470 split the typecheck macros out of include/linux/kernel.h
Needed to fix up a recursive include snafu in
locking-add-typecheck-on-irqsave-and-friends-for-correct-flags.patch

Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:26 -07:00
Yevgeny Petrilin
25c94d010a mlx4_core: Add VLAN tag field to WQE control segment struct
Add fields for VLAN tag and insert VLAN tag flag to the control
section struct.  These fields will be used for sending ethernet
packets.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-25 10:30:06 -07:00
Tony Luck
3e4d0cab61 [IA64] Wire up new system calls
Six new system calls: signalfd4, eventfd2, epoll_create1,
dup3, pipe2 and inotify_init1.

Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-07-25 10:10:28 -07:00
David Miller
3d6f4a20cc endian: Always evaluate arguments.
Changeset 7fa897b91a ("ide: trivial sparse
annotations") created an IDE bootup regression on big-endian systems.

In drivers/ide/ide-iops.c, function ide_fixstring() we now have the
loop:

		for (p = end ; p != s;)
			be16_to_cpus((u16 *)(p -= 2));

which will never terminate on big-endian because in such
a configuration be16_to_cpus() evaluates to "do { } while (0)"

Therefore, always evaluate the arguments to nop endian transformation
operations.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 09:28:09 -07:00
Alexey Korolev
3d45955962 [MTD] [NAND] subpage read feature as a way to increase performance.
This patch enables NAND subpage read functionality.
If upper layer drivers are requesting to read non page aligned data NAND
subpage-read functionality reads the only whose ECC regions which include
requested data when original code reads whole page.
This significantly improves performance in many cases.

Here are some digits :

UBI volume mount time
No subpage reads: 5.75 seconds
Subpage read patch: 2.42 seconds

Open/stat time for files on JFFS2 volume:
No subpage read  0m 5.36s
Subpage read     0m 2.88s

Signed-off-by Alexey Korolev <akorolev@infradead.org>
Acked-by: Artem Bityutskiy <dedekind@infradead.org>
Acked-by: Jörn Engel <joern@logfs.org>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2008-07-25 10:49:50 -04:00
David Woodhouse
ff877ea80e Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubi-2.6 2008-07-25 10:40:14 -04:00
Stefan Richter
95984f62c9 firewire: fw-ohci: TSB43AB22/A dualbuffer workaround
Isochronous reception in dualbuffer mode is reportedly broken with
TI TSB43AB22A on x86-64.  Descriptor addresses above 2G have been
determined as the trigger:
https://bugzilla.redhat.com/show_bug.cgi?id=435550

Two fixes are possible:
  - pci_set_consistent_dma_mask(pdev, DMA_31BIT_MASK);
    at least when IR descriptors are allocated, or
  - simply don't use dualbuffer.
This fix implements the latter workaround.

But we keep using dualbuffer on x86-32 which won't give us highmen (and
thus physical addresses outside the 31bit range) in coherent DMA memory
allocations.  Right now we could for example also whitelist PPC32, but
DMA mapping implementation details are expected to change there.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
2008-07-25 15:41:23 +02:00
Herbert Xu
6fccab671f ipsec: ipcomp - Merge IPComp implementations
This patch merges the IPv4/IPv6 IPComp implementations since most
of the code is identical.  As a result future enhancements will no
longer need to be duplicated.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-25 02:54:40 -07:00
Ingo Molnar
0e2f65ee30 Merge branch 'linus' into x86/pebs
Conflicts:

	arch/x86/Kconfig.cpu
	arch/x86/kernel/cpu/intel.c
	arch/x86/kernel/setup_64.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-25 11:37:07 +02:00
Tony Breeds
973b7d83eb powerpc: Wireup new syscalls
signalfd4, eventfd2, epoll_create1, dup3, pipe2 and inotify_init1

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-25 16:40:55 +10:00
Jaswinder Singh
b2139aa0ee X86_SMP: tlb_XX.c declare smp_invalidate_interrupt before they get used
declare smp_invalidate_interrupt in asm-x86/hw_irq.h for X86_32 and X86_64

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-25 11:58:17 +05:30
Jaswinder Singh
e7f08dfdaa X86_SMP: smp.c declare functions before they get used
declared following smp interrupts in asm-x86/hw_irq.h:
smp_reschedule_interrupt, smp_call_function_interrupt, smp_call_function_single_interrupt

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-25 11:56:50 +05:30
Benjamin Herrenschmidt
1e3519f8e1 Move update_mmu_cache() declaration from tlbflush.h to pgtable.h
where it belongs. This fixes some build problems on some configs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-25 16:21:11 +10:00
Robert Jennings
a90ab95a95 powerpc/pseries: vio bus support for CMO
This is a large patch but the normal code path is not affected.  For
non-pSeries platforms the code is ifdef'ed out and for non-CMO enabled
pSeries systems this does not affect the normal code path.  Devices that
do not perform DMA operations do not need modification with this patch.
The function get_desired_dma was renamed from get_io_entitlement for
clarity.

Overview

Cooperative Memory Overcommitment (CMO) allows for a set of OS partitions
to be run with less RAM than the aggregate needs of the group of
partitions.  The firmware will balance memory between the partitions
and page in/out memory as needed.  Based on the number and type of IO
adpaters preset each partition is allocated an amount of memory for
DMA operations and this allocation will be guaranteed to the partition;
this is referred to as the partition's 'entitlement'.

Partitions running in a CMO environment can only have virtual IO devices
present.  The VIO bus layer will manage the IO entitlement for the system.
Accounting, at a system and per-device level, is tracked in the VIO bus
code and exposed via sysfs.  A set of dma_ops functions are added to
the bus to allow for this accounting.

Bus initialization

At initialization, the bus will calculate the minimum needs of the system
based on providing each device present with a standard minimum entitlement
along with a spare allocation for the bus to handle hotplug events.
If the minimum needs can not be met the system boot will be halted.

Device changes

The significant changes for devices while running under CMO are that the
devices must specify how much dedicated IO entitlement they desire and
must also handle DMA mapping errors that can occur due to constrained
IO memory.  The virtual IO drivers are modified to silence errors when
DMA mappings fail for CMO and handle these failures gracefully.

Each devices will be guaranteed a minimum entitlement that can always
be mapped.  Devices will specify how much entitlement they desire and
the VIO bus will attempt to provide for this.  Devices can change their
desired entitlement level at any point in time to address particular needs
(via vio_cmo_set_dev_desired()), not just at device probe time.

VIO bus changes

The system will have a particular entitlement level available from which
it can provide memory to the devices.  The bus defines two pools of memory
within this entitlement, the reserved and excess pools.  Each device is
provided with it's own entitlement no less than a system defined minimum
entitlement and no greater than what the device has specified as it's
desired entitlement.  The entitlement provided to devices comes from the
reserve pool.  The reserve pool can also contain a spare allocation as
large as the system defined minimum entitlement which is used for device
hotplug events.  Any entitlement not needed to fulfill the needs of a
reserve pool is placed in the excess pool.  Each device is guaranteed
that it can map up to it's entitled level; additional mapping are possible
as long as there is unmapped memory in the excess pool.

Bus probe

As the system starts, each device is given an entitlement equal only
to the system defined minimum entitlement.  The reserve pool is equal
to the sum of these entitlements, plus a spare allocation.  The VIO bus
also tracks the aggregate desired entitlement of all the devices.  If the
system desired entitlement is greater than the size of the reserve pool,
when devices unmap IO memory it will be reserved and a balance operation
will be scheduled for some time in the future.

Entitlement balancing

The balance function tries to fairly distribute entitlement between the
devices in the system with the goal of providing each device with it's
desired amount of entitlement.  Devices using more than what would be
ideal will have their entitled set-point adjusted; this will effectively
set a goal for lower IO memory usage as future mappings can fail and
deallocations will trigger a balance operation to distribute the newly
unmapped memory.  A fair distribution of entitlement can take several
balance operations to achieve.  Entitlement changes and device DLPAR
events will alter the state of CMO and will trigger balance operations.

Hotplug events

The VIO bus allows for changes in system entitlement at run-time via
'vio_cmo_entitlement_update()'.  When devices are added the hotplug
device event will be preceded by a system entitlement increase and this
is reversed when devices are removed.

The following changes are made that the VIO bus layer for CMO:
 * add IO memory accounting per device structure.
 * add IO memory entitlement query function to driver structure.
 * during vio bus probe, if CMO is enabled, check that driver has
   memory entitlement query function defined.  Fail if function not defined.
 * fail to register driver if io entitlement function not defined.
 * create set of dma_ops at vio level for CMO that will track allocations
   and return DMA failures once entitlement is reached.  Entitlement will
   limited by overall system entitlement.  Devices will have a reserved
   quantity of memory that is guaranteed, the rest can be used as available.
 * expose entitlement, current allocation, desired allocation, and the
   allocation error counter for devices to the user through sysfs
 * provide mechanism for changing a device's desired entitlement at run time
   for devices as an exported function and sysfs tunable
 * track any DMA failures for entitled IO memory for each vio device.
 * check entitlement against available system entitlement on device add
 * track entitlement metrics (high water mark, current usage)
 * provide function to reset high water mark
 * provide minimum and desired entitlement numbers at a bus level
 * provide drivers with a minimum guaranteed entitlement
 * balance available entitlement between devices to satisfy their needs
 * handle system entitlement changes and device hotplug

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-25 15:44:43 +10:00
Robert Jennings
6490c4903d powerpc/pseries: iommu enablement for CMO
To support Cooperative Memory Overcommitment (CMO), we need to check
for failure from some of the tce hcalls.

These changes for the pseries platform affect the powerpc architecture;
patches for the other affected platforms are included in this patch.

pSeries platform IOMMU code changes:
 * platform TCE functions must handle H_NOT_ENOUGH_RESOURCES errors and
   return an error.

Architecture IOMMU code changes:
 * Calls to ppc_md.tce_build need to check return values and return
   DMA_MAPPING_ERROR for transient errors.

Architecture changes:
 * struct machdep_calls for tce_build*_pSeriesLP functions need to change
   to indicate failure.
 * all other platforms will need updates to iommu functions to match the new
   calling semantics; they will return 0 on success.  The other platforms
   default configs have been built, but no further testing was performed.

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-25 15:44:43 +10:00
Brian King
ffa5abbd0c powerpc/pseries: Add CMO paging statistics
With the addition of Cooperative Memory Overcommitment (CMO) support
for IBM Power Systems, two fields have been added to the VPA to report
paging statistics.  Add support in lparcfg to report them to userspace.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-25 15:44:42 +10:00
Brian King
86630a3232 powerpc/pseries: Utilities to set firmware page state
Newer versions of firmware support page states, which are used by the
collaborative memory manager (future patch) to "loan" pages to the
hypervisor for use by other partitions.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-25 15:44:42 +10:00
Robert Jennings
e46de429cb powerpc/pseries: Enable CMO feature during platform setup
For Cooperative Memory Overcommitment (CMO), set the FW_FEATURE_CMO
flag in powerpc_firmware_features from the rtas ibm,get-system-parameters
table prior to calling iommu_init_early_pSeries.

With this, any CMO specific functionality can be controlled by checking:
 firmware_has_feature(FW_FEATURE_CMO)

Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-25 15:44:42 +10:00
Nathan Fontenot
dfc3403f0e powerpc/pseries: Add memory entitlement capabilities to /proc/ppc64/lparcfg
Update /proc/ppc64/lparcfg to display Cooperative Memory
Overcommitment statistics as reported by the H_GET_MPP hcall.  This
also updates the lparcfg interface to allow setting memory entitlement
and weight.

Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com>
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-25 15:44:41 +10:00
Luis Machado
d6a61bfc06 powerpc: BookE hardware watchpoint support
This patch implements support for HW based watchpoint via the
DBSR_DAC (Data Address Compare) facility of the BookE processors.

It does so by interfacing with the existing DABR breakpoint code
and adding the necessary bits and pieces for the new bits to
be properly set or cleared

Signed-off-by: Luis Machado <luisgpm@br.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-25 15:44:39 +10:00
Nathan Lynch
9115d13453 powerpc: Enable AT_BASE_PLATFORM aux vector
Stash the first platform string matched by identify_cpu() in
powerpc_base_platform, and supply that to the ELF loader for the value
of AT_BASE_PLATFORM.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-25 15:44:39 +10:00
Nathan Lynch
483fad1c3f ELF loader support for auxvec base platform string
Some IBM POWER-based platforms have the ability to run in a
mode which mostly appears to the OS as a different processor from the
actual hardware.  For example, a Power6 system may appear to be a
Power5+, which makes the AT_PLATFORM value "power5+".  This means that
programs are restricted to the ISA supported by Power5+;
Power6-specific instructions are treated as illegal.

However, some applications (virtual machines, optimized libraries) can
benefit from knowledge of the underlying CPU model.  A new aux vector
entry, AT_BASE_PLATFORM, will denote the actual hardware.  For
example, on a Power6 system in Power5+ compatibility mode, AT_PLATFORM
will be "power5+" and AT_BASE_PLATFORM will be "power6".  The idea is
that AT_PLATFORM indicates the instruction set supported, while
AT_BASE_PLATFORM indicates the underlying microarchitecture.

If the architecture has defined ELF_BASE_PLATFORM, copy that value to
the user stack in the same manner as ELF_PLATFORM.

Signed-off-by: Nathan Lynch <ntl@pobox.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-25 15:44:39 +10:00
Benjamin Herrenschmidt
c174aff956 Merge commit 'gcl/gcl-next' 2008-07-25 15:35:03 +10:00
Jaswinder Singh
461d159694 x86_64: Declare new_utsname in asm-x86/syscalls.h
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-25 09:19:05 +05:30
Linus Torvalds
832fe9c222 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  virtio: Add transport feature handling stub for virtio_ring.
  virtio: Rename set_features to finalize_features
  virtio: Formally reserve bits 28-31 to be 'transport' features.
  s390: use virtio_console for KVM on s390
  virtio: console as a config option
  virtio_console: use virtqueue notification for hvc_console
  hvc_console: rework setup to replace irq functions with callbacks
  virtio_blk: check for hardsector size from host
  virtio: Use bus_type probe and remove methods
  virtio: don't always force a notification when ring is full
  virtio: clarify that ABI is usable by any implementations
  virtio: Recycle unused recv buffer pages for large skbs in net driver
  virtio net: Allow receiving SG packets
  virtio net: Add ethtool ops for SG/GSO
  virtio: fix virtio_net xmit of freed skb bug
2008-07-24 19:11:49 -07:00
Rusty Russell
ed9559d38a Label kthread_create() with printf attribute tag.
Obvious misc patch been in my queue (& linux-next) for over a cycle.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 19:11:15 -07:00
Rusty Russell
e34f872567 virtio: Add transport feature handling stub for virtio_ring.
To prepare for virtio_ring transport feature bits, hook in a call in
all the users to manipulate them.  This currently just clears all the
bits, since it doesn't understand any features.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-25 12:06:14 +10:00
Rusty Russell
c624896e48 virtio: Rename set_features to finalize_features
Rather than explicitly handing the features to the lower-level, we just
hand the virtio_device and have it set the features.  This make it clear
that it has the chance to manipulate the features of the device at this
point (and that all feature negotiation is already done).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-25 12:06:12 +10:00
Rusty Russell
dd7c7bc462 virtio: Formally reserve bits 28-31 to be 'transport' features.
We assign feature bits as required, but it makes sense to reserve some
for the particular transport, rather than the particular device.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-25 12:06:07 +10:00
Christian Borntraeger
faeba830b0 s390: use virtio_console for KVM on s390
This patch enables virtio_console as the default console on kvm for
s390. We currently use the same notify hack as lguest for early
console output. I will try to address this for lguest and s390 later.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-25 12:06:07 +10:00
Christian Borntraeger
066f4d82a6 virtio_blk: check for hardsector size from host
Currently virtio_blk assumes a 512 byte hard sector size. This can cause
trouble / performance issues if the backing has a different block size
(like a file on an ext3 file system formatted with 4k block size or a dasd).

Lets add a feature flag that tells the guest to use a different hard sector
size than 512 byte.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-25 12:06:05 +10:00
Rusty Russell
674bfc23c5 virtio: clarify that ABI is usable by any implementations
We want others to implement and use virtio, so it makes sense to BSD
license the non-__KERNEL__ parts of the headers to make this crystal
clear.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Mark McLoughlin <markmc@redhat.com>
Acked-by: Ryan Harper <ryanh@us.ibm.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
2008-07-25 12:06:04 +10:00
Linus Torvalds
b30f3ae50c x86-64: Clean up 'save/restore_i387()' usage
Suresh Siddha wants to fix a possible FPU leakage in error conditions,
but the fact that save/restore_i387() are inlines in a header file makes
that harder to do than necessary.  So start off with an obvious cleanup.

This just moves the x86-64 version of save/restore_i387() out of the
header file, and moves it to the only file that it is actually used in:
arch/x86/kernel/signal_64.c.  So exposing it in a header file was wrong
to begin with.

[ Side note: I'd like to fix up some of the games we play with the
  32-bit version of these functions too, but that's a separate
  matter.  The 32-bit versions are shared - under different names
  at that! - by both the native x86-32 code and the x86-64 32-bit
  compatibility code ]

Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 16:12:40 -07:00
Linus Torvalds
b5684b83b1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits)
  ide: use proper printk() KERN_* levels in ide-probe.c
  ide: fix for EATA SCSI HBA in ATA emulating mode
  ide: remove stale comments from drivers/ide/Makefile
  ide: enable local IRQs in all handlers for TASKFILE_NO_DATA data phase
  ide-scsi: remove kmalloced struct request
  ht6560b: remove old history
  ht6560b: update email address
  ide-cd: fix oops when using growisofs
  gayle: release resources on ide_host_add() failure
  palm_bk3710: add UltraDMA/100 support
  ide: trivial sparse annotations
  ide: ide-tape.c sparse annotations and unaligned access removal
  ide: drop 'name' parameter from ->init_chipset method
  ide: prefix messages from IDE PCI host drivers by driver name
  it821x: remove DECLARE_ITE_DEV() macro
  it8213: remove DECLARE_ITE_DEV() macro
  ide: include PCI device name in messages from IDE PCI host drivers
  ide: remove <asm/ide.h> for some archs
  ide-generic: remove ide_default_{io_base,irq}() inlines (take 3)
  ide-generic: is no longer needed on ppc32
  ...
2008-07-24 14:55:09 -07:00
Linus Torvalds
5042d99795 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: fixup sparse endianness warnings in proc.c
  PCI PM: make more PCI PM core functionality available to drivers
  PCI/DMAR: don't assume presence of RMRRs
  PCI hotplug: fix error path in pci_slot's register_slot
2008-07-24 13:57:13 -07:00
Bartlomiej Zolnierkiewicz
a326b02b0c ide: drop 'name' parameter from ->init_chipset method
There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:33 +02:00
Bartlomiej Zolnierkiewicz
2a8f7450f8 ide: remove <asm/ide.h> for some archs
* Remove <linux/irq.h> include from <asm-ia64.h> (<linux/ide.h> includes
  <linux/interrupt.h> which is enough).

* Remove <asm/ide.h> for alpha/blackfin/h8300/ia64/m32r/sh/x86/xtensa
  (this leaves us with arm/frv/m68k/mips/mn10300/parisc/powerpc/sparc[64]).

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:31 +02:00
Bartlomiej Zolnierkiewicz
f01d35d87f ide-generic: remove ide_default_{io_base,irq}() inlines (take 3)
Replace ide_default_{io_base,irq}() inlines by legacy_{bases,irqs}[].

v2:
Add missing zero-ing of hws[] (caught during testing by Borislav Petkov).

v3:
Fix zero-oing of hws[] for _real_ this time.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:31 +02:00
Bartlomiej Zolnierkiewicz
ffed0b6e1a ide-generic: remove broken PPC_PREP support
PPC_PREP has been depending on BROKEN for some time now.

Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:30 +02:00
Bartlomiej Zolnierkiewicz
d83b8b85cd ide: define MAX_HWIFS in <linux/ide.h>
* Now that ide_hwif_t instances are allocated dynamically
  the difference between MAX_HWIFS == 2 and MAX_HWIFS == 10
  is ~100 bytes (x86-32) so use MAX_HWIFS == 10 on all archs
  except these ones that use MAX_HWIFS == 1.

* Define MAX_HWIFS in <linux/ide.h> instead of <asm/ide.h>.

[ Please note that avr32/cris/v850 have no <asm/ide.h>
  and alpha/ia64/sh always define CONFIG_IDE_MAX_HWIFS. ]

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:30 +02:00
Bartlomiej Zolnierkiewicz
2c9d86438a ide: remove <asm-cris/ide.h>
Remove <asm-cris/arch-v{10,32}/ide.h> and <asm-cris/ide.h>.

This has been a broken code for some time now and needs rewrite
to match IDE core code / host driver model anyway.

Cc: Jesper Nilsson <Jesper.Nilsson@axis.com>
Cc: Mikael Starvik <mikael.starvik@axis.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:29 +02:00
Bartlomiej Zolnierkiewicz
b6cd7da5be ide-generic: remove "no_pci_devices()" quirk from ide_default_io_base()
Since the decision to probe for ISA ide2-6 is now left to the user
"no_pci_devices()" quirk is no longer needed and may be removed.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:28 +02:00
Bartlomiej Zolnierkiewicz
dbdec839c4 ide-generic: minor fix for mips
Move ide_probe_legacy() call to ide_generic_init() so it fails
early if necessary and returns the proper error value (nowadays
ide_default_io_base() is used only by ide-generic).

Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:28 +02:00
Bartlomiej Zolnierkiewicz
ac32f3238c ide-generic: fix ide_default_io_base() for m32r
Fix ide_default_io_base() to match ide_default_irq().

Cc: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:27 +02:00
Bartlomiej Zolnierkiewicz
b0a6281796 ide: fix <asm-xtensa/ide.h>
* Add missing <asm-generic/ide_iops.h> include.

While at it:

* Remove needless ide_default_{irq,io_base}() inlines.

Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:27 +02:00
Bartlomiej Zolnierkiewicz
ef0b04276d ide: add ide_pci_remove() helper
* Add 'unsigned long host_flags' field to struct ide_host.

* Set ->host_flags in ide_host_alloc_all().

* Always set PCI dev's ->driver_data in ide_pci_init_{one,two}().

* Add ide_pci_remove() helper (the default implementation for
  struct pci_driver's ->remove method).

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:19 +02:00
Bartlomiej Zolnierkiewicz
08da591e14 ide: add ide_device_{get,put}() helpers
* Add 'struct ide_host *host' field to ide_hwif_t and set it
  in ide_host_alloc_all().

* Add ide_device_{get,put}() helpers loosely based on SCSI's
  scsi_device_{get,put}() ones.

* Convert IDE device drivers to use ide_device_{get,put}().

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:15 +02:00
Bartlomiej Zolnierkiewicz
6cdf6eb357 ide: add ->dev and ->host_priv fields to struct ide_host
* Add 'struct device *dev[2]' and 'void *host_priv' fields
  to struct ide_host.

* Set ->dev[] in ide_host_alloc_all()/ide_setup_pci_device[s]().

* Pass 'void *priv' argument to ide_setup_pci_device[s]()
  and use it to set ->host_priv.

* Set PCI dev's ->driver_data to point to the struct ide_host
  instance if PCI host driver wants to use ->host_priv.

* Rename ide_setup_pci_device[s]() to ide_pci_init_{one,two}().

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-24 22:53:14 +02:00
Linus Torvalds
5c402355ad Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  MAINTAINERS: Remove Glenn Streiff from NetEffect entry
  mlx4_core: Improve error message when not enough UAR pages are available
  IB/mlx4: Add support for memory management extensions and local DMA L_Key
  IB/mthca: Keep free count for MTT buddy allocator
  mlx4_core: Keep free count for MTT buddy allocator
  mlx4_code: Add missing FW status return code
  IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_seg
  mlx4_core: Add module parameter to enable QoS support
  RDMA/iwcm: Remove IB_ACCESS_LOCAL_WRITE from remote QP attributes
  IPoIB: Include err code in trace message for ib_sa_path_rec_get() failures
  IB/sa_query: Check if sm_ah is NULL in ib_sa_remove_one()
  IB/ehca: Release mutex in error path of alloc_small_queue_page()
  IB/ehca: Use default value for Local CA ACK Delay if FW returns 0
  IB/ehca: Filter PATH_MIG events if QP was never armed
  IB/iser: Add support for RDMA_CM_EVENT_ADDR_CHANGE event
  RDMA/cma: Add RDMA_CM_EVENT_TIMEWAIT_EXIT event
  RDMA/cma: Add RDMA_CM_EVENT_ADDR_CHANGE event
2008-07-24 12:56:07 -07:00
Linus Torvalds
ecc8b655b3 Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  nohz: adjust tick_nohz_stop_sched_tick() call of s390 as well
  nohz: prevent tick stop outside of the idle loop
2008-07-24 12:55:01 -07:00
Linus Torvalds
6044110742 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix header export, asm-x86/processor-flags.h, CONFIG_* leaks
  x86: BUILD_IRQ say .text to avoid .data.percpu
  xen: don't use sysret for sysexit32
  x86: call early_cpu_init at the same point
2008-07-24 12:33:51 -07:00
Linus Torvalds
7540081c6b Merge branch 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc
* 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc:
  Remove __DECLARE_SEMAPHORE_GENERIC
  Remove asm/semaphore.h
  Remove use of asm/semaphore.h
  Add missing semaphore.h includes
  Remove mention of semaphores from kernel-locking
2008-07-24 12:24:40 -07:00
Linus Torvalds
3fde80e94c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68knommu: put ColdFire head code into .text.head section
  m68knommu: remove last use of CONFIG_FADS and CONFIG_RPXCLASSIC
  m68knommu: remove RPXCLASSIC from the m68k tree
  m68knommu: fec: remove FADS
  m68knommu: MCF5307 PIT GENERIC_CLOCKEVENTS support
  m68knommu: add read_barrier_depends() and irqs_disabled_flags()
  m68knommu: add byteswap assembly opcode for ISA A+
  m68knommu: add ffs and __ffs plattform which support ISA A+ or ISA C
  m68knommu: add sched_clock() for the DMA timer
  m68knommu: complete generic time
  m68knommu: move code within time.c
  m68knommu: m68knommu: add old stack trace method
  m68knommu: Add Coldfire DMA Timer support
  m68knommu: defconfig for M5407C3 board
  m68knommu: defconfig for M5307C3 board
  m68knommu: defconfig for M5275EVB board
  m68knommu: defconfig for M5249EVB board
  m68knommu: change to a configs directory for board configurations
2008-07-24 12:17:19 -07:00
Linus Torvalds
c54554d388 Merge branch 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds
* 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds:
  leds: Ensure led->trigger is set earlier
  leds: Add support for Philips PCA955x I2C LED drivers
  leds: Fix sparse warnings in leds-h1940 driver
  leds: mark led_classdev.default_trigger as const
  leds: fix unsigned value overflow in atmel pwm driver
  leds: Add pca9532 platform data for Thecus N2100
  leds: Add pca9532 led driver
2008-07-24 12:16:02 -07:00
Steven Whitehouse
f9247273cb UFS: add const to parser token table
This patch adds a "const" to the parser token table. I've done an
allmodconfig build to see if this produces any warnings/failures and the
patch includes a fix for the only warning that was produced.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Alexander Viro <aviro@redhat.com>
Acked-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 11:50:15 -07:00
Philippe De Muyter
5bb49fcd50 video/fb: cleanup FB_MAJOR usage
Currently, linux/major.h defines a GRAPHDEV_MAJOR (29) that nobody uses,
and linux/fb.h defines the real FB_MAJOR (also 29), that only fbmem.c
needs.  Drop GRAPHDEV_MAJOR from major.h, move FB_MAJOR definition from
fb.h to major.h, and fix fbmem.c to use major.h's definition.

Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:41 -07:00
Hans-Christian Egtvedt
3e074058d7 fbdev: LCD backlight driver using Atmel PWM driver
This patch adds a platform driver using the ATMEL PWM driver to control a
backlight which requires a PWM signal and optional GPIO signal for discrete
on/off signal.  It has been tested on Favr-32 board from EarthLCD.

The driver is configurable by supplying a struct with the platform data.  See
the include/linux/atmel-pwm-bl.h for details.

The board code for Favr-32 will be submitted to the AVR32 kernel list.

Signed-off-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:41 -07:00
Nobuhiro Iwamatsu
4a25e41831 video: sh7760fb: SH7760/SH7763 LCDC framebuffer driver
Framebuffer driver for the SH7760/SH7763 integrated LCD controller.

Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Nobuhiro Iwamatsu <iwamatsu.nobuhiro@renesas.com>
Reviewed-by: Paul Mundt <lethal@linux-sh.org>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Cc: Siegfried Schaefer <s.schaefer@schaefer-edv.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:41 -07:00
Krzysztof Helt
c6b044d6ba neofb: drop the xtimings structure
Remove the xtimings structure which only stored some values to be used
later (mostly once).  Calculate and use these values in places they are
needed.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:41 -07:00
Ben Dooks
c25826a7cf lcd: add platform_lcd driver
Add a platform_lcd driver to allow boards with simple lcd power controls
to register themselves easily.

[akpm@linux-foundation.org: build fix]
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:40 -07:00
Ben Dooks
0c531360ed lcd: add lcd_device to check_fb() entry in lcd_ops
Add the lcd_device being checked to the check_fb entry of lcd_ops.  This
ensures that any driver using this to check against it's own state can do
so, and also makes all the calls in lcd_ops more orthogonal in their
arguments.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:40 -07:00
Ben Dooks
cccb6d3c14 fb: add support for the ILI9320 video display controller
Provide support for the ILI9320 display controller chip which is found in
many LCD displays.  Included with this is support for an example LCD using
this chip, the VGG2432A4.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:40 -07:00
Ben Dooks
206c5d69d0 sm501: add inversion controls for VBIASEN and FPEN
Add flags to allow the driver to invert the sense of both VBIASEN and FPEN
signals comming from the SM501.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:40 -07:00
Magnus Damm
cfb4f5d175 fbdev: SuperH Mobile LCDC Driver
This is the SuperH Mobile LCDC frame buffer driver V2, adding support for
the LCDC block found in SuperH Mobile processors.  The hardware supports
up to two LCD panels per LCDC block, and both RGB and SYS interfaces can
be used to hook up LCD panels/modules.

The device driver is a regular platform driver, so LCD configuration and
board specific hooks are passed to the driver using platform data.  LCD
modules using SYS interface often require special configuration using the
SYS bus, and to solve this cleanly the driver provides SYS interface
operations to the board code.

Tested on sh7723 and sh7722 processors with a SYS16A QVGA panel and WVGA
panels using RGB16 and RGB18 interfaces.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Reviewed-by: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:38 -07:00
Nicolas Ferre
d22579b837 atmel_lcdfb: FIFO underflow management
Manage atmel_lcdfb FIFO underflow

Resetting the LCD and DMA allows to fix screen shifting after a FIFO
underflow.  It follows reset sequence from errata "LCD Screen Shifting
After a Reset".

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Andrew Victor <linux@maxim.org.za>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:37 -07:00
Krzysztof Helt
0292be4a38 tridentfb: add imageblit acceleration for Blade3D family
Add imageblit acceleration for the Blade3D family of cores.  The code is
based on code from the cyblafb driver.

It is a step toward assimilating back the cyblafb driver into the
tridentfb driver.  The cyblafb driver handles a subfamily of the Trident
Blade3d cores.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:36 -07:00
Krzysztof Helt
5cf138457a tridentfb: source code improvements
This patch contains general source code improvments:
 - more simple functions are inline
 - removes some meaningless output and the VERSION
   string as it is no use
 - eng_par is moved into the tridentfb_par
 - removed small section of code for CyberBladeXPAi1
   which is maybe right for only one resolution
   and refresh rate and is probably redundant now
 - other minor improvements

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:36 -07:00
Krzysztof Helt
01a2d9ed85 tridentfb: acceleration constants change
This patch replaces deprecated constant FB_ACCELF_TEXT with
FBINFO_HWACCEL_DISABLED and adds constants for Trident families of
accelerators.

The FBINFO_HWACCEL_DISABLED is correctly used so noaccel parameter works
now.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:36 -07:00
Krzysztof Helt
49b1f4b44b tridentfb: acceleration code improvements
This patch brings various acceleration improvements:
- set  copyarea/fillrect for non-accelerated framebuffer (fix)
- remove 15 bpp depth handling to simplify code as it hardly
  works (15 bpp handling was obviously missing in some switches)
- add fb_sync call and move waiting before accelerated function
  to make acceleration more asynchronous to cpu (few % of speed
  improvement)
- add cpu_relax() call in waiting loops
- make longer register names and name more registers
- move registers' definition to header
- general code improvements (shortening, simplifying)

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:36 -07:00
Krzysztof Helt
a0d922562d tridentfb: add TGUI 9440 support
Add support for TGUI 9440 chip.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:35 -07:00
Krzysztof Helt
0e73a47f09 tridentfb: improved register values on TGUI 9680
Improved values for some registers after Xorg Trident driver.  The main
problem was that values set by BIOS have been ignored.

This patch completely remove random pixels ("snow") on the TGUI 9680 and
9440 (not supported yet by the driver).  It does not help with the "snow"
on 3DImage and Blade3D cards.

There is also small improvement in timing calculations (hblank start and
vblank start)

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:35 -07:00
Krzysztof Helt
10172ed6dc tridentfb: make use of functions and constants from the vga.h
Make use of functions and constants from the vga.h header to compact the code
and make it more readable.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:35 -07:00
Krzysztof Helt
e0759a5fbb tridentfb: convert is_blade and is_xp macros into functions
This patch converts the is_blade() and is_xp() macros into local functions.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:35 -07:00
Krzysztof Helt
6eed8e1ec8 tridentfb: move global flat panel variable into structure
This patch moves flat panel indicator into tridentfb_par structure and removes
related global variables and macros.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:35 -07:00
David Brownell
d3de851a44 rtc: BCD codeshrink
This updates <linux/bcd.h> to define the key routines as constant
functions, which the macros will then call.  Newer code can now call
bcd2bin() instead of SCREAMING BCD2BIN() TO THE FOUR WINDS.

This lets each driver shrink their codespace by using N function calls to
a single (global) copy of those routines, instead of N inlined copies of
these functions per driver.

These routines aren't used in speed-critical code.  Almost all callers are
in the RTC framework.  Typical per-driver savings is near 300 bytes.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Adrian Bunk <bunk@kernel.org>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:33 -07:00
David Brownell
53e84b672c rtc: ds1305/ds1306 driver
Support the Dallas/Maxim DS1305 and DS1306 RTC chips.  These use SPI, and
support alarms, NVRAM, and a trickle charger for use when their backup
power supply is a supercap or rechargeable cell.

This basic driver doesn't yet support suspend/resume or wakealarms.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:33 -07:00
David Brownell
5ad31a5751 rtc: remove BKL for ioctl()
Remove implicit use of BKL in ioctl() from the RTC framework.

Instead, the rtc->ops_lock is used.  That's the same lock that already
protects the RTC operations when they're issued through the exported
rtc_*() calls in drivers/rtc/interface.c ...  making this a bugfix, not
just a cleanup, since both ioctl calls and set_alarm() need to update IRQ
enable flags and that implies a common lock (which RTC drivers as a rule
do not provide on their own).

A new comment at the declaration of "struct rtc_class_ops" summarizes
current locking rules.  It's not clear to me that the exceptions listed
there should exist ...  if not, those are pre-existing problems which can
be fixed in a patch that doesn't relate to BKL removal.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Jonathan Corbet <corbet@lwn.net>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:33 -07:00
Ian Kent
aa55ddf340 autofs4: remove unused ioctls
The ioctls AUTOFS_IOC_TOGGLEREGHOST and AUTOFS_IOC_ASKREGHOST were added
several years ago but what they were intended for has never been
implemented (as far as I'm aware noone uses them) so remove them.

Signed-off-by: Ian Kent <raven@themaw.net>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:33 -07:00
Manuel Lauss
3a93a159c6 spi: au1550_spi: proper platform device
Remove the Au1550 resource table and instead extract MMIO/IRQ/DMA
resources from platform resource information like any well-behaved
platform driver.

Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Jan Nikitenko <jan.nikitenko@gmail.com>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:30 -07:00
Grant Likely
102eb97564 spi: make spi_board_info.modalias a char array
Currently, 'modalias' in the spi_device structure is a 'const char *'.
The spi_new_device() function fills in the modalias value from a passed in
spi_board_info data block.  Since it is a pointer copy, the new spi_device
remains dependent on the spi_board_info structure after the new spi_device
is registered (no other fields in spi_device directly depend on the
spi_board_info structure; all of the other data is copied).

This causes a problem when dynamically propulating the list of attached
SPI devices.  For example, in arch/powerpc, the list of SPI devices can be
populated from data in the device tree.  With the current code, the device
tree adapter must kmalloc() a new spi_board_info structure for each new
SPI device it finds in the device tree, and there is no simple mechanism
in place for keeping track of these allocations.

This patch changes modalias from a 'const char *' to a fixed char array.
By copying the modalias string instead of referencing it, the dependency
on the spi_board_info structure is eliminated and an outside caller does
not need to maintain a separate spi_board_info allocation for each device.

If searched through the code to the best of my ability for any references
to modalias which may be affected by this change and haven't found
anything.  It has been tested with the lite5200b platform in arch/powerpc.

[dbrownell@users.sourceforge.net: cope with linux-next changes: KOBJ_NAME_LEN obliterated, etc]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:30 -07:00
Ulrich Drepper
9fe5ad9c8c flag parameters add-on: remove epoll_create size param
Remove the size parameter from the new epoll_create syscall and renames the
syscall itself.  The updated test program follows.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_epoll_create2
# ifdef __x86_64__
#  define __NR_epoll_create2 291
# elif defined __i386__
#  define __NR_epoll_create2 329
# else
#  error "need __NR_epoll_create2"
# endif
#endif

#define EPOLL_CLOEXEC O_CLOEXEC

int
main (void)
{
  int fd = syscall (__NR_epoll_create2, 0);
  if (fd == -1)
    {
      puts ("epoll_create2(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("epoll_create2(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_epoll_create2, EPOLL_CLOEXEC);
  if (fd == -1)
    {
      puts ("epoll_create2(EPOLL_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:29 -07:00
Ulrich Drepper
510df2dd48 flag parameters: NONBLOCK in inotify_init
This patch adds non-blocking support for inotify_init1.  The
additional changes needed are minimal.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_inotify_init1
# ifdef __x86_64__
#  define __NR_inotify_init1 294
# elif defined __i386__
#  define __NR_inotify_init1 332
# else
#  error "need __NR_inotify_init1"
# endif
#endif

#define IN_NONBLOCK O_NONBLOCK

int
main (void)
{
  int fd = syscall (__NR_inotify_init1, 0);
  if (fd == -1)
    {
      puts ("inotify_init1(0) failed");
      return 1;
    }
  int fl = fcntl (fd, F_GETFL);
  if (fl == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (fl & O_NONBLOCK)
    {
      puts ("inotify_init1(0) set non-blocking mode");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_inotify_init1, IN_NONBLOCK);
  if (fd == -1)
    {
      puts ("inotify_init1(IN_NONBLOCK) failed");
      return 1;
    }
  fl = fcntl (fd, F_GETFL);
  if (fl == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((fl & O_NONBLOCK) == 0)
    {
      puts ("inotify_init1(IN_NONBLOCK) set non-blocking mode");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:29 -07:00
Ulrich Drepper
be61a86d72 flag parameters: NONBLOCK in pipe
This patch adds O_NONBLOCK support to pipe2.  It is minimally more involved
than the patches for eventfd et.al but still trivial.  The interfaces of the
create_write_pipe and create_read_pipe helper functions were changed and the
one other caller as well.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_pipe2
# ifdef __x86_64__
#  define __NR_pipe2 293
# elif defined __i386__
#  define __NR_pipe2 331
# else
#  error "need __NR_pipe2"
# endif
#endif

int
main (void)
{
  int fds[2];
  if (syscall (__NR_pipe2, fds, 0) == -1)
    {
      puts ("pipe2(0) failed");
      return 1;
    }
  for (int i = 0; i < 2; ++i)
    {
      int fl = fcntl (fds[i], F_GETFL);
      if (fl == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if (fl & O_NONBLOCK)
        {
          printf ("pipe2(0) set non-blocking mode for fds[%d]\n", i);
          return 1;
        }
      close (fds[i]);
    }

  if (syscall (__NR_pipe2, fds, O_NONBLOCK) == -1)
    {
      puts ("pipe2(O_NONBLOCK) failed");
      return 1;
    }
  for (int i = 0; i < 2; ++i)
    {
      int fl = fcntl (fds[i], F_GETFL);
      if (fl == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if ((fl & O_NONBLOCK) == 0)
        {
          printf ("pipe2(O_NONBLOCK) does not set non-blocking mode for fds[%d]\n", i);
          return 1;
        }
      close (fds[i]);
    }

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:29 -07:00
Ulrich Drepper
6b1ef0e60d flag parameters: NONBLOCK in timerfd_create
This patch adds support for the TFD_NONBLOCK flag to timerfd_create.  The
additional changes needed are minimal.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_timerfd_create
# ifdef __x86_64__
#  define __NR_timerfd_create 283
# elif defined __i386__
#  define __NR_timerfd_create 322
# else
#  error "need __NR_timerfd_create"
# endif
#endif

#define TFD_NONBLOCK O_NONBLOCK

int
main (void)
{
  int fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, 0);
  if (fd == -1)
    {
      puts ("timerfd_create(0) failed");
      return 1;
    }
  int fl = fcntl (fd, F_GETFL);
  if (fl == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (fl & O_NONBLOCK)
    {
      puts ("timerfd_create(0) set non-blocking mode");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, TFD_NONBLOCK);
  if (fd == -1)
    {
      puts ("timerfd_create(TFD_NONBLOCK) failed");
      return 1;
    }
  fl = fcntl (fd, F_GETFL);
  if (fl == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((fl & O_NONBLOCK) == 0)
    {
      puts ("timerfd_create(TFD_NONBLOCK) set non-blocking mode");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:29 -07:00
Ulrich Drepper
e7d476dfdf flag parameters: NONBLOCK in eventfd
This patch adds support for the EFD_NONBLOCK flag to eventfd2.  The
additional changes needed are minimal.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_eventfd2
# ifdef __x86_64__
#  define __NR_eventfd2 290
# elif defined __i386__
#  define __NR_eventfd2 328
# else
#  error "need __NR_eventfd2"
# endif
#endif

#define EFD_NONBLOCK O_NONBLOCK

int
main (void)
{
  int fd = syscall (__NR_eventfd2, 1, 0);
  if (fd == -1)
    {
      puts ("eventfd2(0) failed");
      return 1;
    }
  int fl = fcntl (fd, F_GETFL);
  if (fl == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (fl & O_NONBLOCK)
    {
      puts ("eventfd2(0) sets non-blocking mode");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_eventfd2, 1, EFD_NONBLOCK);
  if (fd == -1)
    {
      puts ("eventfd2(EFD_NONBLOCK) failed");
      return 1;
    }
  fl = fcntl (fd, F_GETFL);
  if (fl == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((fl & O_NONBLOCK) == 0)
    {
      puts ("eventfd2(EFD_NONBLOCK) does not set non-blocking mode");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:29 -07:00
Ulrich Drepper
5fb5e04926 flag parameters: NONBLOCK in signalfd
This patch adds support for the SFD_NONBLOCK flag to signalfd4.  The
additional changes needed are minimal.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_signalfd4
# ifdef __x86_64__
#  define __NR_signalfd4 289
# elif defined __i386__
#  define __NR_signalfd4 327
# else
#  error "need __NR_signalfd4"
# endif
#endif

#define SFD_NONBLOCK O_NONBLOCK

int
main (void)
{
  sigset_t ss;
  sigemptyset (&ss);
  sigaddset (&ss, SIGUSR1);
  int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0);
  if (fd == -1)
    {
      puts ("signalfd4(0) failed");
      return 1;
    }
  int fl = fcntl (fd, F_GETFL);
  if (fl == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (fl & O_NONBLOCK)
    {
      puts ("signalfd4(0) set non-blocking mode");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_NONBLOCK);
  if (fd == -1)
    {
      puts ("signalfd4(SFD_NONBLOCK) failed");
      return 1;
    }
  fl = fcntl (fd, F_GETFL);
  if (fl == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((fl & O_NONBLOCK) == 0)
    {
      puts ("signalfd4(SFD_NONBLOCK) does not set non-blocking mode");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:29 -07:00
Ulrich Drepper
77d2720059 flag parameters: NONBLOCK in socket and socketpair
This patch introduces support for the SOCK_NONBLOCK flag in socket,
socketpair, and  paccept.  To do this the internal function sock_attach_fd
gets an additional parameter which it uses to set the appropriate flag for
the file descriptor.

Given that in modern, scalable programs almost all socket connections are
non-blocking and the minimal additional cost for the new functionality
I see no reason not to add this code.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/syscall.h>

#ifndef __NR_paccept
# ifdef __x86_64__
#  define __NR_paccept 288
# elif defined __i386__
#  define SYS_PACCEPT 18
#  define USE_SOCKETCALL 1
# else
#  error "need __NR_paccept"
# endif
#endif

#ifdef USE_SOCKETCALL
# define paccept(fd, addr, addrlen, mask, flags) \
  ({ long args[6] = { \
       (long) fd, (long) addr, (long) addrlen, (long) mask, 8, (long) flags }; \
     syscall (__NR_socketcall, SYS_PACCEPT, args); })
#else
# define paccept(fd, addr, addrlen, mask, flags) \
  syscall (__NR_paccept, fd, addr, addrlen, mask, 8, flags)
#endif

#define PORT 57392

#define SOCK_NONBLOCK O_NONBLOCK

static pthread_barrier_t b;

static void *
tf (void *arg)
{
  pthread_barrier_wait (&b);
  int s = socket (AF_INET, SOCK_STREAM, 0);
  struct sockaddr_in sin;
  sin.sin_family = AF_INET;
  sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  sin.sin_port = htons (PORT);
  connect (s, (const struct sockaddr *) &sin, sizeof (sin));
  close (s);
  pthread_barrier_wait (&b);

  pthread_barrier_wait (&b);
  s = socket (AF_INET, SOCK_STREAM, 0);
  sin.sin_port = htons (PORT);
  connect (s, (const struct sockaddr *) &sin, sizeof (sin));
  close (s);
  pthread_barrier_wait (&b);

  return NULL;
}

int
main (void)
{
  int fd;
  fd = socket (PF_INET, SOCK_STREAM, 0);
  if (fd == -1)
    {
      puts ("socket(0) failed");
      return 1;
    }
  int fl = fcntl (fd, F_GETFL);
  if (fl == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (fl & O_NONBLOCK)
    {
      puts ("socket(0) set non-blocking mode");
      return 1;
    }
  close (fd);

  fd = socket (PF_INET, SOCK_STREAM|SOCK_NONBLOCK, 0);
  if (fd == -1)
    {
      puts ("socket(SOCK_NONBLOCK) failed");
      return 1;
    }
  fl = fcntl (fd, F_GETFL);
  if (fl == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((fl & O_NONBLOCK) == 0)
    {
      puts ("socket(SOCK_NONBLOCK) does not set non-blocking mode");
      return 1;
    }
  close (fd);

  int fds[2];
  if (socketpair (PF_UNIX, SOCK_STREAM, 0, fds) == -1)
    {
      puts ("socketpair(0) failed");
      return 1;
    }
  for (int i = 0; i < 2; ++i)
    {
      fl = fcntl (fds[i], F_GETFL);
      if (fl == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if (fl & O_NONBLOCK)
        {
          printf ("socketpair(0) set non-blocking mode for fds[%d]\n", i);
          return 1;
        }
      close (fds[i]);
    }

  if (socketpair (PF_UNIX, SOCK_STREAM|SOCK_NONBLOCK, 0, fds) == -1)
    {
      puts ("socketpair(SOCK_NONBLOCK) failed");
      return 1;
    }
  for (int i = 0; i < 2; ++i)
    {
      fl = fcntl (fds[i], F_GETFL);
      if (fl == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if ((fl & O_NONBLOCK) == 0)
        {
          printf ("socketpair(SOCK_NONBLOCK) does not set non-blocking mode for fds[%d]\n", i);
          return 1;
        }
      close (fds[i]);
    }

  pthread_barrier_init (&b, NULL, 2);

  struct sockaddr_in sin;
  pthread_t th;
  if (pthread_create (&th, NULL, tf, NULL) != 0)
    {
      puts ("pthread_create failed");
      return 1;
    }

  int s = socket (AF_INET, SOCK_STREAM, 0);
  int reuse = 1;
  setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse));
  sin.sin_family = AF_INET;
  sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  sin.sin_port = htons (PORT);
  bind (s, (struct sockaddr *) &sin, sizeof (sin));
  listen (s, SOMAXCONN);

  pthread_barrier_wait (&b);

  int s2 = paccept (s, NULL, 0, NULL, 0);
  if (s2 < 0)
    {
      puts ("paccept(0) failed");
      return 1;
    }

  fl = fcntl (s2, F_GETFL);
  if (fl & O_NONBLOCK)
    {
      puts ("paccept(0) set non-blocking mode");
      return 1;
    }
  close (s2);
  close (s);

  pthread_barrier_wait (&b);

  s = socket (AF_INET, SOCK_STREAM, 0);
  sin.sin_port = htons (PORT);
  setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse));
  bind (s, (struct sockaddr *) &sin, sizeof (sin));
  listen (s, SOMAXCONN);

  pthread_barrier_wait (&b);

  s2 = paccept (s, NULL, 0, NULL, SOCK_NONBLOCK);
  if (s2 < 0)
    {
      puts ("paccept(SOCK_NONBLOCK) failed");
      return 1;
    }

  fl = fcntl (s2, F_GETFL);
  if ((fl & O_NONBLOCK) == 0)
    {
      puts ("paccept(SOCK_NONBLOCK) does not set non-blocking mode");
      return 1;
    }
  close (s2);
  close (s);

  pthread_barrier_wait (&b);
  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:29 -07:00
Ulrich Drepper
4006553b06 flag parameters: inotify_init
This patch introduces the new syscall inotify_init1 (note: the 1 stands for
the one parameter the syscall takes, as opposed to no parameter before).  The
values accepted for this parameter are function-specific and defined in the
inotify.h header.  Here the values must match the O_* flags, though.  In this
patch CLOEXEC support is introduced.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_inotify_init1
# ifdef __x86_64__
#  define __NR_inotify_init1 294
# elif defined __i386__
#  define __NR_inotify_init1 332
# else
#  error "need __NR_inotify_init1"
# endif
#endif

#define IN_CLOEXEC O_CLOEXEC

int
main (void)
{
  int fd;
  fd = syscall (__NR_inotify_init1, 0);
  if (fd == -1)
    {
      puts ("inotify_init1(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("inotify_init1(0) set close-on-exit");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_inotify_init1, IN_CLOEXEC);
  if (fd == -1)
    {
      puts ("inotify_init1(IN_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("inotify_init1(O_CLOEXEC) does not set close-on-exit");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:28 -07:00
Ulrich Drepper
ed8cae8ba0 flag parameters: pipe
This patch introduces the new syscall pipe2 which is like pipe but it also
takes an additional parameter which takes a flag value.  This patch implements
the handling of O_CLOEXEC for the flag.  I did not add support for the new
syscall for the architectures which have a special sys_pipe implementation.  I
think the maintainers of those archs have the chance to go with the unified
implementation but that's up to them.

The implementation introduces do_pipe_flags.  I did that instead of changing
all callers of do_pipe because some of the callers are written in assembler.
I would probably screw up changing the assembly code.  To avoid breaking code
do_pipe is now a small wrapper around do_pipe_flags.  Once all callers are
changed over to do_pipe_flags the old do_pipe function can be removed.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_pipe2
# ifdef __x86_64__
#  define __NR_pipe2 293
# elif defined __i386__
#  define __NR_pipe2 331
# else
#  error "need __NR_pipe2"
# endif
#endif

int
main (void)
{
  int fd[2];
  if (syscall (__NR_pipe2, fd, 0) != 0)
    {
      puts ("pipe2(0) failed");
      return 1;
    }
  for (int i = 0; i < 2; ++i)
    {
      int coe = fcntl (fd[i], F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if (coe & FD_CLOEXEC)
        {
          printf ("pipe2(0) set close-on-exit for fd[%d]\n", i);
          return 1;
        }
    }
  close (fd[0]);
  close (fd[1]);

  if (syscall (__NR_pipe2, fd, O_CLOEXEC) != 0)
    {
      puts ("pipe2(O_CLOEXEC) failed");
      return 1;
    }
  for (int i = 0; i < 2; ++i)
    {
      int coe = fcntl (fd[i], F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if ((coe & FD_CLOEXEC) == 0)
        {
          printf ("pipe2(O_CLOEXEC) does not set close-on-exit for fd[%d]\n", i);
          return 1;
        }
    }
  close (fd[0]);
  close (fd[1]);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:28 -07:00
Ulrich Drepper
336dd1f70f flag parameters: dup2
This patch adds the new dup3 syscall.  It extends the old dup2 syscall by one
parameter which is meant to hold a flag value.  Support for the O_CLOEXEC flag
is added in this patch.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_dup3
# ifdef __x86_64__
#  define __NR_dup3 292
# elif defined __i386__
#  define __NR_dup3 330
# else
#  error "need __NR_dup3"
# endif
#endif

int
main (void)
{
  int fd = syscall (__NR_dup3, 1, 4, 0);
  if (fd == -1)
    {
      puts ("dup3(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("dup3(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_dup3, 1, 4, O_CLOEXEC);
  if (fd == -1)
    {
      puts ("dup3(O_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("dup3(O_CLOEXEC) set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:28 -07:00
Ulrich Drepper
a0998b50c3 flag parameters: epoll_create
This patch adds the new epoll_create2 syscall.  It extends the old epoll_create
syscall by one parameter which is meant to hold a flag value.  In this
patch the only flag support is EPOLL_CLOEXEC which causes the close-on-exec
flag for the returned file descriptor to be set.

A new name EPOLL_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_epoll_create2
# ifdef __x86_64__
#  define __NR_epoll_create2 291
# elif defined __i386__
#  define __NR_epoll_create2 329
# else
#  error "need __NR_epoll_create2"
# endif
#endif

#define EPOLL_CLOEXEC O_CLOEXEC

int
main (void)
{
  int fd = syscall (__NR_epoll_create2, 1, 0);
  if (fd == -1)
    {
      puts ("epoll_create2(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("epoll_create2(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_epoll_create2, 1, EPOLL_CLOEXEC);
  if (fd == -1)
    {
      puts ("epoll_create2(EPOLL_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:28 -07:00
Ulrich Drepper
11fcb6c146 flag parameters: timerfd_create
The timerfd_create syscall already has a flags parameter.  It just is
unused so far.  This patch changes this by introducing the TFD_CLOEXEC
flag to set the close-on-exec flag for the returned file descriptor.

A new name TFD_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_timerfd_create
# ifdef __x86_64__
#  define __NR_timerfd_create 283
# elif defined __i386__
#  define __NR_timerfd_create 322
# else
#  error "need __NR_timerfd_create"
# endif
#endif

#define TFD_CLOEXEC O_CLOEXEC

int
main (void)
{
  int fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, 0);
  if (fd == -1)
    {
      puts ("timerfd_create(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("timerfd_create(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_timerfd_create, CLOCK_REALTIME, TFD_CLOEXEC);
  if (fd == -1)
    {
      puts ("timerfd_create(TFD_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("timerfd_create(TFD_CLOEXEC) set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:27 -07:00
Ulrich Drepper
b087498eb5 flag parameters: eventfd
This patch adds the new eventfd2 syscall.  It extends the old eventfd
syscall by one parameter which is meant to hold a flag value.  In this
patch the only flag support is EFD_CLOEXEC which causes the close-on-exec
flag for the returned file descriptor to be set.

A new name EFD_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_eventfd2
# ifdef __x86_64__
#  define __NR_eventfd2 290
# elif defined __i386__
#  define __NR_eventfd2 328
# else
#  error "need __NR_eventfd2"
# endif
#endif

#define EFD_CLOEXEC O_CLOEXEC

int
main (void)
{
  int fd = syscall (__NR_eventfd2, 1, 0);
  if (fd == -1)
    {
      puts ("eventfd2(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("eventfd2(0) sets close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_eventfd2, 1, EFD_CLOEXEC);
  if (fd == -1)
    {
      puts ("eventfd2(EFD_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("eventfd2(EFD_CLOEXEC) does not set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:27 -07:00
Ulrich Drepper
9deb27baed flag parameters: signalfd
This patch adds the new signalfd4 syscall.  It extends the old signalfd
syscall by one parameter which is meant to hold a flag value.  In this
patch the only flag support is SFD_CLOEXEC which causes the close-on-exec
flag for the returned file descriptor to be set.

A new name SFD_CLOEXEC is introduced which in this implementation must
have the same value as O_CLOEXEC.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_signalfd4
# ifdef __x86_64__
#  define __NR_signalfd4 289
# elif defined __i386__
#  define __NR_signalfd4 327
# else
#  error "need __NR_signalfd4"
# endif
#endif

#define SFD_CLOEXEC O_CLOEXEC

int
main (void)
{
  sigset_t ss;
  sigemptyset (&ss);
  sigaddset (&ss, SIGUSR1);
  int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0);
  if (fd == -1)
    {
      puts ("signalfd4(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("signalfd4(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_CLOEXEC);
  if (fd == -1)
    {
      puts ("signalfd4(SFD_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("signalfd4(SFD_CLOEXEC) does not set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:27 -07:00
Ulrich Drepper
7d9dbca342 flag parameters: anon_inode_getfd extension
This patch just extends the anon_inode_getfd interface to take an additional
parameter with a flag value.  The flag value is passed on to
get_unused_fd_flags in anticipation for a use with the O_CLOEXEC flag.

No actual semantic changes here, the changed callers all pass 0 for now.

[akpm@linux-foundation.org: KVM fix]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:27 -07:00
Ulrich Drepper
c019bbc612 flag parameters: paccept w/out set_restore_sigmask
Some platforms do not have support to restore the signal mask in the
return path from a syscall.  For those platforms syscalls like pselect are
not defined at all.  This is, I think, not a good choice for paccept()
since paccept() adds more value on top of accept() than just the signal
mask handling.

Therefore this patch defines a scaled down version of the sys_paccept
function for those platforms.  It returns -EINVAL in case the signal mask
is non-NULL but behaves the same otherwise.

Note that I explicitly included <linux/thread_info.h>.  I saw that it is
currently included but indirectly two levels down.  There is too much risk
in relying on this.  The header might change and then suddenly the
function definition would change without anyone immediately noticing.

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:27 -07:00
Ulrich Drepper
aaca0bdca5 flag parameters: paccept
This patch is by far the most complex in the series.  It adds a new syscall
paccept.  This syscall differs from accept in that it adds (at the userlevel)
two additional parameters:

- a signal mask
- a flags value

The flags parameter can be used to set flag like SOCK_CLOEXEC.  This is
imlpemented here as well.  Some people argued that this is a property which
should be inherited from the file desriptor for the server but this is against
POSIX.  Additionally, we really want the signal mask parameter as well
(similar to pselect, ppoll, etc).  So an interface change in inevitable.

The flag value is the same as for socket and socketpair.  I think diverging
here will only create confusion.  Similar to the filesystem interfaces where
the use of the O_* constants differs, it is acceptable here.

The signal mask is handled as for pselect etc.  The mask is temporarily
installed for the thread and removed before the call returns.  I modeled the
code after pselect.  If there is a problem it's likely also in pselect.

For architectures which use socketcall I maintained this interface instead of
adding a system call.  The symmetry shouldn't be broken.

The following test must be adjusted for architectures other than x86 and
x86-64 and in case the syscall numbers changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/syscall.h>

#ifndef __NR_paccept
# ifdef __x86_64__
#  define __NR_paccept 288
# elif defined __i386__
#  define SYS_PACCEPT 18
#  define USE_SOCKETCALL 1
# else
#  error "need __NR_paccept"
# endif
#endif

#ifdef USE_SOCKETCALL
# define paccept(fd, addr, addrlen, mask, flags) \
  ({ long args[6] = { \
       (long) fd, (long) addr, (long) addrlen, (long) mask, 8, (long) flags }; \
     syscall (__NR_socketcall, SYS_PACCEPT, args); })
#else
# define paccept(fd, addr, addrlen, mask, flags) \
  syscall (__NR_paccept, fd, addr, addrlen, mask, 8, flags)
#endif

#define PORT 57392

#define SOCK_CLOEXEC O_CLOEXEC

static pthread_barrier_t b;

static void *
tf (void *arg)
{
  pthread_barrier_wait (&b);
  int s = socket (AF_INET, SOCK_STREAM, 0);
  struct sockaddr_in sin;
  sin.sin_family = AF_INET;
  sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  sin.sin_port = htons (PORT);
  connect (s, (const struct sockaddr *) &sin, sizeof (sin));
  close (s);

  pthread_barrier_wait (&b);
  s = socket (AF_INET, SOCK_STREAM, 0);
  sin.sin_port = htons (PORT);
  connect (s, (const struct sockaddr *) &sin, sizeof (sin));
  close (s);
  pthread_barrier_wait (&b);

  pthread_barrier_wait (&b);
  sleep (2);
  pthread_kill ((pthread_t) arg, SIGUSR1);

  return NULL;
}

static void
handler (int s)
{
}

int
main (void)
{
  pthread_barrier_init (&b, NULL, 2);

  struct sockaddr_in sin;
  pthread_t th;
  if (pthread_create (&th, NULL, tf, (void *) pthread_self ()) != 0)
    {
      puts ("pthread_create failed");
      return 1;
    }

  int s = socket (AF_INET, SOCK_STREAM, 0);
  int reuse = 1;
  setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse));
  sin.sin_family = AF_INET;
  sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  sin.sin_port = htons (PORT);
  bind (s, (struct sockaddr *) &sin, sizeof (sin));
  listen (s, SOMAXCONN);

  pthread_barrier_wait (&b);

  int s2 = paccept (s, NULL, 0, NULL, 0);
  if (s2 < 0)
    {
      puts ("paccept(0) failed");
      return 1;
    }

  int coe = fcntl (s2, F_GETFD);
  if (coe & FD_CLOEXEC)
    {
      puts ("paccept(0) set close-on-exec-flag");
      return 1;
    }
  close (s2);

  pthread_barrier_wait (&b);

  s2 = paccept (s, NULL, 0, NULL, SOCK_CLOEXEC);
  if (s2 < 0)
    {
      puts ("paccept(SOCK_CLOEXEC) failed");
      return 1;
    }

  coe = fcntl (s2, F_GETFD);
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("paccept(SOCK_CLOEXEC) does not set close-on-exec flag");
      return 1;
    }
  close (s2);

  pthread_barrier_wait (&b);

  struct sigaction sa;
  sa.sa_handler = handler;
  sa.sa_flags = 0;
  sigemptyset (&sa.sa_mask);
  sigaction (SIGUSR1, &sa, NULL);

  sigset_t ss;
  pthread_sigmask (SIG_SETMASK, NULL, &ss);
  sigaddset (&ss, SIGUSR1);
  pthread_sigmask (SIG_SETMASK, &ss, NULL);

  sigdelset (&ss, SIGUSR1);
  alarm (4);
  pthread_barrier_wait (&b);

  errno = 0 ;
  s2 = paccept (s, NULL, 0, &ss, 0);
  if (s2 != -1 || errno != EINTR)
    {
      puts ("paccept did not fail with EINTR");
      return 1;
    }

  close (s);

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[akpm@linux-foundation.org: make it compile]
[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Roland McGrath <roland@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:27 -07:00
Ulrich Drepper
a677a039be flag parameters: socket and socketpair
This patch adds support for flag values which are ORed to the type passwd
to socket and socketpair.  The additional code is minimal.  The flag
values in this implementation can and must match the O_* flags.  This
avoids overhead in the conversion.

The internal functions sock_alloc_fd and sock_map_fd get a new parameters
and all callers are changed.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define PORT 57392

/* For Linux these must be the same.  */
#define SOCK_CLOEXEC O_CLOEXEC

int
main (void)
{
  int fd;
  fd = socket (PF_INET, SOCK_STREAM, 0);
  if (fd == -1)
    {
      puts ("socket(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("socket(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = socket (PF_INET, SOCK_STREAM|SOCK_CLOEXEC, 0);
  if (fd == -1)
    {
      puts ("socket(SOCK_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("socket(SOCK_CLOEXEC) does not set close-on-exec flag");
      return 1;
    }
  close (fd);

  int fds[2];
  if (socketpair (PF_UNIX, SOCK_STREAM, 0, fds) == -1)
    {
      puts ("socketpair(0) failed");
      return 1;
    }
  for (int i = 0; i < 2; ++i)
    {
      coe = fcntl (fds[i], F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if (coe & FD_CLOEXEC)
        {
          printf ("socketpair(0) set close-on-exec flag for fds[%d]\n", i);
          return 1;
        }
      close (fds[i]);
    }

  if (socketpair (PF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, fds) == -1)
    {
      puts ("socketpair(SOCK_CLOEXEC) failed");
      return 1;
    }
  for (int i = 0; i < 2; ++i)
    {
      coe = fcntl (fds[i], F_GETFD);
      if (coe == -1)
        {
          puts ("fcntl failed");
          return 1;
        }
      if ((coe & FD_CLOEXEC) == 0)
        {
          printf ("socketpair(SOCK_CLOEXEC) does not set close-on-exec flag for fds[%d]\n", i);
          return 1;
        }
      close (fds[i]);
    }

  puts ("OK");

  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:27 -07:00
Adrian Bunk
f606ddf42f remove the v850 port
Trying to compile the v850 port brings many compile errors, one of them exists
since at least kernel 2.6.19.

There also seems to be noone willing to bring this port back into a usable
state.

This patch therefore removes the v850 port.

If anyone ever decides to revive the v850 port the code will still be
available from older kernels, and it wouldn't be impossible for the port to
reenter the kernel if it would become actively maintained again.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:24 -07:00
WANG Cong
99764fa4ce UML: make several more things static
- Make some variables and functions static, since they don't need to be
  global.

- Remove an unused function - arch/um/kernel/time.c::sched_clock().

- Clean the style a bit as complained by checkpatch.pl.

Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:24 -07:00
WANG Cong
4a56758204 arch/um/kernel/mem.c: remove arch_validate()
- Remove arch_validate(), because no one uses it.

- Remove useless macro HAVE_ARCH_VALIDATE.

- Make the variable 'empty_bad_page' static.

Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: WANG Cong <wangcong@zeuux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:24 -07:00
Fernando Luis Vazquez Cao
d50004b086 cris: remove unused global_flush_tlb
global_flush_tlb is declared but never used.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Cc: Mikael Starvik <starvik@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:24 -07:00
Adrian Bunk
9120195721 mn10300: move sg_dma_{address,len}() to asm/scatterlist.h
mn10300 was the only architecture where sg_dma_{address,len}() were not
in asm/scatterlist.h, and it's not a big surprise that this caused a
compile error somewhere:

/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/media/video/videobuf-dma-sg.c: In function `videobuf_dma_map':
/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/media/video/videobuf-dma-sg.c:238: error: implicit declaration of function 'sg_dma_address'

Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:24 -07:00
Shaohua Li
bdfe6b7c68 pm: acpi hibernation: utilize hardware signature
ACPI defines a hardware signature.  BIOS calculates the signature according to
hardware configure and if hardware changes while hibernated, the signature
will change.  In that case, S4 resume should fail.

Still, there may be systems on which this mechanism does not work correctly,
so it is better to provide a workaround for them.  For this reason, add a new
switch to the acpi_sleep= command line argument allowing one to disable
hardware signature checking.

[shaohua.li@intel.com: build fix]
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: <Valdis.Kletnieks@vt.edu>
Cc: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:24 -07:00
Zhang Rui
c1a220e7ac pm: introduce new interfaces schedule_work_on() and queue_work_on()
This interface allows adding a job on a specific cpu.

Although a work struct on a cpu will be scheduled to other cpu if the cpu
dies, there is a recursion if a work task tries to offline the cpu it's
running on.  we need to schedule the task to a specific cpu in this case.
http://bugzilla.kernel.org/show_bug.cgi?id=10897

[oleg@tv-sign.ru: cleanups]
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Rus <harbour@sfinx.od.ua>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:23 -07:00
Alan Stern
8111d1b552 pm: add new PM_EVENT codes for runtime power transitions
This patch (as1112) adds some new PM_EVENT_* codes for use by kernel
subsystems.  They describe runtime power-state transitions of the sort already
implemented by the USB subsystem.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:23 -07:00
Rafael J. Wysocki
8c363265d5 pm: drop unnecessary includes from pm.h
Drop unnecessary includes from include/linux/pm.h .

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:23 -07:00
Rafael J. Wysocki
e7ecb331e1 pm: remove remaining obsolete definitions from pm.h
Remove the remaining obsolete definitions from include/linux/pm.h and move
the definitions of PM_SUSPEND and PM_RESUME to the header of h3600 which
is the only user of them.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:22 -07:00
Rafael J. Wysocki
558481f038 pm: remove definition of struct pm_dev
Remove the definition of 'struct pm_dev', which is not used any more,
along with some related stuff from include/linux/pm.h .

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:22 -07:00
Adrian Bunk
d75f65fd24 remove include/linux/pm_legacy.h
Remove the obsolete and no longer used include/linux/pm_legacy.h

Reviewed-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: Pavel Machek <pavel@suse.cz>
Acked-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:22 -07:00
Adrian Bunk
e53f12cc6c remove include/asm-h8300/keyboard.h
This patch removes the unused include/asm-h8300/keyboard.h

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:22 -07:00
Hugh Dickins
9b3e43a747 security: remove unused forwards
Why would linux/security.h need forward declarations for nfsctl_arg and
swap_info_struct?  It's hard to imagine: remove them.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:22 -07:00
Andrew G. Morgan
5459c164f0 security: protect legacy applications from executing with insufficient privilege
When cap_bset suppresses some of the forced (fP) capabilities of a file,
it is generally only safe to execute the program if it understands how to
recognize it doesn't have enough privilege to work correctly.  For legacy
applications (fE!=0), which have no non-destructive way to determine that
they are missing privilege, we fail to execute (EPERM) any executable that
requires fP capabilities, but would otherwise get pP' < fP.  This is a
fail-safe permission check.

For some discussion of why it is problematic for (legacy) privileged
applications to run with less than the set of capabilities requested for
them, see:

 http://userweb.kernel.org/~morgan/sendmail-capabilities-war-story.html

With this iteration of this support, we do not include setuid-0 based
privilege protection from the bounding set.  That is, the admin can still
(ab)use the bounding set to suppress the privileges of a setuid-0 program.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:22 -07:00
Gerald Schaefer
83d1674a94 mm: make CONFIG_MIGRATION available w/o CONFIG_NUMA
We'd like to support CONFIG_MEMORY_HOTREMOVE on s390, which depends on
CONFIG_MIGRATION.  So far, CONFIG_MIGRATION is only available with NUMA
support.

This patch makes CONFIG_MIGRATION selectable for architectures that define
ARCH_ENABLE_MEMORY_HOTREMOVE.  When MIGRATION is enabled w/o NUMA, the
kernel won't compile because migrate_vmas() does not know about
vm_ops->migrate() and vma_migratable() does not know about policy_zone.
To fix this, those two functions can be restricted to '#ifdef CONFIG_NUMA'
because they are not being used w/o NUMA.  vma_migratable() is moved over
from migrate.h to mempolicy.h.

[kosaki.motohiro@jp.fujitsu.com: build fix]
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: KOSAKI Motorhiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:21 -07:00
Milton Miller
9ca908f47b kcalloc: remove runtime division
While in all cases in the kernel we know the size of the elements to be
created, we don't always know the count of elements.  By commuting the size
and count in the overflow check, the compiler can reduce the runtime division
of size_t with a compare to a (unique) constant in these cases.

Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:21 -07:00
Badari Pulavarty
5c755e9fd8 memory-hotplug: add sysfs removable attribute for hotplug memory remove
Memory may be hot-removed on a per-memory-block basis, particularly on
POWER where the SPARSEMEM section size often matches the memory-block
size.  A user-level agent must be able to identify which sections of
memory are likely to be removable before attempting the potentially
expensive operation.  This patch adds a file called "removable" to the
memory directory in sysfs to help such an agent.  In this patch, a memory
block is considered removable if;

o It contains only MOVABLE pageblocks
o It contains only pageblocks with free pages regardless of pageblock type

On the other hand, a memory block starting with a PageReserved() page will
never be considered removable.  Without this patch, the user-agent is
forced to choose a memory block to remove randomly.

Sample output of the sysfs files:

./memory/memory0/removable: 0
./memory/memory1/removable: 0
./memory/memory2/removable: 0
./memory/memory3/removable: 0
./memory/memory4/removable: 0
./memory/memory5/removable: 0
./memory/memory6/removable: 0
./memory/memory7/removable: 1
./memory/memory8/removable: 0
./memory/memory9/removable: 0
./memory/memory10/removable: 0
./memory/memory11/removable: 0
./memory/memory12/removable: 0
./memory/memory13/removable: 0
./memory/memory14/removable: 0
./memory/memory15/removable: 0
./memory/memory16/removable: 0
./memory/memory17/removable: 1
./memory/memory18/removable: 1
./memory/memory19/removable: 1
./memory/memory20/removable: 1
./memory/memory21/removable: 1
./memory/memory22/removable: 1

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:21 -07:00
Yasunori Goto
af370fb8cb memory hotplug: small fixes to bootmem freeing for memory hotremove
- Change some naming
  * Magic -> types
  * MIX_INFO -> MIX_SECTION_INFO
  * Change definition of bootmem type from direct hex value

- __free_pages_bootmem() becomes __meminit.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:21 -07:00
Andrea Righi
27ac792ca0 PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architectures
On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit
boundary. For example:

	u64 val = PAGE_ALIGN(size);

always returns a value < 4GB even if size is greater than 4GB.

The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for
example):

#define PAGE_SHIFT      12
#define PAGE_SIZE       (_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK       (~(PAGE_SIZE-1))
...
#define PAGE_ALIGN(addr)       (((addr)+PAGE_SIZE-1)&PAGE_MASK)

The "~" is performed on a 32-bit value, so everything in "and" with
PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary.
Using the ALIGN() macro seems to be the right way, because it uses
typeof(addr) for the mask.

Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in
include/linux/mm.h.

See also lkml discussion: http://lkml.org/lkml/2008/6/11/237

[akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c]
[akpm@linux-foundation.org: fix v850]
[akpm@linux-foundation.org: fix powerpc]
[akpm@linux-foundation.org: fix arm]
[akpm@linux-foundation.org: fix mips]
[akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c]
[akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c]
[akpm@linux-foundation.org: fix powerpc]
Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:21 -07:00
Timur Tabi
2be0ffe2b2 mm: add alloc_pages_exact() and free_pages_exact()
alloc_pages_exact() is similar to alloc_pages(), except that it allocates
the minimum number of pages to fulfill the request.  This is useful if you
want to allocate a very large buffer that is slightly larger than an even
power-of-two number of pages.  In that case, alloc_pages() will waste a
lot of memory.

I have a video driver that wants to allocate a 5MB buffer.  alloc_pages()
wiill waste 3MB of physically-contiguous memory.

Signed-off-by: Timur Tabi <timur@freescale.com>
Cc: Andi Kleen <andi@firstfloor.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:20 -07:00
Johannes Weiner
3560e249ab bootmem: replace node_boot_start in struct bootmem_data
Almost all users of this field need a PFN instead of a physical address,
so replace node_boot_start with node_min_pfn.

[Lee.Schermerhorn@hp.com: fix spurious BUG_ON() in mark_bootmem()]
Signed-off-by: Johannes Weiner <hannes@saeureba.de>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:20 -07:00
Johannes Weiner
5f2809e69c bootmem: clean up alloc_bootmem_core
alloc_bootmem_core has become quite nasty to read over time.  This is a
clean rewrite that keeps the semantics.

bdata->last_pos has been dropped.

bdata->last_success has been renamed to hint_idx and it is now an index
relative to the node's range.  Since further block searching might start
at this index, it is now set to the end of a succeeded allocation rather
than its beginning.

bdata->last_offset has been renamed to last_end_off to be more clear that
it represents the ending address of the last allocation relative to the
node.

[y-goto@jp.fujitsu.com: fix new alloc_bootmem_core()]
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:20 -07:00
Johannes Weiner
223e8dc924 bootmem: reorder code to match new bootmem structure
This only reorders functions so that further patches will be easier to
read.  No code changed.

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:19 -07:00
Jon Tollefson
0d9ea75443 powerpc: support multiple hugepage sizes
Instead of using the variable mmu_huge_psize to keep track of the huge
page size we use an array of MMU_PAGE_* values.  For each supported huge
page size we need to know the hugepte_shift value and have a
pgtable_cache.  The hstate or an mmu_huge_psizes index is passed to
functions so that they know which huge page size they should use.

The hugepage sizes 16M and 64K are setup(if available on the hardware) so
that they don't have to be set on the boot cmd line in order to use them.
The number of 16G pages have to be specified at boot-time though (e.g.
hugepagesz=16G hugepages=5).

Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:19 -07:00
Jon Tollefson
91224346aa powerpc: define support for 16G hugepages
The huge page size is defined for 16G pages.  If a hugepagesz of 16G is
specified at boot-time then it becomes the huge page size instead of the
default 16M.

The change in pgtable-64K.h is to the macro pte_iterate_hashed_subpages to
make the increment to va (the 1 being shifted) be a long so that it is not
shifted to 0.  Otherwise it would create an infinite loop when the shift
value is for a 16G page (when base page size is 64K).

Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:19 -07:00
Jon Tollefson
658013e93e powerpc: scan device tree for gigantic pages
The 16G huge pages have to be reserved in the HMC prior to boot.  The
location of the pages are placed in the device tree.  This patch adds code
to scan the device tree during very early boot and save these page
locations until hugetlbfs is ready for them.

Acked-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:19 -07:00
Jon Tollefson
53ba51d21d hugetlb: allow arch overridden hugepage allocation
Allow alloc_bootmem_huge_page() to be overridden by architectures that
can't always use bootmem.  This requires huge_boot_pages to be available
for use by this function.

This is required for powerpc 16G pages, which have to be reserved prior to
boot-time.  The location of these pages are indicated in the device tree.

Acked-by: Adam Litke <agl@us.ibm.com>
Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:19 -07:00
Andi Kleen
b4718e628d x86: add hugepagesz option on 64-bit
Add an hugepagesz=...  option similar to IA64, PPC etc.  to x86-64.

This finally allows to select GB pages for hugetlbfs in x86 now that all
the infrastructure is in place.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:19 -07:00
Andi Kleen
ceb8687961 hugetlb: introduce pud_huge
Straight forward extensions for huge pages located in the PUD instead of
PMDs.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:18 -07:00
Andi Kleen
b54bbf7b81 mm: introduce non panic alloc_bootmem
Straight forward variant of the existing __alloc_bootmem_node, only
subsequent patch when allocating giant hugepages at boot -- don't want to
panic if we can't allocate as many as the user asked for.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:17 -07:00
Nishanth Aravamudan
a343787016 hugetlb: new sysfs interface
Provide new hugepages user APIs that are more suited to multiple hstates
in sysfs.  There is a new directory, /sys/kernel/hugepages.  Underneath
that directory there will be a directory per-supported hugepage size,
e.g.:

/sys/kernel/hugepages/hugepages-64kB
/sys/kernel/hugepages/hugepages-16384kB
/sys/kernel/hugepages/hugepages-16777216kB

corresponding to 64k, 16m and 16g respectively.  Within each
hugepages-size directory there are a number of files, corresponding to the
tracked counters in the hstate, e.g.:

/sys/kernel/hugepages/hugepages-64/nr_hugepages
/sys/kernel/hugepages/hugepages-64/nr_overcommit_hugepages
/sys/kernel/hugepages/hugepages-64/free_hugepages
/sys/kernel/hugepages/hugepages-64/resv_hugepages
/sys/kernel/hugepages/hugepages-64/surplus_hugepages

Of these files, the first two are read-write and the latter three are
read-only.  The size of the hugepage being manipulated is trivially
deducible from the enclosing directory and is always expressed in kB (to
match meminfo).

[dave@linux.vnet.ibm.com: fix build]
[nacc@us.ibm.com: hugetlb: hang off of /sys/kernel/mm rather than /sys/kernel]
[nacc@us.ibm.com: hugetlb: remove CONFIG_SYSFS dependency]
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:17 -07:00
Andi Kleen
a137e1cc6d hugetlbfs: per mount huge page sizes
Add the ability to configure the hugetlb hstate used on a per mount basis.

- Add a new pagesize= option to the hugetlbfs mount that allows setting
  the page size
- This option causes the mount code to find the hstate corresponding to the
  specified size, and sets up a pointer to the hstate in the mount's
  superblock.
- Change the hstate accessors to use this information rather than the
  global_hstate they were using (requires a slight change in mm/memory.c
  so we don't NULL deref in the error-unmap path -- see comments).

[np: take hstate out of hugetlbfs inode and vma->vm_private_data]

Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:17 -07:00
Andi Kleen
e5ff215941 hugetlb: multiple hstates for multiple page sizes
Add basic support for more than one hstate in hugetlbfs.  This is the key
to supporting multiple hugetlbfs page sizes at once.

- Rather than a single hstate, we now have an array, with an iterator
- default_hstate continues to be the struct hstate which we use by default
- Add functions for architectures to register new hstates

[akpm@linux-foundation.org: coding-style fixes]
Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:17 -07:00
Andi Kleen
a551643895 hugetlb: modular state for hugetlb page size
The goal of this patchset is to support multiple hugetlb page sizes.  This
is achieved by introducing a new struct hstate structure, which
encapsulates the important hugetlb state and constants (eg.  huge page
size, number of huge pages currently allocated, etc).

The hstate structure is then passed around the code which requires these
fields, they will do the right thing regardless of the exact hstate they
are operating on.

This patch adds the hstate structure, with a single global instance of it
(default_hstate), and does the basic work of converting hugetlb to use the
hstate.

Future patches will add more hstate structures to allow for different
hugetlbfs mounts to have different page sizes.

[akpm@linux-foundation.org: coding-style fixes]
Acked-by: Adam Litke <agl@us.ibm.com>
Acked-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:17 -07:00
Nishanth Aravamudan
ff7ea79cf7 mm: create /sys/kernel/mm
Add a kobject to create /sys/kernel/mm when sysfs is mounted.  The kobject
will exist regardless.  This will allow for the hugepage related sysfs
directories to exist under the mm "subsystem" directory.  Add an ABI file
appropriately.

[kosaki.motohiro@jp.fujitsu.com: fix build]
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:17 -07:00
Andy Whitcroft
cdfd4325c0 mm: record MAP_NORESERVE status on vmas and fix small page mprotect reservations
With Mel's hugetlb private reservation support patches applied, strict
overcommit semantics are applied to both shared and private huge page
mappings.  This can be a problem if an application relied on unlimited
overcommit semantics for private mappings.  An example of this would be an
application which maps a huge area with the intention of using it very
sparsely.  These application would benefit from being able to opt-out of
the strict overcommit.  It should be noted that prior to hugetlb
supporting demand faulting all mappings were fully populated and so
applications of this type should be rare.

This patch stack implements the MAP_NORESERVE mmap() flag for huge page
mappings.  This flag has the same meaning as for small page mappings,
suppressing reservations for that mapping.

Thanks to Mel Gorman for reviewing a number of early versions of these
patches.

This patch:

When a small page mapping is created with mmap() reservations are created
by default for any memory pages required.  When the region is read/write
the reservation is increased for every page, no reservation is needed for
read-only regions (as they implicitly share the zero page).  Reservations
are tracked via the VM_ACCOUNT vma flag which is present when the region
has reservation backing it.  When we convert a region from read-only to
read-write new reservations are aquired and VM_ACCOUNT is set.  However,
when a read-only map is created with MAP_NORESERVE it is indistinguishable
from a normal mapping.  When we then convert that to read/write we are
forced to incorrectly create reservations for it as we have no record of
the original MAP_NORESERVE.

This patch introduces a new vma flag VM_NORESERVE which records the
presence of the original MAP_NORESERVE flag.  This allows us to
distinguish these two circumstances and correctly account the reserve.

As well as fixing this FIXME in the code, this makes it much easier to
introduce MAP_NORESERVE support for huge pages as this flag is available
consistantly for the life of the mapping.  VM_ACCOUNT on the other hand is
heavily used at the generic level in association with small pages.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Adam Litke <agl@us.ibm.com>
Cc: Johannes Weiner <hannes@saeurebad.de>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:16 -07:00
Mel Gorman
04f2cbe356 hugetlb: guarantee that COW faults for a process that called mmap(MAP_PRIVATE) on hugetlbfs will succeed
After patch 2 in this series, a process that successfully calls mmap() for
a MAP_PRIVATE mapping will be guaranteed to successfully fault until a
process calls fork().  At that point, the next write fault from the parent
could fail due to COW if the child still has a reference.

We only reserve pages for the parent but a copy must be made to avoid
leaking data from the parent to the child after fork().  Reserves could be
taken for both parent and child at fork time to guarantee faults but if
the mapping is large it is highly likely we will not have sufficient pages
for the reservation, and it is common to fork only to exec() immediatly
after.  A failure here would be very undesirable.

Note that the current behaviour of mainline with MAP_PRIVATE pages is
pretty bad.  The following situation is allowed to occur today.

1. Process calls mmap(MAP_PRIVATE)
2. Process calls mlock() to fault all pages and makes sure it succeeds
3. Process forks()
4. Process writes to MAP_PRIVATE mapping while child still exists
5. If the COW fails at this point, the process gets SIGKILLed even though it
   had taken care to ensure the pages existed

This patch improves the situation by guaranteeing the reliability of the
process that successfully calls mmap().  When the parent performs COW, it
will try to satisfy the allocation without using reserves.  If that fails
the parent will steal the page leaving any children without a page.
Faults from the child after that point will result in failure.  If the
child COW happens first, an attempt will be made to allocate the page
without reserves and the child will get SIGKILLed on failure.

To summarise the new behaviour:

1. If the original mapper performs COW on a private mapping with multiple
   references, it will attempt to allocate a hugepage from the pool or
   the buddy allocator without using the existing reserves. On fail, VMAs
   mapping the same area are traversed and the page being COW'd is unmapped
   where found. It will then steal the original page as the last mapper in
   the normal way.

2. The VMAs the pages were unmapped from are flagged to note that pages
   with data no longer exist. Future no-page faults on those VMAs will
   terminate the process as otherwise it would appear that data was corrupted.
   A warning is printed to the console that this situation occured.

2. If the child performs COW first, it will attempt to satisfy the COW
   from the pool if there are enough pages or via the buddy allocator if
   overcommit is allowed and the buddy allocator can satisfy the request. If
   it fails, the child will be killed.

If the pool is large enough, existing applications will not notice that
the reserves were a factor.  Existing applications depending on the
no-reserves been set are unlikely to exist as for much of the history of
hugetlbfs, pages were prefaulted at mmap(), allocating the pages at that
point or failing the mmap().

[npiggin@suse.de: fix CONFIG_HUGETLB=n build]
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:16 -07:00
Mel Gorman
a1e78772d7 hugetlb: reserve huge pages for reliable MAP_PRIVATE hugetlbfs mappings until fork()
This patch reserves huge pages at mmap() time for MAP_PRIVATE mappings in
a similar manner to the reservations taken for MAP_SHARED mappings.  The
reserve count is accounted both globally and on a per-VMA basis for
private mappings.  This guarantees that a process that successfully calls
mmap() will successfully fault all pages in the future unless fork() is
called.

The characteristics of private mappings of hugetlbfs files behaviour after
this patch are;

1. The process calling mmap() is guaranteed to succeed all future faults until
   it forks().
2. On fork(), the parent may die due to SIGKILL on writes to the private
   mapping if enough pages are not available for the COW. For reasonably
   reliable behaviour in the face of a small huge page pool, children of
   hugepage-aware processes should not reference the mappings; such as
   might occur when fork()ing to exec().
3. On fork(), the child VMAs inherit no reserves. Reads on pages already
   faulted by the parent will succeed. Successful writes will depend on enough
   huge pages being free in the pool.
4. Quotas of the hugetlbfs mount are checked at reserve time for the mapper
   and at fault time otherwise.

Before this patch, all reads or writes in the child potentially needs page
allocations that can later lead to the death of the parent.  This applies
to reads and writes of uninstantiated pages as well as COW.  After the
patch it is only a write to an instantiated page that causes problems.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Adam Litke <agl@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:16 -07:00
Johannes Weiner
9109fb7b35 mm: drop unneeded pgdat argument from free_area_init_node()
free_area_init_node() gets passed in the node id as well as the node
descriptor.  This is redundant as the function can trivially get the node
descriptor itself by means of NODE_DATA() and the node's id.

I checked all the users and NODE_DATA() seems to be usable everywhere
from where this function is called.

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:16 -07:00
Andrew Morton
2185e69f68 mapping_set_error: add unlikely()
This is called on a per-page basis and in the vast majority of cases
`error' is zero.

Cc: Guillaume Chazarain <guichaz@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:15 -07:00
Andy Whitcroft
9023cb7e85 slob: record page flag overlays explicitly
SLOB reuses two page bits for internal purposes, it overlays PG_active and
PG_private.  This is hidden away in slob.c.  Document these overlays
explicitly in the main page-flags enum along with all the others.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:15 -07:00
Andy Whitcroft
8a38082d21 slub: record page flag overlays explicitly
SLUB reuses two page bits for internal purposes, it overlays PG_active and
PG_error.  This is hidden away in slub.c.  Document these overlays
explicitly in the main page-flags enum along with all the others.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:15 -07:00
Andy Whitcroft
0cad47cf13 page-flags: record page flag overlays explicitly
With the recent page flag reorganisation we have a single enum which
defines the valid page flags and their values, nice and clear.  However
there are a number of bits which are overloaded by different subsystems.
Firstly there is PG_owner_priv_1 which is used by filesystems and by XEN.
Secondly both SLOB and SLUB use a couple of extra page bits to manage
internal state for pages they own; both overlay other bits.  All of these
"aliases" are scattered about the source making it very hard for a reader
to know if the bits are safe to rely on in all contexts; confusion here is
bad.

As we now have a single place where the bits are clearly assigned it makes
sense to clarify the reuse of bits by making the aliases explicit and
visible with the original bit assignments.  This patch creates explicit
aliases within the enum itself for the overloaded bits, creates standard
bit accessors PageFoo etc.  and uses those throughout.

This version pulls the bit manipulation out to standard named page bit
accessors as suggested by Christoph, it retains the explicit mapping to
the overlayed bits.  A fusion of both ideas.  This has been SLUB and SLOB
have been compile tested on x86_64 only, and SLUB boot tested.  If people
feel this is worth doing then I can run a fuller set of testing.

This patch:

Some page flags are used for more than one purpose, for example
PG_owner_priv_1.  Currently there are individual accessors for each user,
each built using the common flag name far away from the bit definitions.
This makes it hard to see all possible uses of these bits.

Now that we have a single enum to generate the bit orders it makes sense
to express overlays in the same place.  So create per use aliases for this
bit in the main page-flags enum and use those in the accessors.

[akpm@linux-foundation.org: fix xen]
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:15 -07:00
Kentaro Makita
da3bbdd463 fix soft lock up at NFS mount via per-SB LRU-list of unused dentries
[Summary]

 Split LRU-list of unused dentries to one per superblock to avoid soft
 lock up during NFS mounts and remounting of any filesystem.

 Previously I posted here:
 http://lkml.org/lkml/2008/3/5/590

[Descriptions]

- background

  dentry_unused is a list of dentries which are not referenced.
  dentry_unused grows up when references on directories or files are
  released.  This list can be very long if there is huge free memory.

- the problem

  When shrink_dcache_sb() is called, it scans all dentry_unused linearly
  under spin_lock(), and if dentry->d_sb is differnt from given
  superblock, scan next dentry.  This scan costs very much if there are
  many entries, and very ineffective if there are many superblocks.

  IOW, When we need to shrink unused dentries on one dentry, but scans
  unused dentries on all superblocks in the system.  For example, we scan
  500 dentries to unmount a filesystem, but scans 1,000,000 or more unused
  dentries on other superblocks.

  In our case , At mounting NFS*, shrink_dcache_sb() is called to shrink
  unused dentries on NFS, but scans 100,000,000 unused dentries on
  superblocks in the system such as local ext3 filesystems.  I hear NFS
  mounting took 1 min on some system in use.

* : NFS uses virtual filesystem in rpc layer, so NFS is affected by
  this problem.

  100,000,000 is possible number on large systems.

  Per-superblock LRU of unused dentried can reduce the cost in
  reasonable manner.

- How to fix

  I found this problem is solved by David Chinner's "Per-superblock
  unused dentry LRU lists V3"(1), so I rebase it and add some fix to
  reclaim with fairness, which is in Andrew Morton's comments(2).

  1) http://lkml.org/lkml/2006/5/25/318
  2) http://lkml.org/lkml/2006/5/25/320

  Split LRU-list of unused dentries to each superblocks.  Then, NFS
  mounting will check dentries under a superblock instead of all.  But
  this spliting will break LRU of dentry-unused.  So, I've attempted to
  make reclaim unused dentrins with fairness by calculate number of
  dentries to scan on this sb based on following way

  number of dentries to scan on this sb =
  count * (number of dentries on this sb / number of dentries in the machine)

- ToDo
 - I have to measuring performance number and do stress tests.

 - When unmount occurs during prune_dcache(), scanning on same
  superblock, It is unable to reach next superblock because it is gone
  away.  We restart scannig superblock from first one, it causes
  unfairness of reclaim unused dentries on first superblock.  But I think
  this happens very rarely.

- Test Results

  Result on 6GB boxes with excessive unused dentries.

Without patch:

$ cat /proc/sys/fs/dentry-state
10181835        10180203        45      0       0       0
# mount -t nfs 10.124.60.70:/work/kernel-src nfs
real    0m1.830s
user    0m0.001s
sys     0m1.653s

 With this patch:
$ cat /proc/sys/fs/dentry-state
10236610        10234751        45      0       0       0
# mount -t nfs 10.124.60.70:/work/kernel-src nfs
real    0m0.106s
user    0m0.002s
sys     0m0.032s

[akpm@linux-foundation.org: fix comments]
Signed-off-by: Kentaro Makita <k-makita@np.css.fujitsu.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: David Chinner <dgc@sgi.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:15 -07:00
Jan Beulich
42b7772812 mm: remove double indirection on tlb parameter to free_pgd_range() & Co
The double indirection here is not needed anywhere and hence (at least)
confusing.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:15 -07:00
Benjamin Herrenschmidt
a1f242ff46 powerpc ioremap_prot
This adds ioremap_prot and pte_pgprot() so that one can extract protection
bits from a PTE and use them to ioremap_prot() (in order to support ptrace
of VM_IO | VM_PFNMAP as per Rik's patch).

This moves a couple of flag checks around in the ioremap implementations
of arch/powerpc.  There's a side effect of allowing non-cacheable and
non-guarded mappings on ppc32 which before would always have _PAGE_GUARDED
set whenever _PAGE_NO_CACHE is.

(standard ioremap will still set _PAGE_GUARDED, but ioremap_prot will be
capable of setting such a non guarded mapping).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:15 -07:00
Rik van Riel
28b2ee20c7 access_process_vm device memory infrastructure
In order to be able to debug things like the X server and programs using
the PPC Cell SPUs, the debugger needs to be able to access device memory
through ptrace and /proc/pid/mem.

This patch:

Add the generic_access_phys access function and put the hooks in place
to allow access_process_vm to access device or PPC Cell SPU memory.

[riel@redhat.com: Add documentation for the vm_ops->access function]
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Benjamin Herrensmidt <benh@kernel.crashing.org>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:15 -07:00
Nick Piggin
0d71d10a42 mm: remove nopfn
There are no users of nopfn in the tree. Remove it.

[hugh@veritas.com: fix build error]
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:15 -07:00
Adrian Bunk
c748e1340e mm/vmstat.c: proper externs
This patch adds proper extern declarations for five variables in
include/linux/vmstat.h

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:14 -07:00
KOSAKI Motohiro
e4048e5dc4 page allocator: inline some __alloc_pages() wrappers
Two zonelist patch series rewrote __page_alloc() largely.  Now, it is just
a wrapper function.  Inlining them will save a function call.

[akpm@linux-foundation.org: export __alloc_pages_internal]
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:14 -07:00
Johannes Weiner
ffc6421f07 mm: unexport __alloc_bootmem_core()
This function has no external callers, so unexport it.  Also fix its naming
inconsistency.

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:14 -07:00
Johannes Weiner
b61bfa3c46 mm: move bootmem descriptors definition to a single place
There are a lot of places that define either a single bootmem descriptor or an
array of them.  Use only one central array with MAX_NUMNODES items instead.

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kyle McMartin <kyle@parisc-linux.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:14 -07:00
FUJITA Tomonori
8b05c7e6e1 add a helper function to test if an object is on the stack
lib/debugobjects.c has a function to test if an object is on the stack.
The block layer and ide needs it (they need to avoid DMA from/to stack
buffers).  This patch moves the function to include/linux/sched.h so that
everyone can use it.

lib/debugobjects.c uses current->stack but this patch uses a
task_stack_page() accessor, which is a preferable way to access the stack.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:14 -07:00
Akinobu Mita
e108526e77 move memory_read_from_buffer() from fs.h to string.h
James Bottomley warns that inclusion of linux/fs.h in a low level
driver was always a danger signal.  This patch moves
memory_read_from_buffer() from fs.h to string.h and fixes includes in
existing memory_read_from_buffer() users.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Bob Moore <robert.moore@intel.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-24 10:47:13 -07:00
Mauro Carvalho Chehab
698158c799 Merge ../linux-2.6 2008-07-24 14:05:50 -03:00
Roland Dreier
2cc177364e Merge branches 'bkl-removal', 'cma', 'ehca', 'for-2.6.27', 'mlx4', 'mthca' and 'nes' into for-linus 2008-07-24 08:38:47 -07:00
Matthew Wilcox
b552068999 Remove __DECLARE_SEMAPHORE_GENERIC
There are no users of __DECLARE_SEMAPHORE_GENERIC in the kernel

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-07-24 08:31:21 -04:00
Matthew Wilcox
2351ec533e Remove asm/semaphore.h
All users have now been converted to linux/semaphore.h and we don't need
to keep these files around any longer.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2008-07-24 08:31:12 -04:00
Hans-Christian Egtvedt
218df4a25a avr32: Add platform data for AC97C platform device
This patch adds platform data to the AC97C platform device. This will
let the board add a GPIO line which is connected to the external codecs
reset line.

The platform data, ac97c_platform_data, must also contain the DMA
controller ID, RX channel ID and TX channel ID.

Tested with Wolfson WM9712 and AP7000.

Signed-off-by: Hans-Christian Egtvedt <hcegtvedt@atmel.com>
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
2008-07-24 13:51:46 +02:00
Vegard Nossum
04bbe430f7 x86: fix header export, asm-x86/processor-flags.h, CONFIG_* leaks
Apparently,

commit 6330a30a76
Author: Vegard Nossum <vegard.nossum@gmail.com>
Date:   Wed May 28 09:46:19 2008 +0200

    x86: break mutual header inclusion

introduced some CONFIG names to processor-flags.h, which was exported in

commit 6093015db2
Author: Ingo Molnar <mingo@elte.hu>
Date:   Sun Mar 30 11:45:23 2008 +0200

    x86: cleanup replace most vm86 flags with flags from processor-flags.h, fix

Fix it by wrapping the CONFIG parts in __KERNEL__.

Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-24 12:49:53 +02:00
Maciej W. Rozycki
0791e13fbb x86: fix up a comment in ack_APIC_irq()
Adjust a comment in ack_APIC_irq() according to the recent removal of
CONFIG_X86_GOOD_APIC.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-24 12:39:23 +02:00
Artem Bityutskiy
9c9ec14770 UBI: fix checkpatch.pl errors and warnings
Just out or curiousity ran checkpatch.pl for whole UBI,
and discovered there are quite a few of stylistic issues.
Fix them.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-07-24 13:36:09 +03:00
Artem Bityutskiy
f40ac9cdf6 UBI: implement multiple volumes rename
Quite useful ioctl which allows to make atomic system upgrades.
The idea belongs to Richard Titmuss <richard_titmuss@logitech.com>

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-07-24 13:34:46 +03:00
Artem Bityutskiy
85c6e6e282 UBI: amend commentaries
Hch asked not to use "unit" for sub-systems, let it be so.
Also some other commentaries modifications.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-07-24 13:32:56 +03:00
Artem Bityutskiy
a5bf619041 UBI: add ubi_sync() interface
To flush MTD device caches.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
2008-07-24 13:32:56 +03:00
Linus Torvalds
7f9dce3837 Merge branch 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: hrtick_enabled() should use cpu_active()
  sched, x86: clean up hrtick implementation
  sched: fix build error, provide partition_sched_domains() unconditionally
  sched: fix warning in inc_rt_tasks() to not declare variable 'rq' if it's not needed
  cpu hotplug: Make cpu_active_map synchronization dependency clear
  cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)
  sched: rework of "prioritize non-migratable tasks over migratable ones"
  sched: reduce stack size in isolated_cpu_setup()
  Revert parts of "ftrace: do not trace scheduler functions"

Fixed up conflicts in include/asm-x86/thread_info.h (due to the
TIF_SINGLESTEP unification vs TIF_HRTICK_RESCHED removal) and
kernel/sched_fair.c (due to cpu_active_map vs for_each_cpu_mask_nr()
introduction).
2008-07-23 19:36:53 -07:00
Linus Torvalds
26dcce0fab Merge branch 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
  NR_CPUS: Replace NR_CPUS in speedstep-centrino.c
  cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP
  NR_CPUS: Replace NR_CPUS in cpufreq userspace routines
  NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var
  NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genapic_flat_64.c
  NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genx2apic_uv_x.c
  NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/proc.c
  NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/mcheck/mce_64.c
  cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c, fix
  cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target
  cpumask: Provide a generic set of CPUMASK_ALLOC macros
  cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c
  cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c
  cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c
  cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c
  cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c
  cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
  Revert "cpumask: introduce new APIs"
  cpumask: make for_each_cpu_mask a bit smaller
  net: Pass reference to cpumask variable in net/sunrpc/svc.c
  ...

Fix up trivial conflicts in drivers/cpufreq/cpufreq.c manually
2008-07-23 18:37:44 -07:00
Linus Torvalds
d7b6de14a0 Merge branch 'core/softlockup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core/softlockup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  softlockup: fix invalid proc_handler for softlockup_panic
  softlockup: fix watchdog task wakeup frequency
  softlockup: fix watchdog task wakeup frequency
  softlockup: show irqtrace
  softlockup: print a module list on being stuck
  softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression
  softlockup: fix false positives on nohz if CPU is 100% idle for more than 60 seconds
  softlockup: fix softlockup_thresh fix
  softlockup: fix softlockup_thresh unaligned access and disable detection at runtime
  softlockup: allow panic on lockup
2008-07-23 18:34:13 -07:00
Linus Torvalds
30d38542ec Merge branch 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (85 commits)
  [ARM] pxa: add base support for PXA930 Handheld Platform (aka SAAR)
  [ARM] pxa: add base support for PXA930 Evaluation Board (aka TavorEVB)
  [ARM] pxa: add base support for PXA930 (aka Tavor-P)
  [ARM] Update mach-types
  [ARM] pxa: make littleton to use the new smc91x platform data
  [ARM] pxa: make zylonite to use the new smc91x platform data
  [ARM] pxa: make mainstone to use the new smc91x platform data
  [ARM] pxa: make lubbock to use new smc91x platform data
  [NET] smc91x: prepare SMC_USE_PXA_DMA to be specified in platform data
  [NET] smc91x: prepare for SMC_IO_SHIFT to be a platform configurable variable
  [NET] smc91x: add SMC91X_NOWAIT flag to platform data
  [NET] smc91x: favor the use of SMC91X_USE_* instead of SMC_CAN_USE_*
  [NET] smc91x: remove "irq_flags" from "struct smc91x_platdata"
  [ARM] 5146/1: pxa2xx: convert all boards to call pxa2xx_transceiver_mode helper
  Support for LCD on e740 e750 e400 and e800 e-series PDAs
  E-series UDC support
  PXA UDC - allow use of inverted GPIO for pullup
  Add e350 support
  Fix broken e-series build
  E-series GPIO / IRQ definitions.
  ...
2008-07-23 18:24:08 -07:00
Mauro Carvalho Chehab
2864462eaf V4L/DVB (8434): Fix x86_64 compilation and move some macros to v4l2-ioctl.h
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-23 19:00:22 -03:00
Hans Verkuil
35ea11ff84 V4L/DVB (8430): videodev: move some functions from v4l2-dev.h to v4l2-common.h or v4l2-ioctl.h
The functions in a header should not belong to another module. The prio functions
belong to v4l2-common.c, so move them to v4l2-common.h.

The ioctl functions belong to v4l2-ioctl.c, so create a new v4l2-ioctl.h header
and move those functions to it.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-23 19:00:17 -03:00
Krzysztof Hałasa
efa415840d Remove bogus variables from syncppp.[ch]
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
2008-07-23 23:00:31 +02:00
Dan Williams
d8e64406a0 md: delay notification of 'active_idle' to the recovery thread
sysfs_notify might sleep, so do not call it from md_safemode_timeout.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-23 13:09:48 -07:00
Hans Verkuil
22a04f1063 V4L/DVB (8429): videodev: renamed 'class_dev' to 'dev'
The class_dev field is a normal device, not a class device. This is very
confusing and now that the old 'dev' field has been renamed to 'parent'
we can rename 'class_dev' to just 'dev'.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-23 16:42:52 -03:00
Hans Verkuil
5e85e732f0 V4L/DVB (8428): videodev: rename 'dev' to 'parent'
The field 'dev' is not the video device, but the parent of the video device.
Rename accordingly.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-23 16:42:49 -03:00
Linus Torvalds
20b7997e8a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
  sdhci: highmem capable PIO routines
  sg: reimplement sg mapping iterator
  mmc_test: print message when attaching to card
  mmc: Remove Russell as primecell mci maintainer
  mmc_block: bounce buffer highmem support
  sdhci: fix bad warning from commit c8b3e02
  sdhci: add warnings for bad buffers in ADMA path
  mmc_test: test oversized sg lists
  mmc_test: highmem tests
  s3cmci: ensure host stopped on machine shutdown
  au1xmmc: suspend/resume implementation
  s3cmci: fixes for section mismatch warnings
  pxamci: trivial fix of DMA alignment register bit clearing
2008-07-23 12:04:34 -07:00
Linus Torvalds
5554b35933 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (24 commits)
  I/OAT: I/OAT version 3.0 support
  I/OAT: tcp_dma_copybreak default value dependent on I/OAT version
  I/OAT: Add watchdog/reset functionality to ioatdma
  iop_adma: cleanup iop_chan_xor_slot_count
  iop_adma: document how to calculate the minimum descriptor pool size
  iop_adma: directly reclaim descriptors on allocation failure
  async_tx: make async_tx_test_ack a boolean routine
  async_tx: remove depend_tx from async_tx_sync_epilog
  async_tx: export async_tx_quiesce
  async_tx: fix handling of the "out of descriptor" condition in async_xor
  async_tx: ensure the xor destination buffer remains dma-mapped
  async_tx: list_for_each_entry_rcu() cleanup
  dmaengine: Driver for the Synopsys DesignWare DMA controller
  dmaengine: Add slave DMA interface
  dmaengine: add DMA_COMPL_SKIP_{SRC,DEST}_UNMAP flags to control dma unmap
  dmaengine: Add dma_client parameter to device_alloc_chan_resources
  dmatest: Simple DMA memcpy test client
  dmaengine: DMA engine driver for Marvell XOR engine
  iop-adma: fix platform driver hotplug/coldplug
  dmaengine: track the number of clients using a channel
  ...

Fixed up conflict in drivers/dca/dca-sysfs.c manually
2008-07-23 12:03:18 -07:00
Linus Torvalds
0f6e38a638 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  kgdb: kgdboc console poll hooks for mpsc uart
  kgdb: kgdboc console poll hooks for cpm uart
  kgdb, powerpc: arch specific powerpc kgdb support
  kgdb: support for ARCH=arm
  kgdb: remove unused HAVE_ARCH_KGDB_SHADOW_INFO config variable
2008-07-23 11:59:37 -07:00
Linus Torvalds
e669e8179d Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (60 commits)
  ide: small whitespace fixes
  ide: ide-cd_ioctl.c fix sparse integer as NULL pointer warnings
  ide: ide-cd.c fix sparse endianness warnings
  ide-cd: convert to using the new atapi_flags
  ide: remove unused PC_FLAG_DRQ_INTERRUPT
  ide-scsi: convert to using the new atapi_flags
  ide-tape: convert to using the new atapi_flags
  ide-floppy: convert to using the new atapi_flags (take 2)
  ide: add per-device flags
  ide: use rq->cmd instead of pc->c in atapi common code
  ide-scsi: pass packet command in rq->cmd
  ide-tape: pass packet command in rq->cmd
  ide-tape: make room for packet command ids in rq->cmd
  ide-floppy: pass packet command in rq->cmd
  ide: remove pc->callback member from ide_atapi_pc
  ide-scsi: use drive->pc_callback instead of pc->callback
  ide-tape: use drive->pc_callback instead of pc->callback
  ide-floppy: use drive->pc_callback instead of pc->callback
  ide: push pc callback pointer into the ide_drive_t structure
  drivers/ide/ide-tape.c: remove double kfree
  ...
2008-07-23 11:59:09 -07:00
Dmitry Torokhov
a822bea796 Input: serio - mark serio_register_driver() __must_check
Also remove extra declaration of serio_register_driver().

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-07-23 14:01:49 -04:00
Borislav Petkov
ac77ef8b03 ide: remove unused PC_FLAG_DRQ_INTERRUPT
There should be no functionality change resulting from this patch.

Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:56:01 +02:00
Borislav Petkov
ea68d270ff ide-floppy: convert to using the new atapi_flags (take 2)
while at it, remove PC_FLAG_ZIP_DRIVE from the packed command flags altogether
and query the drive type through drive->atapi_flags.

v2:
ide-floppy fix.

There should be no functionality change resulting from this patch.

[bart: IDE_FLAG_* -> IDE_AFLAG_*, dev_flags -> atapi_flags]

Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:56:01 +02:00
Borislav Petkov
3b8ac5398c ide: add per-device flags
Push device flags up into ide_drive_t.

There should be no functionality change resulting from this patch.

[bart: IDE_FLAG_* -> IDE_AFLAG_*, dev_flags -> atapi_flags]

Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:56:01 +02:00
Borislav Petkov
8bcda3bc49 ide: remove pc->callback member from ide_atapi_pc
There should be no functionality change resulting from this patch.

Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:56:00 +02:00
Borislav Petkov
d7c26ebb5b ide: push pc callback pointer into the ide_drive_t structure
Refrain from carrying the callback ptr with every packet command since the
callback function is only one anyways. ide_drive_t is probably not the most
suitable place for it right now but is the more sane solution. Besides, these
structs are going to be reorganized anyways during the generic ide rewrite.

Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:59 +02:00
Bartlomiej Zolnierkiewicz
8a69580e1e ide: add ide_host_free() helper (take 2)
* Add ide_host_free() helper and convert ide_host_remove() to use it.

* Fix handling of ide_host_register() failure in ide_host_add(),
  icside.c, ide-generic.c, falconide.c and sgiioc4.c.

While at it:

* Fix handling of ide_host_alloc_all() failure in ide-generic.c.

* Fix handling of ide_host_alloc() failure in falconide.c
  (also return the correct error value if no device is found).

v2:
* falconide build fix. (From Stephen Rothwell)

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:59 +02:00
Bartlomiej Zolnierkiewicz
6f904d0152 ide: add ide_host_add() helper
Add ide_host_add() helper which does ide_host_alloc()+ide_host_register(),
then convert ide_setup_pci_device[s](), ide_legacy_device_add() and some
host drivers to use it.

While at it:

* Fix ide_setup_pci_device[s](), ide_arm.c, gayle.c, ide-4drives.c,
  macide.c, q40ide.c, cmd640.c and cs5520.c to return correct error value.

* -ENOENT -> -ENOMEM in rapide.c, ide-h8300.c, ide-generic.c, au1xxx-ide.c
  and pmac.c

* -ENODEV -> -ENOMEM in palm_bk3710.c, ide_platform.c and delkin_cb.c

* -1 -> -ENOMEM in ide-pnp.c

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:57 +02:00
Bartlomiej Zolnierkiewicz
48c3c10726 ide: add struct ide_host (take 3)
* Add struct ide_host which keeps pointers to host's ports.

* Add ide_host_alloc[_all]() and ide_host_remove() helpers.

* Pass 'struct ide_host *host' instead of 'u8 *idx' to
  ide_device_add[_all]() and rename it to ide_host_register[_all]().

* Convert host drivers and core code to use struct ide_host.

* Remove no longer needed ide_find_port().

* Make ide_find_port_slot() static.

* Unexport ide_unregister().

v2:
* Add missing 'struct ide_host *host' to macide.c.

v3:
* Fix build problem in pmac.c (s/ide_alloc_host/ide_host_alloc/)
  (Noticed by Stephen Rothwell).

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:57 +02:00
Bartlomiej Zolnierkiewicz
374e042c3e ide: add struct ide_tp_ops (take 2)
* Add struct ide_tp_ops for transport methods.

* Add 'const struct ide_tp_ops *tp_ops' to struct ide_port_info
  and ide_hwif_t.

* Set the default hwif->tp_ops in ide_init_port_data().

* Set host driver specific hwif->tp_ops in ide_init_port().

* Export ide_exec_command(), ide_read_status(), ide_read_altstatus(),
  ide_read_sff_dma_status(), ide_set_irq(), ide_tf_{load,read}()
  and ata_{in,out}put_data().

* Convert host drivers and core code to use struct ide_tp_ops.

* Remove no longer needed default_hwif_transport().

* Cleanup ide_hwif_t from methods that are now in struct ide_tp_ops.

While at it:

* Use struct ide_port_info in falconide.c and q40ide.c.

* Rename ata_{in,out}put_data() to ide_{in,out}put_data().

v2:

* Fix missing convertion in ns87415.c.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:56 +02:00
Bartlomiej Zolnierkiewicz
d6276b5f5c ide: add 'config' field to hw_regs_t
Add 'config' field to hw_regs_t and use it to set hwif->config_data in
ide_init_port_hw(), then convert ide_legacy_init_one() to use hw->config.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:56 +02:00
Bartlomiej Zolnierkiewicz
3b2a5c7149 ide: filter out "default" transfer mode values in set_xfer_rate()
* Filter out "default" transfer mode values (0x00 - default PIO mode,
  0x01 - default PIO mode w/ IORDY disabled) in write handler for obsoleted
  /proc/ide/hd?/settings:current_speed setting.

  Allowing "default" transfer mode values is a dangerous thing to do as
  we don't support programming controller to the "default" transfer mode
  and devices often use different values for the default and maximum PIO
  mode (i.e. PIO2 default and PIO4 maximum) so the controller will stay
  programmed for higher PIO mode while device will use the lower PIO mode.

  There is no functionality loss as by using special IOCTLs device can
  still be programmed to "default" transfer modes (it is only useful for
  debugging/testing purposes anyway).

* Remove no longer needed IDE_HFLAG_ABUSE_SET_DMA_MODE host flag, it was
  previously used by few host drivers to program the controller to PIO0
  timings for "default" transfer mode == 0x01 (although some host drivers
  would program invalid PIO timings instead).

* Cleanup ide_set_xfer_rate() and add BUG_ON().

Suggested-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:56 +02:00
Bartlomiej Zolnierkiewicz
ba4b2e607e ide: remove dead Virtual DMA support
Lets remove dead Virtual DMA support for now so it doesn't clutter
core IDE code (it can be bring back when there is a need for it):

* Remove IDE_HFLAG_VDMA host flag.

* Remove ide_drive_t.vdma flag.

* cs5520.c: remove stale FIXMEs, cs5520_dma_host_set() and cs5520_dma_ops
  (also there is no longer a need to set IDE_HFLAG_NO_ATAPI_DMA).

There should be no functional changes caused by this patch.

Cc: TAKADA Yoshihito <takada@mbf.nifty.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:55 +02:00
Bartlomiej Zolnierkiewicz
761052e676 ide: remove ->INB, ->OUTB and ->OUTBSYNC methods
* Remove no longer needed ->INB, ->OUTB and ->OUTBSYNC methods.

Then:

* Remove no longer used default_hwif_[mm]iops() and ide_[mm_]outbsync().

* Cleanup SuperIO handling in ns87415.c.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:54 +02:00
Bartlomiej Zolnierkiewicz
1823649b5a ide: add ide_read_bcount_and_ireason() helper
Add ide_read_bcount_and_ireason() helper and use it instead of ->INB
in {cdrom_newpc,ide_pc}_intr().

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:54 +02:00
Bartlomiej Zolnierkiewicz
92eb43800a ide: use ->tf_read in ide_read_error()
* Add IDE_TFLAG_IN_FEATURE taskfile flag for reading Feature
  register and handle it in ->tf_read.

* Convert ide_read_error() to use ->tf_read instead of ->INB,
  then uninline and export it.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:53 +02:00
Bartlomiej Zolnierkiewicz
6e6afb3b74 ide: add ->set_irq method
Add ->set_irq method for setting nIEN bit of ATA Device Control
register and use it instead of ide_set_irq().

While at it:

* Use ->set_irq in init_irq() and do_reset1().

* Don't use HWIF() macro in ide_check_pm_state().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:52 +02:00
Bartlomiej Zolnierkiewicz
1f6d8a0fd8 ide: add ->read_altstatus method
* Remove ide_read_altstatus() inline helper.

* Add ->read_altstatus method for reading ATA Alternate Status
  register and use it instead of ->INB.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:52 +02:00
Bartlomiej Zolnierkiewicz
b73c7ee25d ide: add ->read_status method
* Remove ide_read_status() inline helper.

* Add ->read_status method for reading ATA Status register
  and use it instead of ->INB.

While at it:

* Don't use HWGROUP() macro.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:52 +02:00
Bartlomiej Zolnierkiewicz
c6dfa867bb ide: add ->exec_command method
Add ->exec_command method for writing ATA Command register
and use it instead of ->OUTBSYNC.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:51 +02:00
Bartlomiej Zolnierkiewicz
ebb00fb55d ide: factor out simplex handling from ide_pci_dma_base()
* Factor out simplex handling from ide_pci_dma_base() to
  ide_pci_check_simplex().

* Set hwif->dma_base early in ->init_dma method / ide_hwif_setup_dma()
  and reset it in ide_init_port() if DMA initialization fails.

* Use ->read_sff_dma_status instead of ->INB in ide_pci_dma_base().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:51 +02:00
Bartlomiej Zolnierkiewicz
81e8d5a34f ide: remove ide_setup_dma()
Export sff_dma_ops and then remove ide_setup_dma().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:51 +02:00
Bartlomiej Zolnierkiewicz
cab7f8eda4 ide: remove ->dma_{status,command} fields from ide_hwif_t
* Use ->dma_base + offset instead of ->dma_{status,command}
  and remove no longer needed ->dma_{status,command}.

While at it:

* Use ATA_DMA_* defines.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:51 +02:00
Bartlomiej Zolnierkiewicz
b2f951aabc ide: add ->read_sff_dma_status method
Add ->read_sff_dma_status method for reading DMA Status register
and use it instead of ->INB.

While at it:

* Use inb() directly in ns87415.c::ns87415_dma_end().

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:50 +02:00
Bartlomiej Zolnierkiewicz
c97c6aca75 ide: pass hw_regs_t-s to ide_device_add[_all]() (take 3)
* Add 'hw_regs_t **hws' argument to ide_device_add[_all]() and convert
  host drivers + ide_legacy_init_one() + ide_setup_pci_device[s]() to use
  it instead of calling ide_init_port_hw() directly.

  [ However if host has > 1 port we must still set hwif->chipset to hint
    consecutive ide_find_port() call that the previous slot is occupied. ]

* Unexport ide_init_port_hw().

v2:
* Use defines instead of hard-coded values in buddha.c, gayle.c and q40ide.c.
  (Suggested by Geert Uytterhoeven)

* Better patch description.

v3:
* Fix build problem in ide-cs.c. (Noticed by Stephen Rothwell)

There should be no functional changes caused by this patch.

Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-23 19:55:50 +02:00
Jason Wessel
17ce452f7e kgdb, powerpc: arch specific powerpc kgdb support
This patch removes the old kgdb reminants from ARCH=powerpc and
implements the new style arch specific stub for the common kgdb core
interface.

It is possible to have xmon and kgdb in the same kernel, but you
cannot use both at the same time because there is only one set of
debug hooks.

The arch specific kgdb implementation saves the previous state of the
debug hooks and restores them if you unconfigure the kgdb I/O driver.
Kgdb should have no impact on a kernel that has no kgdb I/O driver
configured.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2008-07-23 11:30:15 -05:00
Jason Wessel
5cbad0ebf4 kgdb: support for ARCH=arm
This patch adds the ARCH=arm specific a kgdb backend, originally
written by Deepak Saxena <dsaxena@plexity.net> and George Davis
<gdavis@mvista.com>.  Geoff Levand <geoffrey.levand@am.sony.com>,
Nicolas Pitre, Manish Lachwani, and Jason Wessel have contributed
various fixups here as well.

The KGDB patch makes one change to the core ARM architecture such that
the traps are initialized early for use with the debugger or other
subsystems.

[ mingo@elte.hu: small cleanups. ]
[ ben-linux@fluff.org: fixed early_trap_init ]

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Acked-by: Deepak Saxena <dsaxena@plexity.net>
2008-07-23 11:30:15 -05:00
Roland Dreier
95d04f0735 IB/mlx4: Add support for memory management extensions and local DMA L_Key
Add support for the following operations to mlx4 when device firmware
supports them:

 - Send with invalidate and local invalidate send queue work requests;
 - Allocate/free fast register MRs;
 - Allocate/free fast register MR page lists;
 - Fast register MR send queue work requests;
 - Local DMA L_Key.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-23 08:12:26 -07:00
Rafi Rubin
f472f80034 HID: add n-trig digitizer usage
This adds a hid usage that is reported by the N-Trig digitizer in the Dell
Latitude XT screen.

Signed-off-by: Rafi Rubin <rafi@seas.upenn.edu>
Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2008-07-23 15:25:21 +02:00
Tejun Heo
137d3edb48 sg: reimplement sg mapping iterator
This is alternative implementation of sg content iterator introduced
by commit 83e7d317... from Pierre Ossman in next-20080716.  As there's
already an sg iterator which iterates over sg entries themselves, name
this sg_mapping_iterator.

Slightly edited description from the original implementation follows.

Iteration over a sg list is not that trivial when you take into
account that memory pages might have to be mapped before being used.
Unfortunately, that means that some parts of the kernel restrict
themselves to directly accesible memory just to not have to deal with
the mess.

This patch adds a simple iterator system that allows any code to
easily traverse an sg list and not have to deal with all the details.
The user can decide to consume part of the iteration.  Also, iteration
can be stopped and resumed later if releasing the kmap between
iteration steps is necessary.  These features are useful to implement
piecemeal sg copying for interrupt drive PIO for example.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-23 14:42:09 +02:00
Jaswinder Singh
01eb7858c0 x86: mm/pgtable_32.c declare set_pmd_pfn before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-23 17:41:59 +05:30
Jaswinder Singh
e0b7c8192d x86: mm/pageattr.c declare arch_report_meminfo before they get used
declared arch_report_meminfo() in asm-x86/pgtable.h as it will be also accessible by fs/proc/proc_misc.c

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-23 17:40:34 +05:30
Jaswinder Singh
70ef56414e x86: mm/fault.c declare do_page_fault before they get used
declared do_page_fault() in asm-x86/trap.h for both X86_32 and X86_64

removed do_invalid_op declaration from mm/fault.c as it is already declared in asm-x86/trap.h

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-23 17:36:37 +05:30
Jaswinder Singh
a80495ec92 x86: mm/init_XX.c declare functions before they get used
included <asm/smp.h> in mm/init_32.c for zap_low_mappings()

declared free_initmem() in asm-x86/page_XX.h

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-23 17:33:57 +05:30
Jaswinder Singh
8f7db5186c x86: vm86_32.c declare functions before they get used
declared following syscalls in asm-x86/syscalls.h:
sys_vm86old, sys_vm86

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-23 17:31:02 +05:30
Jaswinder Singh
2b97df06ce x86: apic_XX.c declare functions before they get used
declared following smp interrupts in asm-x86/hw_irq.h:
smp_apic_timer_interrupt, smp_spurious_interrupt, smp_error_interrupt

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-23 17:13:14 +05:30
Nate Case
f46e9203d9 leds: Add support for Philips PCA955x I2C LED drivers
This driver supports the PCA9550, PCA9551, PCA9552, and PCA9553
LED driver chips.

Signed-off-by: Nate Case <ncase@xes-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
2008-07-23 09:49:56 +01:00
Anton Vorontsov
781a54e766 leds: mark led_classdev.default_trigger as const
LED classdev core doesn't modify memory pointed by the default_trigger,
so mark it as const and we'll able to pass const char *s without getting
compiler warnings.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
2008-07-23 09:49:56 +01:00
Riku Voipio
e14fa82439 leds: Add pca9532 led driver
NXP pca9532 is a LED dimmer/controller attached to i2c bus.  It allows
attaching upto 16 leds which can either be on, off or dimmed and/or blinked
with the two PWM modulators available.

This driver is a "new-style" i2c driver that adheres to the driver model and
implements the led framework api.  Since the leds connected to the driver are
platform specific, it is only useful when platform data is passed to the
driver to define what leds are connected to which pins.

Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
2008-07-23 09:49:56 +01:00
Sebastian Siewior
c1863bed8c m68knommu: remove RPXCLASSIC from the m68k tree
This ifdefs are leftovers from the time as the driver was running
on a ppc.
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2008-07-23 15:11:29 +10:00
Sebastian Siewior
6dbeb456ba m68knommu: add read_barrier_depends() and irqs_disabled_flags()
/home/bigeasy/git/linux-2.6-ftrace/kernel/trace/trace.c: In function 'tracing_generic_entry_update':
/home/bigeasy/git/linux-2.6-ftrace/kernel/trace/trace.c:802: error: implicit declaration of function 'irqs_disabled_flags'
make[3]: *** [kernel/trace/trace.o] Error 1
/home/bigeasy/git/linux-2.6-ftrace/kernel/trace/ftrace.c: In function 'ftrace_list_func':
/home/bigeasy/git/linux-2.6-ftrace/kernel/trace/ftrace.c:61: error: implicit declaration of function 'read_barrier_depends'

Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2008-07-23 15:11:28 +10:00
Sebastian Siewior
e872504b31 m68knommu: add byteswap assembly opcode for ISA A+
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2008-07-23 15:11:28 +10:00
Sebastian Siewior
a6260ef841 m68knommu: add ffs and __ffs plattform which support ISA A+ or ISA C
the ff1 and bitrev opcode appears in ISA C and ISA A+ what isn't
supported by all plattforms. The assembly optimization is automaticly
enabled if the compiler understand the required cpu keyword.
My m5235 seems to boot and run fine so far.

Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
2008-07-23 15:11:28 +10:00
Linus Torvalds
c010b2f76c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (82 commits)
  ipw2200: Call netif_*_queue() interfaces properly.
  netxen: Needs to include linux/vmalloc.h
  [netdrvr] atl1d: fix !CONFIG_PM build
  r6040: rework init_one error handling
  r6040: bump release number to 0.18
  r6040: handle RX fifo full and no descriptor interrupts
  r6040: change the default waiting time
  r6040: use definitions for magic values in descriptor status
  r6040: completely rework the RX path
  r6040: call napi_disable when puting down the interface and set lp->dev accordingly.
  mv643xx_eth: fix NETPOLL build
  r6040: rework the RX buffers allocation routine
  r6040: fix scheduling while atomic in r6040_tx_timeout
  r6040: fix null pointer access and tx timeouts
  r6040: prefix all functions with r6040
  rndis_host: support WM6 devices as modems
  at91_ether: use netstats in net_device structure
  sfc: Create one RX queue and interrupt per CPU package by default
  sfc: Use a separate workqueue for resets
  sfc: I2C adapter initialisation fixes
  ...
2008-07-22 19:09:51 -07:00
Linus Torvalds
e9dd54da0b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
  sparc32: pass -m32 when building vmlinux.lds
  sparc: Fixes the DRM layer build on sparc.
  ide: merge <asm-sparc/ide_64.h> with <asm-sparc/ide_32.h>
  ide: <asm-sparc/ide_64.h>: use __raw_{read,write}w()
  ide: <asm-sparc/ide_32.h>: use __raw_{read,write}w()
  ide: <asm-sparc/ide_64.h>: use %r0 for outw_be()
  sparc64: Do not define BIO_VMERGE_BOUNDARY.
2008-07-22 19:04:22 -07:00
David S. Miller
7cf75262a4 Merge branch 'upstream-davem' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2008-07-22 17:54:47 -07:00
Maciej Sosnowski
7f1b358a23 I/OAT: I/OAT version 3.0 support
This patch adds to ioatdma and dca modules
support for Intel I/OAT DMA engine ver.3 (aka CB3 device).
The main features of I/OAT ver.3 are:
 * 8 single channel DMA devices (8 channels total)
 * 8 DCA providers, each can accept 2 requesters
 * 8-bit TAG values and 32-bit extended APIC IDs

Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-22 17:30:57 -07:00
Stephen Hemminger
3d0f24a74e ipv6: icmp6_dst_gc return change
Change icmp6_dst_gc to return the one value the caller cares about rather
than using call by reference.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-22 14:35:50 -07:00
Stephen Hemminger
417f28bb34 netns: dont alloc ipv6 fib timer list
FIB timer list is a trivial size structure, avoid indirection and just
put it in existing ns.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-22 14:33:45 -07:00
David S. Miller
428695b898 sparc: Fixes the DRM layer build on sparc.
By providing an ioremap_wc().

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-22 14:30:55 -07:00
Rafael J. Wysocki
e5899e1b7d PCI PM: make more PCI PM core functionality available to drivers
Make more PCI PM core functionality available to drivers

* Export pci_pme_capable() so that it can be called directly by
  drivers (for example, tg3 needs that).

* Move the state choosing part of pci_prepare_to_sleep() to a
  separate function, pci_target_state(), that can be called directly
  by drivers (for example, tg3 needs that).

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-07-22 14:25:38 -07:00
Adrian Bunk
888c848ed3 ipv6: make struct ipv6_devconf static
struct ipv6_devconf can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-22 14:21:58 -07:00
Adrian Bunk
abd0b198ea sctp: make sctp_outq_flush() static
sctp_outq_flush() can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-22 14:20:45 -07:00
Roland Dreier
47b374752a IB/mlx4: Rename struct mlx4_lso_seg to mlx4_wqe_lso_seg
Make the struct name consistent with other WQE segment struct types
defined in <linux/mlx4/qp.h>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22 14:19:39 -07:00
Adrian Bunk
8086cd451f netns: make get_proc_net() static
get_proc_net() can now become static.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-22 14:19:19 -07:00
Amir Vadai
38ca83a588 RDMA/cma: Add RDMA_CM_EVENT_TIMEWAIT_EXIT event
Consumers that want to re-use their QPs in new connections need to
know when the QP has exited the timewait state.  Report the timewait
event through the rdma_cm.

Signed-off-by: Amir Vadai <amirv@mellanox.co.il>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22 14:14:23 -07:00
Or Gerlitz
dd5bdff83b RDMA/cma: Add RDMA_CM_EVENT_ADDR_CHANGE event
Add an RDMA_CM_EVENT_ADDR_CHANGE event can be used by rdma-cm
consumers that wish to have their RDMA sessions always use the same
links (eg <hca/port>) as the IP stack does.  In the current code, this
does not happen when bonding is used and fail-over happened but the IB
link used by an already existing session is operating fine.

Use the netevent notification for sensing that a change has happened
in the IP stack, then scan the rdma-cm ID list to see if there is an
ID that is "misaligned" with respect to the IP stack, and deliver
RDMA_CM_EVENT_ADDR_CHANGE for this ID.  The consumer can act on the
event or just ignore it.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-22 14:14:22 -07:00
Dave Jones
d29f749e25 net: Fix build failure with 'make mandocs'.
The function header comments have to go with the functions
they are documenting, or things go horribly wrong when we
try to process them with the docbook tools.

Warning(include/linux/netdevice.h:1006): No description found for parameter 'dev_queue'
Warning(include/linux/netdevice.h:1033): No description found for parameter 'dev_queue'
Warning(include/linux/netdevice.h:1067): No description found for parameter 'dev_queue'
Warning(include/linux/netdevice.h:1093): No description found for parameter 'dev_queue'
Warning(include/linux/netdevice.h:1474): No description found for parameter 'txq'
Error(net/core/dev.c:1674): cannot understand prototype: 'u32 simple_tx_hashrnd; '

Signed-off-by: Dave Jones <davej@redhat.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-22 14:09:06 -07:00
Linus Torvalds
0988c37c24 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix crash due to missing debugctlmsr on AMD K6-3
  x86: add PTE_FLAGS_MASK
  x86: rename PTE_MASK to PTE_PFN_MASK
  x86: fix pte_flags() to only return flags, fix lguest (updated)
  x86: use setup_clear_cpu_cap with disable_apic, fix
  x86: move the last Dprintk instance to pr_debug()
2008-07-22 13:40:24 -07:00
Linus Torvalds
6eaaaac974 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  remove CONFIG_KMOD from core kernel code
  remove CONFIG_KMOD from lib
  remove CONFIG_KMOD from sparc64
  rework try_then_request_module to do less in non-modular kernels
  remove mention of CONFIG_KMOD from documentation
  make CONFIG_KMOD invisible
  modules: Take a shortcut for checking if an address is in a module
  module: turn longs into ints for module sizes
  Shrink struct module: CONFIG_UNUSED_SYMBOLS ifdefs
  module: reorder struct module to save space on 64 bit builds
  module: generic each_symbol iterator function
  module: don't use stop_machine for waiting rmmod
2008-07-22 13:17:15 -07:00
Linus Torvalds
06b8147c5d Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (49 commits)
  powerpc: Fix build bug with binutils < 2.18 and GCC < 4.2
  powerpc/eeh: Don't panic when EEH_MAX_FAILS is exceeded
  fbdev: Teaches offb about palette on radeon r5xx/r6xx
  powerpc/cell/edac: Log a syndrome code in case of correctable error
  powerpc/cell: Add DMA_ATTR_WEAK_ORDERING dma attribute and use in Cell IOMMU code
  powerpc: Indicate which oprofile counters to use while in compat mode
  powerpc/boot: Change spaces to tabs
  powerpc: Remove duplicate 6xx option in Kconfig
  powerpc: Use PPC_LONG and PPC_LONG_ALIGN in lib/string.S
  powerpc: Use PPC_LONG_ALIGN in uaccess.h
  powerpc: Add a #define for aligning to a long-sized boundary
  powerpc: Fix OF parsing of 64 bits PCI addresses
  powerpc: Use WARN_ON(1) instead of __WARN()
  powerpc: Fix support for latencytop
  powerpc/ps3: Update ps3_defconfig
  powerpc/ps3: Add a sub-match id to ps3_system_bus
  powerpc: Add a 6xx defconfig
  powerpc/dma: Use the struct dma_attrs in iommu code
  powerpc/cell: Add support for power button of future IBM cell blades
  powerpc/cell: Cleanup sysreset_hack for IBM cell blades
  ...
2008-07-22 13:16:01 -07:00
Linus Torvalds
53baaaa968 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (79 commits)
  arm: bus_id -> dev_name() and dev_set_name() conversions
  sparc64: fix up bus_id changes in sparc core code
  3c59x: handle pci_name() being const
  MTD: handle pci_name() being const
  HP iLO driver
  sysdev: Convert the x86 mce tolerant sysdev attribute to generic attribute
  sysdev: Add utility functions for simple int/ulong variable sysdev attributes
  sysdev: Pass the attribute to the low level sysdev show/store function
  driver core: Suppress sysfs warnings for device_rename().
  kobject: Transmit return value of call_usermodehelper() to caller
  sysfs-rules.txt: reword API stability statement
  debugfs: Implement debugfs_remove_recursive()
  HOWTO: change email addresses of James in HOWTO
  always enable FW_LOADER unless EMBEDDED=y
  uio-howto.tmpl: use unique output names
  uio-howto.tmpl: use standard copyright/legal markings
  sysfs: don't call notify_change
  sysdev: fix debugging statements in registration code.
  kobject: should use kobject_put() in kset-example
  kobject: reorder kobject to save space on 64 bit builds
  ...
2008-07-22 13:13:47 -07:00
Laurent Pinchart
217d5a5195 fs_enet: Remove unused fields in the fs_mii_bb_platform_info structure.
The mdio_port, mdio_bit, mdc_port and mdc_bit fields in the
fs_mii_bb_platform_info structure are left-overs from the move to the Phy
Abstraction Layer subsystem. They are not used anymore and can be safely
removed.

Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-07-22 16:08:58 -04:00
Paul Fulghum
e5590717af synclink_gt: add serial bit order control
Add control of hardware serial bit order between LSB first
(default/standard) and MSB first.

Signed-off-by: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-22 13:03:29 -07:00
Alan Cox
9e98966c7b tty: rework break handling
Some hardware needs to do break handling itself and may have partial
support only. Make break_ctl return an error code. Add a tty driver flag
so you can indicate driver hardware side break support.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-22 13:03:28 -07:00
Alan Cox
01e1abb2c2 tty: Split ldisc code into its own file
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-22 13:03:27 -07:00
Alan Cox
95da310e66 usb_serial: API all change
USB serial likes to use port->tty back pointers for the real work it does and
to do so without any actual locking. Unfortunately when you consider hangup
events, hangup/parallel reopen or even worse hangup followed by parallel close
events the tty->port and port->tty pointers are not guaranteed to be the same
as port->tty is the active tty while tty->port is the port the tty may or
may not still be attached to.

So rework the entire API to pass the tty struct. For console cases we need
to pass both for now. This shows up multiple drivers that immediately crash
with USB console some of which have been fixed in the process.

Longer term we need a proper tty as console abstraction

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-22 13:03:22 -07:00
Vegard Nossum
a318631686 x86: consolidate header guards
This patch consolidates the header guard names which are also used
externally, i.e. in .c files.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
2008-07-22 21:53:53 +02:00
Vegard Nossum
77ef50a522 x86: consolidate header guards
This patch is the result of an automatic script that consolidates the
format of all the headers in include/asm-x86/.

The format:

1. No leading underscore. Names with leading underscores are reserved.
2. Pathname components are separated by two underscores. So we can
   distinguish between mm_types.h and mm/types.h.
3. Everything except letters and numbers are turned into single
   underscores.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
2008-07-22 21:31:34 +02:00
Vegard Nossum
a656c8efb4 x86: fix spurious '#' in kvm header
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
2008-07-22 21:27:11 +02:00
Juergen Beisert
0c36ec3147 gpio: gpio driver for max7301 SPI GPIO expander
Maxim's MAX7301 is an SPI GPIO expander with 28 GPIOs.  Note: MAX7301's
interrupt feature is not supported yet.

[akpm@linux-foundation.org: coding-style fixes]
[g.liakhovetski@pengutronix.de: Fix inaccuracies in comments, check spi_setup()
return code, mask off high byte in max7301_read()]
Signed-off-by: Juergen Beisert <j.beisert@pengutronix.de>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-22 09:59:40 -07:00
John Reiser
6519108746 execve filename: document and export via auxiliary vector
The Linux kernel puts the filename argument of execve() into the new
address space.  Many developers are surprised to learn this.  Those who
know and could use it, object "But it's not documented."

Those who want to use it dislike the expression
  (char *)(1+ strlen(env[-1+ n_env]) + env[-1+ n_env])
because it requires locating the last original environment variable,
and assumes that the filename follows the characters.

This patch documents the insertion of the filename, and makes it easier
to find by adding a new tag AT_EXECFN in the ElfXX_auxv_t; see <elf.h>.

In many cases readlink("/proc/self/exe",) gives the same answer.  But if
all the original pages get unmapped, then the kernel erases the symlink
for /proc/self/exe.  This can happen when a program decompressor does a
good job of cleaning up after uncompressing directly to memory, so that
the address space of the target program looks the same as if compression
had never happened.  One example is http://upx.sourceforge.net .

One notable use of the underlying concept (what path containED the
executable) is glibc expanding $ORIGIN in DT_RUNPATH.  In practice for
the near term, it may be a good idea for user-mode code to use both
/proc/self/exe and AT_EXECFN as fall-back methods for each other.
/proc/self/exe can fail due to unmapping, AT_EXECFN can fail because it
won't be present on non-new systems.  The auxvec or {AT_EXECFN}.d_val
also can get overwritten, although in nearly all cases this would be the
result of a bug.

The runtime cost is one NEW_AUX_ENT using two words of stack space.  The
underlying value is maintained already as bprm->exec; setup_arg_pages()
in fs/exec.c slides it for stack_shift, etc.

Signed-off-by: John Reiser <jreiser@BitWagon.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-22 09:59:40 -07:00
Jaswinder Singh
8fd329a1ac x86: common.c declare idle_regs before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:09 +02:00
Jaswinder Singh
1c6c727d9c x86: proc.c declare cpuinfo_op before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:08 +02:00
Jaswinder Singh
c1686aeaf0 x86: ptrace.c declare functions before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:07 +02:00
Jaswinder Singh
36454936c0 x86: i387.c declare dump_fpu() before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:06 +02:00
Jaswinder Singh
791b897ccd x86: pci-nommu.c declare nommu_dma_ops before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:05 +02:00
Jaswinder Singh
9321b8cbbb x86: pci-dma.c declare iommu_bio_merge before they get used
moved iommu_bio_merge from io_64.h to io.h because it is required for both.

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:04 +02:00
Jaswinder Singh
a7b7511ac1 x86: e820.c declare pci_mem_start before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:03 +02:00
Jaswinder Singh
5314d48ed5 x86: setup.c declare saved_video_mode before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:02 +02:00
Jaswinder Singh
cc0384917b x86: time_XX.c declare functions before they get used
Declare time_init() in asm-x86/time.h

Also did cleanup in asm-x86/timer.h :
timer_ack is only required for X86_32
int recalibrate_cpu_khz(void) is for X86_32

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:01 +02:00
Jaswinder Singh
b994b6c033 x86: signal_XX.c declare do_notify_resume before they get used
Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:36:00 +02:00
Jaswinder Singh
fb26132b44 x86: process_32.c declare cpu_number before they get used
Moved DECLARE_PER_CPU(int, cpu_number) from CONFIG_X86_32_SMP to CONFIG_X86_32
because cpu_number is required for both.
And include asm/smp.h in process_32.c

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:35:59 +02:00
Jaswinder Singh
bbc1f698a5 x86: Introducing asm/syscalls.h
Declaring arch-dependent syscalls for x86 architecture

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
2008-07-22 14:35:57 +02:00
Johannes Berg
a1ef5adb4c remove CONFIG_KMOD from core kernel code
Always compile request_module when the kernel allows modules.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-22 19:24:31 +10:00
Johannes Berg
df648c9fbe rework try_then_request_module to do less in non-modular kernels
This reworks try_then_request_module to only invoke the "lookup"
function "x" once when the kernel is not modular.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-22 19:24:29 +10:00
Denys Vlasenko
2f0f2a334b module: turn longs into ints for module sizes
This shrinks module.o and each *.ko file.

And finally, structure members which hold length of module
code (four such members there) and count of symbols
are converted from longs to ints.

We cannot possibly have a module where 32 bits won't
be enough to hold such counts.

For one, module loading checks module size for sanity
before loading, so such insanely big module will fail
that test first.

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-22 19:24:27 +10:00
Denys Vlasenko
f7f5b67557 Shrink struct module: CONFIG_UNUSED_SYMBOLS ifdefs
module.c and module.h conatains code for finding
exported symbols which are declared with EXPORT_UNUSED_SYMBOL,
and this code is compiled in even if CONFIG_UNUSED_SYMBOLS is not set
and thus there can be no EXPORT_UNUSED_SYMBOLs in modules anyway
(because EXPORT_UNUSED_SYMBOL(x) are compiled out to nothing then).

This patch adds required #ifdefs.

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-22 19:24:27 +10:00
Richard Kennedy
af5406895a module: reorder struct module to save space on 64 bit builds
reorder struct module to save space on 64 bit builds.
saves 1 cacheline_size  (128 on default x86_64 & 64 on AMD
Opteron/athlon) when CONFIG_MODULE_UNLOAD=y.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-07-22 19:24:26 +10:00
Jeremy Fitzhardinge
77be1fabd0 x86: add PTE_FLAGS_MASK
PTE_PFN_MASK was getting lonely, so I made it a friend.

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-22 10:43:45 +02:00
Jeremy Fitzhardinge
59438c9fc4 x86: rename PTE_MASK to PTE_PFN_MASK
Rusty, in his peevish way, complained that macros defining constants
should have a name which somewhat accurately reflects the actual
purpose of the constant.

Aside from the fact that PTE_MASK gives no clue as to what's actually
being masked, and is misleadingly similar to the functionally entirely
different PMD_MASK, PUD_MASK and PGD_MASK, I don't really see what the
problem is.

But if this patch silences the incessent noise, then it will have
achieved its goal (TODO: write test-case).

Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-22 10:43:44 +02:00
Rusty Russell
c2e3277f87 x86: fix pte_flags() to only return flags, fix lguest (updated)
(Jeremy said:
	rusty: use PTE_MASK
	rusty: use PTE_MASK
	rusty: use PTE_MASK
 When I asked:
	jsgf: does that include the NX flag?
 He responded eloquently:
	rusty: use PTE_MASK
	rusty: use PTE_MASK
	yes, it's the official constant of masking flags out of ptes
)

Change a15af1c9ea 'x86/paravirt: add
pte_flags to just get pte flags' removed lguest's private pte_flags()
in favor of a generic one.

Unfortunately, the generic one doesn't filter out the non-flags bits:
this results in lguest creating corrupt shadow page tables and blowing
up host memory.

Since noone is supposed to use the pfn part of pte_flags(), it seems
safest to always do the filtering.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-and-morning-tea-spilled-by: Ingo Molnar <mingo@elte.hu>
2008-07-22 10:41:18 +02:00
Grant Likely
a19dd1bd7d powerpc/mpc5200: add PSC SICR bit definitions
Required by the PSC I2S audio driver.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2008-07-22 01:13:54 -06:00
Jon Smirl
6a4a636fad powerpc/mpc5200: Add AC97 register definitions for the MPC52xx PSC
Needed by the PSC AC97 sound driver

Signed-off-by: Jon Smirl <jonsmirl@gmail.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2008-07-22 01:13:11 -06:00
Benjamin Herrenschmidt
8725f25acc Merge commit 'origin/master'
Manually fixed up:

	drivers/net/fs_enet/fs_enet-main.c
2008-07-22 17:12:37 +10:00
Greg Kroah-Hartman
eadcf0d704 MTD: handle pci_name() being const
This changes the MTD core to handle pci_name() now returning a constant
string.

Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:55:03 -07:00
Andi Kleen
9800794ac1 sysdev: Add utility functions for simple int/ulong variable sysdev attributes
This adds a new sysdev_ext_attribute that stores a pointer to the variable
it manages and some utility functions/macro to easily use them.

Previously all users wrote custom macros to generate show/store
functions for each variable, with this it is possible to avoid
that in many cases.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:55:02 -07:00
Andi Kleen
4a0b2b4dbe sysdev: Pass the attribute to the low level sysdev show/store function
This allow to dynamically generate attributes and share show/store
functions between attributes. Right now most attributes are generated
by special macros and lots of duplicated code. With the attribute
passed it's instead possible to attach some data to the attribute
and then use that in shared low level functions to do different things.

I need this for the dynamically generated bank attributes in the x86
machine check code, but it'll allow some further cleanups.

I converted all users in tree to the new show/store prototype. It's a single
huge patch to avoid unbisectable sections.

Runtime tested: x86-32, x86-64
Compiled only: ia64, powerpc
Not compile tested/only grep converted: sh, arm, avr32

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:55:02 -07:00
Cornelia Huck
36ce6dad6e driver core: Suppress sysfs warnings for device_rename().
driver core: Suppress sysfs warnings for device_rename().

Renaming network devices to an already existing name is not
something we want sysfs to print a scary warning for, since the
callers can deal with this correctly. So let's introduce
sysfs_create_link_nowarn() which gets rid of the common warning.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:55:01 -07:00
Haavard Skinnemoen
9505e63756 debugfs: Implement debugfs_remove_recursive()
debugfs_remove_recursive() will remove a dentry and all its children.
Drivers can use this to zap their whole debugfs tree so that they don't
need to keep track of every single debugfs dentry they created.

It may fail to remove the whole tree in certain cases:

sh-3.2# rmmod atmel-mci < /sys/kernel/debug/mmc0/ios/clock
mmc0: card b368 removed
atmel_mci atmel_mci.0: Lost dma0chan1, falling back to PIO
sh-3.2# ls /sys/kernel/debug/mmc0/
ios

But I'm not sure if that case can be handled in any sane manner.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Cc: Pierre Ossman <drzeus-list@drzeus.cx>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:59 -07:00
Richard Kennedy
a231934bdf kobject: reorder kobject to save space on 64 bit builds
reorder kobject to save space on 64 bit builds.
shrinks from 72 to 64 bytes & moves allocated kobject to a smaller
slab.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:56 -07:00
Uwe Kleine-König
6d8333c24d UIO: minor style and comment fixes
Signed-off-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com>
Signed-off-by: Hans J. Koch <hjk@linutronix.de>
2008-07-21 21:54:55 -07:00
Hans J. Koch
328a14e70e UIO: Add write function to allow irq masking
Sometimes it is necessary to enable/disable the interrupt of a UIO device
from the userspace part of the driver. With this patch, the UIO kernel driver
can implement an "irqcontrol()" function that does this. Userspace can write
an s32 value to /dev/uioX (usually 0 or 1 to turn the irq off or on). The
UIO core will then call the driver's irqcontrol function.

Signed-off-by: Hans J. Koch <hjk@linutronix.de>
Acked-by: Uwe Kleine-König <Uwe.Kleine-Koenig@digi.com>
Acked-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:55 -07:00
Greg Kroah-Hartman
b98cb4b7fe driver core: remove DEVICE_ID_SIZE define
There is no such thing as a "device id size" in the driver core, so
remove the define and fix up any users of this odd define in the rest of
the kernel.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:53 -07:00
Kay Sievers
ca52a49846 driver core: remove DEVICE_NAME_SIZE define
There is no such thing as a "device name size" in the driver core, so
remove the define and fix up any users of this odd define in the rest of
the kernel.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:53 -07:00
Kay Sievers
aab0de2451 driver core: remove KOBJ_NAME_LEN define
Kobjects do not have a limit in name size since a while, so stop
pretending that they do.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:52 -07:00
Matthew Wilcox
d2a3b9146e class: add lockdep infrastructure
This adds the infrastructure to properly handle lockdep issues when the
internal class semaphore is changed to a mutex.

Matthew wrote the original patch, and Greg fixed it up to work properly
with the class_create() function.


From: Matthew Wilcox <matthew@wil.cx>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Dave Young <hidave.darkstar@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:52 -07:00
Greg Kroah-Hartman
7c71448b8a class: move driver core specific parts to a private structure
This moves the portions of struct class that are dynamic (kobject and
lock and lists) out of the main structure and into a dynamic, private,
structure.


Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:51 -07:00
Greg Kroah-Hartman
695794ae0c Driver Core: add ability for class_find_device to start in middle of list
This mirrors the functionality that driver_find_device has as well.

We add a start variable, and all callers of the function are fixed up at
the same time.

The block layer will be using this new functionality in a follow-on
patch.


Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:47 -07:00
Greg Kroah-Hartman
93562b5376 Driver Core: add ability for class_for_each_device to start in middle of list
This mirrors the functionality that driver_for_each_device has as well.

We add a start variable, and all callers of the function are fixed up at
the same time.

The block layer will be using this new functionality in a follow-on
patch.


Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:47 -07:00
Greg Kroah-Hartman
4e10673944 device create: convert device_create_drvdata to device_create
Now that device_create() has been audited, rename things back to the
original call to be sane.

Keep the device_create_drvdata macro around to make merges easier.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:47 -07:00
Greg Kroah-Hartman
ccea44fadc driver core: remove device_create()
There are no more users of this, and it is racy.  Use
device_create_drvdata() or device_create_vargs() instead.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:47 -07:00
Dan Williams
e105b8bfc7 sysfs: add /sys/dev/{char,block} to lookup sysfs path by major:minor
Why?:
There are occasions where userspace would like to access sysfs
attributes for a device but it may not know how sysfs has named the
device or the path.  For example what is the sysfs path for
/dev/disk/by-id/ata-ST3160827AS_5MT004CK?  With this change a call to
stat(2) returns the major:minor then userspace can see that
/sys/dev/block/8:32 links to /sys/block/sdc.

What are the alternatives?:
1/ Add an ioctl to return the path: Doable, but sysfs is meant to reduce
   the need to proliferate ioctl interfaces into the kernel, so this
   seems counter productive.

2/ Use udev to create these symlinks: Also doable, but it adds a
   udev dependency to utilities that might be running in a limited
   environment like an initramfs.

3/ Do a full-tree search of sysfs.

[kay.sievers@vrfy.org: fix duplicate registrations]
[kay.sievers@vrfy.org: cleanup suggestions]

Cc: Neil Brown <neilb@suse.de>
Cc: Tejun Heo <htejun@gmail.com>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Reviewed-by: SL Baur <steve@xemacs.org>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Mark Lord <lkml@rtr.ca>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 21:54:40 -07:00
Mark Nelson
1ed6af7344 powerpc/cell: Add DMA_ATTR_WEAK_ORDERING dma attribute and use in Cell IOMMU code
Introduce a new dma attriblue DMA_ATTR_WEAK_ORDERING to use weak ordering
on DMA mappings in the Cell processor. Add the code to the Cell's IOMMU
implementation to use this code.

Dynamic mappings can be weakly or strongly ordered on an individual basis
but the fixed mapping has to be either completely strong or completely weak.
This is currently decided by a kernel boot option (pass iommu_fixed=weak
for a weakly ordered fixed linear mapping, strongly ordered is the default).

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-22 10:39:36 +10:00
Michael Ellerman
551c3c04b4 powerpc: Use PPC_LONG_ALIGN in uaccess.h
Use the new PPC_LONG_ALIGN macro instead of passing an argument
to the asm for consistency.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-22 10:39:35 +10:00
Michael Ellerman
6a2a24bb75 powerpc: Add a #define for aligning to a long-sized boundary
Add a #define for aligning to a long-sized boundary. It would be nice
to use sizeof(long) for this, but that requires generating the value
with asm-offsets.c, and asm-offsets.c includes asm-compat.h and we
descend into some sort of recursive include hell.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-22 10:39:34 +10:00
Masakazu Mokuno
059e4938f8 powerpc/ps3: Add a sub-match id to ps3_system_bus
Add sub match id for ps3 system bus so that two different system bus
devices can be connected to a shared device.

Signed-off-by: Masakazu Mokuno <mokuno@sm.sony.co.jp>
Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-22 10:39:33 +10:00
Mark Nelson
4f3dd8a062 powerpc/dma: Use the struct dma_attrs in iommu code
Update iommu_alloc() to take the struct dma_attrs and pass them on to
tce_build(). This change propagates down to the tce_build functions of
all the platforms.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-22 10:39:32 +10:00
Christian Krafft
4795b7801b powerpc/cell: Add support for power button of future IBM cell blades
This patch adds support for the power button on future IBM cell blades.
It actually doesn't shut down the machine. Instead it exposes an
input device /dev/input/event0 to userspace which sends KEY_POWER
if power button has been pressed.
haldaemon actually recognizes the button, so a plattform independent acpid
replacement should handle it correctly.

Signed-off-by: Christian Krafft <krafft@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-22 10:39:32 +10:00
Wolfgang Grandegger
18ad7a61e1 of_gpio: Should use new <linux/gpio.h> header
Since commit 7560fa60fc (gpio: <linux/gpio.h>
and "no GPIO support here" stubs) drivers can use GPIOs if they're available,
but don't require them.

This patch actually enables this feature.

Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-22 10:39:30 +10:00
Bartlomiej Zolnierkiewicz
6662327e19 ide: merge <asm-sparc/ide_64.h> with <asm-sparc/ide_32.h>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-21 16:55:11 -07:00
Bartlomiej Zolnierkiewicz
edc83d4f3e ide: <asm-sparc/ide_64.h>: use __raw_{read,write}w()
Use __raw_{read,write}w() in __ide_{in,out}sw()
and remove no longer needed {in,out}w_be().

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-21 16:51:43 -07:00
Bartlomiej Zolnierkiewicz
8fbf3f30fe ide: <asm-sparc/ide_32.h>: use __raw_{read,write}w()
Use __raw_{read,write}w() in __ide_{in,out}sw().

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-21 16:50:56 -07:00
Bartlomiej Zolnierkiewicz
28c10af712 ide: <asm-sparc/ide_64.h>: use %r0 for outw_be()
Use %r0 for outw_be() to make it match __raw_writew().

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-21 16:49:34 -07:00
Linus Torvalds
93ded9b8fd Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (100 commits)
  usb-storage: revert DMA-alignment change for Wireless USB
  USB: use reset_resume when normal resume fails
  usb_gadget: composite cdc gadget fault handling
  usb gadget: minor USBCV fix for composite framework
  USB: Fix bug with byte order in isp116x-hcd.c fio write/read
  USB: fix double kfree in ipaq in error case
  USB: fix build error in cdc-acm for CONFIG_PM=n
  USB: remove board-specific UP2OCR configuration from pxa27x-udc
  USB: EHCI: Reconciling USB register differences on MPC85xx vs MPC83xx
  USB: Fix pointer/int cast in USB devio code
  usb gadget: g_cdc dependso on NET
  USB: Au1xxx-usb: suspend/resume support.
  USB: Au1xxx-usb: clean up ohci/ehci bus glue sources.
  usbfs: don't store bad pointers in registration
  usbfs: fix race between open and unregister
  usbfs: simplify the lookup-by-minor routines
  usbfs: send disconnect signals when device is unregistered
  USB: Force unbinding of drivers lacking reset_resume or other methods
  USB: ohci-pnx4008: I2C cleanups and fixes
  USB: debug port converter does not accept more than 8 byte packets
  ...
2008-07-21 15:42:53 -07:00
Alan Stern
78d9a487ee USB: Force unbinding of drivers lacking reset_resume or other methods
This patch (as1024) takes care of a FIXME issue: Drivers that don't
have the necessary suspend, resume, reset_resume, pre_reset, or
post_reset methods will be unbound and their interface reprobed when
one of the unsupported events occurs.

This is made slightly more difficult by the fact that bind operations
won't work during a system sleep transition.  So instead the code has
to defer the operation until the transition ends.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 15:16:40 -07:00
Ming Lei
742120c631 USB: fix usb_reset_device and usb_reset_composite_device(take 3)
This patch renames the existing usb_reset_device in hub.c to
usb_reset_and_verify_device and renames the existing
usb_reset_composite_device to usb_reset_device. Also the new
usb_reset_and_verify_device does't need to be EXPORTED .

The idea of the patch is that external interface driver
should warn the other interfaces' driver of the same
device before and after reseting the usb device. One interface
driver shoud call _old_ usb_reset_composite_device instead of
_old_ usb_reset_device since it can't assume the device contains
only one interface. The _old_ usb_reset_composite_device
is safe for single interface device also. we rename the two
functions to make the change easily.

This patch is under guideline from Alan Stern.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
2008-07-21 15:16:33 -07:00
Ming Lei
625f694936 USB: remove interface parameter of usb_reset_composite_device
From the current implementation of usb_reset_composite_device
function, the iface parameter is no longer useful. This function
doesn't do something special for the iface usb_interface,compared
with other interfaces in the usb_device. So remove the parameter
and fix the related caller.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 15:16:32 -07:00
Alan Stern
f579c2b46f USB Gadget: documentation update
This patch (as1102) clarifies two points in the USB Gadget kerneldoc:

	Request completion callbacks are always made with interrupts
	disabled;

	Device controllers may not support STALLing the status stage
	of a control transfer after the data stage is over.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 15:16:27 -07:00
Felipe Balbi
e0d795e4f3 usb: irda: cleanup on ir-usb module
General cleanup on ir-usb module. Introduced
a common header that could be used also on
usb gadget framework.

Lot's of cleanups and now using macros from the header
file.

Signed-off-by: Felipe Balbi <me@felipebalbi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 15:16:27 -07:00
David Brownell
40982be52d usb gadget: composite gadget core
Add <linux/usb/composite.h> interfaces for composite gadget drivers, and
basic implementation support behind it:

  - struct usb_function ... groups one or more interfaces into a function
    managed as one unit within a configuration, to which it's added by
    usb_add_function().

  - struct usb_configuration ... groups one or more such functions into
    a configuration managed as one unit by a driver, to which it's added
    by usb_add_config().  These operate at either high or full/low speeds
    and at a given bMaxPower.

  - struct usb_composite_driver ... groups one or more such configurations
    into a gadget driver, which may be registered or unregistered.

  - struct usb_composite_dev ... a usb_composite_driver manages this; it
    wraps the usb_gadget exposed by the controller driver.

This also includes some basic kerneldoc.

How to use it (the short version):  provide a usb_composite_driver with a
bind() that calls usb_add_config() for each of the needed configurations.
The configurations in turn have bind() calls, which will usb_add_function()
for each function required.  Each function's bind() allocates resources
needed to perform its tasks, like endpoints; sometimes configurations will
allocate resources too.

Separate patches will convert most gadget drivers to this infrastructure.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 15:16:01 -07:00
David Brownell
a4c39c41bf usb gadget: descriptor copying support
Define three new descriptor manipulation utilities, for use when
setting up functions that may have multiple instances:

	usb_copy_descriptors() to copy a vector of descriptors
	usb_free_descriptors() to free the copy
	usb_find_endpoint() to find a copied version

These will be used as follows.  Functions will continue to have static
tables of descriptors they update, now used as __initdata templates.

When a function creates a new instance, it patches those tables with
relevant interface and string IDs, plus endpoint assignments.  Then it
copies those morphed descriptors, associates the copies with the new
function instance, and records the endpoint descriptors to use when
activating the endpoints.  When initialization is done, only the copies
remain in memory.  The copies are freed on driver removal.

This ensures that each instance has descriptors which hold the right
instance-specific data.  Two instances in the same configuration will
obviously never share the same interface IDs or use the same endpoints.
Instances in different configurations won't do so either, which means
this is slightly less memory-efficient in some cases.

This also includes a bugfix to the epautoconf code that shows up with
this usage model.  It must replace the previous endpoint number when
updating the template descriptors, not just mask in a few more bits.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 15:16:00 -07:00
Adrian Bunk
ea05af61a8 USB: remove CVS keywords
This patch removes CVS keywords that weren't updated for a long time
from comments.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 15:15:55 -07:00
Alan Stern
9da82bd464 USB: implement "soft" unbinding
This patch (as1091) changes the way usbcore handles interface
unbinding.  If the interface's driver supports "soft" unbinding (a new
flag in the driver structure) then in-flight URBs are not cancelled
and endpoints are not disabled.  Instead the driver is allowed to
continue communicating with the device (although of course it should
stop before its disconnect routine returns).

The purpose of this change is to allow drivers to do a clean shutdown
when they get unbound from a device that is still plugged in.  Killing
all the URBs and disabling the endpoints before calling the driver's
disconnect method doesn't give the driver any control over what
happens, and it can leave devices in indeterminate states.  For
example, when usb-storage unbinds it doesn't want to stop while in the
middle of transmitting a SCSI command.

The soft_unbind flag is added because in the past, a number of drivers
have experienced problems related to ongoing I/O after their disconnect
routine returned.  Hence "soft" unbinding is made available only to
drivers that claim to support it.

The patch also replaces "interface_to_usbdev(intf)" with "udev" in a
couple of places, a minor simplification.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 15:15:54 -07:00
Greg Kroah-Hartman
1b26da1510 USB: handle pci_name() being const
This changes usb_create_hcd() to be able to handle the fact that
pci_name() has changed to a constant string.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-07-21 15:15:46 -07:00
Linus Torvalds
6d52dcbe56 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] cpufreq: remove CVS keywords
  [CPUFREQ] change cpu freq arrays to per_cpu variables
2008-07-21 15:10:37 -07:00
Linus Torvalds
eb4225b2da Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6: (25 commits)
  mmtimer: Push BKL down into the ioctl handler
  [IA64] Remove experimental status of kdump
  [IA64] Update ia64 mmr list for SGI uv
  [IA64] Avoid overflowing ia64_cpu_to_sapicid in acpi_map_lsapic()
  [IA64] adding parameter check to module_free()
  [IA64] improper printk format in acpi-cpufreq
  [IA64] pv_ops: move some functions in ivt.S to avoid lack of space.
  [IA64] pvops: documentation on ia64/pv_ops
  [IA64] pvops: add to hooks, pv_time_ops, for steal time accounting.
  [IA64] pvops: add hooks, pv_irq_ops, to paravirtualized irq related operations.
  [IA64] pvops: add hooks, pv_iosapic_ops, to paravirtualize iosapic.
  [IA64] pvops: define initialization hooks, pv_init_ops, for paravirtualized environment.
  [IA64] pvops: paravirtualize NR_IRQS
  [IA64] pvops: paravirtualize entry.S
  [IA64] pvops: paravirtualize ivt.S
  [IA64] pvops: paravirtualize minstate.h.
  [IA64] pvops: define paravirtualized instructions for native.
  [IA64] pvops: preparation for paravirtulization of hand written assembly code.
  [IA64] pvops: introduce pv_cpu_ops to paravirtualize privileged instructions.
  [IA64] pvops: add an early setup hook for pv_ops.
  ...
2008-07-21 14:55:23 -07:00
David S. Miller
ebb36a9781 ipv6: __KERNEL__ ifdef struct ipv6_devconf
Based upon a report by Olaf Hering.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-21 13:41:16 -07:00
Arjan van de Ven
6579e57b31 net: Print the module name as part of the watchdog message
As suggested by Dave:

This patch adds a function to get the driver name from a struct net_device,
and consequently uses this in the watchdog timeout handler to print as 
part of the message. 

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-21 13:31:48 -07:00
David S. Miller
74988bd85d sparc64: Do not define BIO_VMERGE_BOUNDARY.
The IOMMU code and the block layer can split things
up using different rules, so this can't work reliably.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-21 13:17:38 -07:00
Linus Torvalds
3488007afc Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: convert Dprintk to pr_debug
2008-07-21 13:02:00 -07:00
Thomas Gleixner
cfc1b9a6a6 x86: convert Dprintk to pr_debug
There are a couple of places where (P)Dprintk is used which is an old
compile time enabled printk wrapper. Convert it to the generic
pr_debug().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-07-21 21:35:38 +02:00
Linus Torvalds
e89970aa93 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  netfilter: nf_conntrack_sctp: fix sparse warnings
  netfilter: nf_nat_sip: c= is optional for session
  netfilter: xt_TCPMSS: collapse tcpmss_reverse_mtu{4,6} into one function
  netfilter: nfnetlink_log: send complete hardware header
  netfilter: xt_time: fix time's time_mt()'s use of do_div()
  netfilter: accounting rework: ct_extend + 64bit counters (v4)
  netlink: add NLA_PUT_BE64 macro
  netfilter: nf_nat_core: eliminate useless find_appropriate_src for IP_NAT_RANGE_PROTO_RANDOM
  hdlcdrv: Fix CRC calculation.
  Revert "pkt_sched: Make default qdisc nonshared-multiqueue safe."
  net: In __netif_schedule() use WARN_ON instead of BUG_ON
  net: Improve simple_tx_hash().
  pkt_sched: Remove unused variable skb in dev_deactivate_queue function.
  sunhme: Remove stop/wake TX queue calls in set-multicast-list handler.
  ucc_geth: do not touch net queue in adjust_link phylib callback
  gianfar: do not touch net queue in adjust_link phylib callback
  atl1: Do not wake queue before queue has been started.
2008-07-21 11:29:52 -07:00
Linus Torvalds
72a73693aa Merge branch 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (160 commits)
  x86: remove extra calling to get ext cpuid level
  x86: use setup_clear_cpu_cap() when disabling the lapic
  KVM: fix exception entry / build bug, on 64-bit
  x86: add unknown_nmi_panic kernel parameter
  x86, VisWS: turn into generic arch, eliminate leftover files
  x86: add ->pre_time_init to x86_quirks
  x86: extend and use x86_quirks to clean up NUMAQ code
  x86: introduce x86_quirks
  x86: improve debug printout: add target bootmem range in early_res_to_bootmem()
  Subject: devmem, x86: fix rename of CONFIG_NONPROMISC_DEVMEM
  x86: remove arch_get_ram_range
  x86: Add a debugfs interface to dump PAT memtype
  x86: Add a arch directory for x86 under debugfs
  x86: i386: reduce boot fixmap space
  i386/xen: add proper unwind annotations to xen_sysenter_target
  x86: reduce force_mwait visibility
  x86: reduce forbid_dac's visibility
  x86: fix two modpost warnings
  x86: check function status in EDD boot code
  x86_64: ia32_signal.c: remove signal number conversion
  ...
2008-07-21 10:34:25 -07:00
Linus Torvalds
b7e6f62fe2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
  dm crypt: add merge
  dm table: remove merge_bvec sector restriction
  dm: linear add merge
  dm: introduce merge_bvec_fn
  dm snapshot: use per device mempools
  dm snapshot: fix race during exception creation
  dm snapshot: track snapshot reads
  dm mpath: fix test for reinstate_path
  dm mpath: return parameter error
  dm io: remove struct padding
  dm log: make dm_dirty_log init and exit static
  dm mpath: free path selector on invalid args
2008-07-21 10:30:10 -07:00
Linus Torvalds
8a392625b6 Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md: (52 commits)
  md: Protect access to mddev->disks list using RCU
  md: only count actual openers as access which prevent a 'stop'
  md: linear: Make array_size sector-based and rename it to array_sectors.
  md: Make mddev->array_size sector-based.
  md: Make super_type->rdev_size_change() take sector-based sizes.
  md: Fix check for overlapping devices.
  md: Tidy up rdev_size_store a bit:
  md: Remove some unused macros.
  md: Turn rdev->sb_offset into a sector-based quantity.
  md: Make calc_dev_sboffset() return a sector count.
  md: Replace calc_dev_size() by calc_num_sectors().
  md: Make update_size() take the number of sectors.
  md: Better control of when do_md_stop is allowed to stop the array.
  md: get_disk_info(): Don't convert between signed and unsigned and back.
  md: Simplify restart_array().
  md: alloc_disk_sb(): Return proper error value.
  md: Simplify sb_equal().
  md: Simplify uuid_equal().
  md: sb_equal(): Fix misleading printk.
  md: Fix a typo in the comment to cmd_match().
  ...
2008-07-21 10:29:12 -07:00
Linus Torvalds
519f0141f1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (48 commits)
  Input: add switch for dock events
  Input: add microphone insert switch definition
  Input: i8042 - add Arima-Rioworks HDAMB board to noloop list
  Input: sgi_btns - add support for SGI Indy volume buttons
  Input: add option to disable HP SDC driver
  Input: serio - trivial documentation fix
  Input: add new serio driver for Xilinx XPS PS2 IP
  Input: add driver for Tabletkiosk Sahara TouchIT-213 touchscreen
  Input: new driver for SGI O2 volume buttons
  Input: yealink - reliably kill urbs
  Input: q40kbd - make q40kbd_lock static
  Input: gtco - eliminate early return
  Input: i8042 - add Dritek quirk for Acer Aspire 5720
  Input: usbtouchscreen - ignore eGalax screens supporting HID protocol
  Input: i8042 - add Medion NAM 2070 to noloop blacklist
  Input: i8042 - add Gericom Bellagio to nomux blacklist
  Input: i8042 - add Acer Aspire 1360 to nomux blacklist
  Input: hp_sdc_mlc.c - make a struct static
  Input: hil_mlc.c - make code static
  Input: wistron - generate normal key event if bluetooth or wifi not present
  ...
2008-07-21 10:27:31 -07:00
Eric Leblond
72961ecf84 netfilter: nfnetlink_log: send complete hardware header
This patch adds some fields to NFLOG to be able to send the complete
hardware header with all necessary informations.
It sends to userspace:
 * the type of hardware link
 * the lenght of hardware header
 * the hardware header

Signed-off-by: Eric Leblond <eric@inl.fr>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-21 10:11:00 -07:00
Krzysztof Piotr Oledzki
584015727a netfilter: accounting rework: ct_extend + 64bit counters (v4)
Initially netfilter has had 64bit counters for conntrack-based accounting, but
it was changed in 2.6.14 to save memory. Unfortunately in-kernel 64bit counters are
still required, for example for "connbytes" extension. However, 64bit counters
waste a lot of memory and it was not possible to enable/disable it runtime.

This patch:
 - reimplements accounting with respect to the extension infrastructure,
 - makes one global version of seq_print_acct() instead of two seq_print_counters(),
 - makes it possible to enable it at boot time (for CONFIG_SYSCTL/CONFIG_SYSFS=n),
 - makes it possible to enable/disable it at runtime by sysctl or sysfs,
 - extends counters from 32bit to 64bit,
 - renames ip_conntrack_counter -> nf_conn_counter,
 - enables accounting code unconditionally (no longer depends on CONFIG_NF_CT_ACCT),
 - set initial accounting enable state based on CONFIG_NF_CT_ACCT
 - removes buggy IPCT_COUNTER_FILLING event handling.

If accounting is enabled newly created connections get additional acct extend.
Old connections are not changed as it is not possible to add a ct_extend area
to confirmed conntrack. Accounting is performed for all connections with
acct extend regardless of a current state of "net.netfilter.nf_conntrack_acct".

Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-21 10:10:58 -07:00
Krzysztof Piotr Oledzki
07a7c1070e netlink: add NLA_PUT_BE64 macro
Add NLA_PUT_BE64 macro required for 64bit counters in netfilter

Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-21 10:10:58 -07:00
Linus Torvalds
f8b71a3a92 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: (44 commits)
  sparc: Remove Sparc's asm-offsets for sclow.S
  sparc64: Update defconfig.
  sparc64: Add Niagara2 RNG driver.
  sparc64: Add missing hypervisor service group numbers.
  sparc64: Remove 4MB and 512K base page size options.
  sparc64: Convert to generic helpers for IPI function calls.
  sparc: Use new '%pS' infrastructure to print symbols.
  sparc32: fix init.c allnoconfig build error
  sparc64: Config category "Processor type and features" absent
  sparc: arch/sparc/kernel/apc.c to unlocked_ioctl
  sparc: join the remaining header files
  sparc: merge header files with trivial differences
  sparc: when header files are equal use asm-sparc version
  sparc: copy sparc64 specific files to asm-sparc
  sparc: Merge asm-sparc{,64}/asi.h
  sparc: export openprom.h to userspace
  sparc: Merge asm-sparc{,64}/types.h
  sparc: Merge asm-sparc{,64}/termios.h
  sparc: Merge asm-sparc{,64}/termbits.h
  sparc: Merge asm-sparc{,64}/setup.h
  ...
2008-07-21 09:40:26 -07:00
Ingo Molnar
eb6a12c242 Merge branch 'linus' into cpus4096-for-linus
Conflicts:

	net/sunrpc/svc.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-21 17:19:50 +02:00
Ingo Molnar
2e2dcc7631 Merge branch 'x86/paravirt-spinlocks' into x86/for-linus 2008-07-21 16:45:56 +02:00
Ingo Molnar
acee709cab Merge branches 'x86/urgent', 'x86/amd-iommu', 'x86/apic', 'x86/cleanups', 'x86/core', 'x86/cpu', 'x86/fixmap', 'x86/gart', 'x86/kprobes', 'x86/memtest', 'x86/modules', 'x86/nmi', 'x86/pat', 'x86/reboot', 'x86/setup', 'x86/step', 'x86/unify-pci', 'x86/uv', 'x86/xen' and 'xen-64bit' into x86/for-linus 2008-07-21 16:37:17 +02:00
Ingo Molnar
e66d90fb4a Merge branch 'linus' into xen-64bit 2008-07-21 15:06:09 +02:00
Ingo Molnar
1c29dd9a9e Merge branch 'linus' into x86/paravirt-spinlocks 2008-07-21 15:05:58 +02:00
Milan Broz
f6fccb1213 dm: introduce merge_bvec_fn
Introduce a bvec merge function for device mapper devices
for dynamic size restrictions.

This code ensures the requested biovec lies within a single
target and then calls a target-specific function to check
against any constraints imposed by underlying devices.

Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2008-07-21 12:00:37 +01:00
Ingo Molnar
33a37eb411 KVM: fix exception entry / build bug, on 64-bit
-tip testing found this build bug:

 arch/x86/kvm/built-in.o:(.text.fixup+0x1): relocation truncated to fit: R_X86_64_32 against `.text'
 arch/x86/kvm/built-in.o:(.text.fixup+0xb): relocation truncated to fit: R_X86_64_32 against `.text'
 arch/x86/kvm/built-in.o:(.text.fixup+0x15): relocation truncated to fit: R_X86_64_32 against `.text'
 arch/x86/kvm/built-in.o:(.text.fixup+0x1f): relocation truncated to fit: R_X86_64_32 against `.text'
 arch/x86/kvm/built-in.o:(.text.fixup+0x29): relocation truncated to fit: R_X86_64_32 against `.text'

Introduced by commit 4ecac3fd. The problem is that 'push' will default
to 32-bit, which is not wide enough as a fixup address. (and which would
crash on any real fixup event even if it was wide enough)

Introduce KVM_EX_PUSH to get the proper address push width on 64-bit too.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-21 11:03:32 +02:00
Ingo Molnar
e27772b48d Merge branch 'linus' into x86/urgent 2008-07-21 11:02:45 +02:00
NeilBrown
4b80991c6c md: Protect access to mddev->disks list using RCU
All modifications and most access to the mddev->disks list are made
under the reconfig_mutex lock.  However there are three places where
the list is walked without any locking.  If a reconfig happens at this
time, havoc (and oops) can ensue.

So use RCU to protect these accesses:
  - wrap them in rcu_read_{,un}lock()
  - use list_for_each_entry_rcu
  - add to the list with list_add_rcu
  - delete from the list with list_del_rcu
  - delay the 'free' with call_rcu rather than schedule_work

Note that export_rdev did a list_del_init on this list.  In almost all
cases the entry was not in the list anymore so it was a no-op and so
safe.  It is no longer safe as after list_del_rcu we may not touch
the list_head.
An audit shows that export_rdev is called:
  - after unbind_rdev_from_array, in which case the delete has
     already been done,
  - after bind_rdev_to_array fails, in which case the delete isn't needed.
  - before the device has been put on a list at all (e.g. in
      add_new_disk where reading the superblock fails).
  - and in autorun devices after a failure when the device is on a
      different list.

So remove the list_del_init call from export_rdev, and add it back
immediately before the called to export_rdev for that last case.

Note also that ->same_set is sometimes used for lists other than
mddev->list (e.g. candidates).  In these cases rcu is not needed.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-07-21 17:05:25 +10:00
NeilBrown
f2ea68cf42 md: only count actual openers as access which prevent a 'stop'
Open isn't the only thing that increments ->active.  e.g. reading
/proc/mdstat will increment it briefly.  So to avoid false positives
in testing for concurrent access, introduce a new counter that counts
just the number of times the md device it open.

Signed-off-by: NeilBrown <neilb@suse.de>
2008-07-21 17:05:25 +10:00
Andre Noll
d6e2215052 md: linear: Make array_size sector-based and rename it to array_sectors.
Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
2008-07-21 17:05:25 +10:00
Andre Noll
f233ea5c9e md: Make mddev->array_size sector-based.
This patch renames the array_size field of struct mddev_s to array_sectors
and converts all instances to use units of 512 byte sectors instead of 1k
blocks.

Signed-off-by: Andre Noll <maan@systemlinux.org>
Signed-off-by: NeilBrown <neilb@suse.de>
2008-07-21 17:05:22 +10:00
Dmitry Torokhov
908cf4b925 Merge master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 into next 2008-07-21 00:55:14 -04:00
Linus Torvalds
14b395e35d Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.27' of git://linux-nfs.org/~bfields/linux: (51 commits)
  nfsd: nfs4xdr.c do-while is not a compound statement
  nfsd: Use C99 initializers in fs/nfsd/nfs4xdr.c
  lockd: Pass "struct sockaddr *" to new failover-by-IP function
  lockd: get host reference in nlmsvc_create_block() instead of callers
  lockd: minor svclock.c style fixes
  lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock
  lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock
  lockd: nlm_release_host() checks for NULL, caller needn't
  file lock: reorder struct file_lock to save space on 64 bit builds
  nfsd: take file and mnt write in nfs4_upgrade_open
  nfsd: document open share bit tracking
  nfsd: tabulate nfs4 xdr encoding functions
  nfsd: dprint operation names
  svcrdma: Change WR context get/put to use the kmem cache
  svcrdma: Create a kmem cache for the WR contexts
  svcrdma: Add flush_scheduled_work to module exit function
  svcrdma: Limit ORD based on client's advertised IRD
  svcrdma: Remove unused wait q from svcrdma_xprt structure
  svcrdma: Remove unneeded spin locks from __svc_rdma_free
  svcrdma: Add dma map count and WARN_ON
  ...
2008-07-20 21:21:46 -07:00
Linus Torvalds
d1671a9c15 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  pkt_sched: Fix build with NET_SCHED disabled.
2008-07-20 21:17:20 -07:00
Linus Torvalds
ae0645a451 Merge branch 'for-linus' of git://git.o-hand.com/linux-mfd
* 'for-linus' of git://git.o-hand.com/linux-mfd:
  mfd: let asic3 use mem resource instead of bus_shift
  mfd: remove DS1WM register definitions from asic3.h
  mfd: add ASIC3_CONFIG_GPIO templates
  mfd: fix the asic3 irq demux code
  mfd: asic3 should depend on gpiolib
  mfd: fix asic3 config array initialisation
  mfd: move asic3 probe functions into __init section
  mfd: Use uppercase only for asic3 macros and defines
  mfd: use dev_* macros for asic3 debugging
  mfd: New asic3 gpio configuration code
  mfd: asic3 children platform data removal
  mfd: asic3 gpiolib support
2008-07-20 21:16:27 -07:00
Linus Torvalds
f894d18380 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb
* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (277 commits)
  V4L/DVB (8415): gspca: Infinite loop in i2c_w() of etoms.
  V4L/DVB (8414): videodev/cx18: fix get_index bug and error-handling lock-ups
  V4L/DVB (8411): videobuf-dma-contig.c: fix 64-bit build for pre-2.6.24 kernels
  V4L/DVB (8410): sh_mobile_ceu_camera: fix 64-bit compiler warnings
  V4L/DVB (8397): video: convert select VIDEO_ZORAN_ZR36060 into depends on
  V4L/DVB (8396): video: Fix Kbuild dependency for VIDEO_IR_I2C
  V4L/DVB (8395): saa7134: Fix Kbuild dependency of ir-kbd-i2c
  V4L/DVB (8394): ir-common: CodingStyle fix: move EXPORT_SYMBOL_GPL to their proper places
  V4L/DVB (8393): media/video: Fix depencencies for VIDEOBUF
  V4L/DVB (8392): media/Kconfig: Convert V4L1_COMPAT select into "depends on"
  V4L/DVB (8390): videodev: add comment and remove magic number.
  V4L/DVB (8389): videodev: simplify get_index()
  V4L/DVB (8387): Some cosmetic changes
  V4L/DVB (8381): ov7670: fix compile warnings
  V4L/DVB (8380): saa7115: use saa7115_auto instead of saa711x as the autodetect driver name.
  V4L/DVB (8379): saa7127: Make device detection optional
  V4L/DVB (8378): cx18: move cx18_av_vbi_setup to av-core.c and rename to cx18_av_std_setup
  V4L/DVB (8377): ivtv/cx18: ensure the default control values are correct
  V4L/DVB (8376): cx25840: move cx25840_vbi_setup to core.c and rename to cx25840_std_setup
  V4L/DVB (8374): gspca: No conflict of 0c45:6011 with the sn9c102 driver.
  ...
2008-07-20 21:14:42 -07:00
Linus Torvalds
d13ff0559f Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (21 commits)
  [MIPS] Remove unused maltasmp.h.
  [MIPS] Remove unused saa9730_uart.h.
  [MIPS] Rename MIPS sys_pipe syscall entry point to something MIPS-specific.
  [MIPS] 32-bit compat: Delete unused sys_truncate64 and sys_ftruncate64.
  [MIPS] TXx9: Fix some sparse warnings
  [MIPS] TXx9: Add 64-bit support
  [MIPS] TXx9: Cleanups for 64-bit support
  [MIPS] Cobalt: Fix I/O port resource range
  [MIPS] don't leak setup_early_printk() in userspace header
  [MIPS] Remove include/asm-mips/mips-boards/sead{,int}.h
  [MIPS] Remove asm-mips/mips-boards/atlas{,int}.h
  [MIPS] mips/sgi-ip22/ip28-berr.c: fix the build
  [MIPS] TXx9: Miscellaneous build fixes
  [MIPS] Routerboard 532: Support for base system
  [MIPS] IP32: Use common SGI button driver
  [MIPS] IP22: Use common SGI button driver
  [MIPS] IP22, IP28: Fix merge bug
  [MIPS] Tinker with constraints in <asm/atomic.h> to fix build error.
  [MIPS] Add missing prototypes to asm/page.h
  [MIPS] Fix missing prototypes in asm/fpu.h
  ...
2008-07-20 21:14:00 -07:00
Linus Torvalds
f076ab8d04 Merge branch 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
* 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (70 commits)
  KVM: Adjust smp_call_function_mask() callers to new requirements
  KVM: MMU: Fix potential race setting upper shadow ptes on nonpae hosts
  KVM: x86 emulator: emulate clflush
  KVM: MMU: improve invalid shadow root page handling
  KVM: MMU: nuke shadowed pgtable pages and ptes on memslot destruction
  KVM: Prefix some x86 low level function with kvm_, to avoid namespace issues
  KVM: check injected pic irq within valid pic irqs
  KVM: x86 emulator: Fix HLT instruction
  KVM: Apply the kernel sigmask to vcpus blocked due to being uninitialized
  KVM: VMX: Add ept_sync_context in flush_tlb
  KVM: mmu_shrink: kvm_mmu_zap_page requires slots_lock to be held
  x86: KVM guest: make kvm_smp_prepare_boot_cpu() static
  KVM: SVM: fix suspend/resume support
  KVM: s390: rename private structures
  KVM: s390: Set guest storage limit and offset to sane values
  KVM: Fix memory leak on guest exit
  KVM: s390: dont allocate dirty bitmap
  KVM: move slots_lock acquision down to vapic_exit
  KVM: VMX: Fake emulate Intel perfctr MSRs
  KVM: VMX: Fix a wrong usage of vmcs_config
  ...
2008-07-20 21:13:26 -07:00
David S. Miller
3a682fbd73 pkt_sched: Fix build with NET_SCHED disabled.
The stab bits can't be referenced uniless the full
packet scheduler layer is enabled.

Reported by Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20 18:13:01 -07:00
Linus Torvalds
db6d8c7a40 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (1232 commits)
  iucv: Fix bad merging.
  net_sched: Add size table for qdiscs
  net_sched: Add accessor function for packet length for qdiscs
  net_sched: Add qdisc_enqueue wrapper
  highmem: Export totalhigh_pages.
  ipv6 mcast: Omit redundant address family checks in ip6_mc_source().
  net: Use standard structures for generic socket address structures.
  ipv6 netns: Make several "global" sysctl variables namespace aware.
  netns: Use net_eq() to compare net-namespaces for optimization.
  ipv6: remove unused macros from net/ipv6.h
  ipv6: remove unused parameter from ip6_ra_control
  tcp: fix kernel panic with listening_get_next
  tcp: Remove redundant checks when setting eff_sacks
  tcp: options clean up
  tcp: Fix MD5 signatures for non-linear skbs
  sctp: Update sctp global memory limit allocations.
  sctp: remove unnecessary byteshifting, calculate directly in big-endian
  sctp: Allow only 1 listening socket with SO_REUSEADDR
  sctp: Do not leak memory on multiple listen() calls
  sctp: Support ipv6only AF_INET6 sockets.
  ...
2008-07-20 17:43:29 -07:00
Linus Torvalds
3a53337428 Merge branch 'for-linus' of git://www.jni.nu/cris
* 'for-linus' of git://www.jni.nu/cris:
  [CRISv10] Clean up compressed/misc.c
  [CRISv10] Correct whitespace damage.
  [CRIS] Correct definition of subdirs for install_headers.
  [CRIS] Correct image makefiles to allow using a separate OBJ-directory.
  [CRIS] Build fixes for compressed and rescue images for v10 and v32:
  It looks at least odd to apply spin_unlock to a mutex.
  cris: compile fixes for 2.6.26-rc5
2008-07-20 17:37:46 -07:00
Adrian Bunk
a594409a21 m68k: remove stale ARCH_SUN4 #define
m68k: remove stale ARCH_SUN4 #define

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:24:39 -07:00
Adrian Bunk
07b8125949 m68k/sun3/: possible cleanups
This patch contains the following possible cleanups:
- make the following needlessly global code static:
  - config.c: sun3_bootmem_alloc()
  - config.c: sun3_sched_init()
  - dvma.c: dvma_page()
  - idprom.c: struct Sun_Machines[]
  - mmu_emu.c: struct ctx_alloc[]
  - sun3dvma.c: iommu_use[]
  - sun3ints.c: led_pattern[]
- remove the unused sbus.c

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:24:39 -07:00
Adrian Bunk
8dfbdf4aba m68k/mac/: possible cleanups
This patch contains the following possible cleanups:
- make the following needlessly global code (always) static:
  - baboon.c: struct baboon
  - baboon.c: baboon_irq()
  - config.c: mac_orig_videoaddr
  - config.c: mac_identify()
  - config.c: mac_report_hardware()
  - config.c: mac_debug_console_write()
  - config.c: mac_sccb_console_write()
  - config.c: mac_scca_console_write()
  - config.c: mac_init_scc_port()
  - oss.c: oss_irq()
  - oss.c: oss_nubus_irq()
  - psc.c: psc_debug_dump()
  - psc.c: psc_dma_die_die_die()
  - via.c: rbv_clear
- remove the unused bootparse.c
- #if 0 the following unused functions:
  - config.c: mac_debugging_short()
  - config.c: mac_debugging_long()
- remove the following unused code:
  - config.c: mac_bisize
  - config.c: mac_env
  - config.c: mac_SCC_init_done
  - config.c: mac_SCC_reset_done
  - config.c: mac_init_scca_port()
  - config.c: mac_init_sccb_port()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:24:39 -07:00
Adrian Bunk
0795dbcc4c m68k/amiga/: possible cleanups
This patch contains the following possible cleanups:
- amiints.c: add a proper prototype for amiga_init_IRQ() in
             include/asm-m68k/amigaints.h
- make the following needlessly global code static:
  - config.c: amiga_model
  - config.c: amiga_psfreq
  - config.c: amiga_serial_console_write()
- #if 0 the following unused functions:
  - config.c: amiga_serial_puts()
  - config.c: amiga_serial_console_wait_key()
  - config.c: amiga_serial_gets()
- remove the following unused variable:
  - config.c: amiga_masterclock

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:24:39 -07:00
Adrian Bunk
d33b4432e6 m68k: remove AP1000 code
Unless I miss something that's code for a sparc machine even the sparc
code no longer supports that got copied to m68k when these files were
copied.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:24:38 -07:00
Geert Uytterhoeven
edfd92f67e m68k: Allow no CPU/platform type for allnoconfig
Allow no CPU/platform type for allnoconfig
  - Provide a dummy value for FPSTATESIZE if no CPU type was selected
  - Provide a dummy value for NR_IRQS if no platform type was selected
  - Warn the user if no CPU or platform type was selected

Note: you still cannot build an allnoconfig kernel, as CONFIG_SWAP=n doesn't
build and we cannot easily fix that
(http://groups.google.com/group/linux.kernel/browse_thread/thread/d430c78b07e1827b)

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:24:38 -07:00
Adrian Bunk
f30828a674 m68k: remove CVS keywords
This patch removes CVS keywords that weren't updated for a long time
from comments.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:24:38 -07:00
Linus Torvalds
f7df406dce Merge branch 'configfs-fixup-ptr-error' of git://oss.oracle.com/git/jlbec/linux-2.6
* 'configfs-fixup-ptr-error' of git://oss.oracle.com/git/jlbec/linux-2.6:
  configfs: Allow ->make_item() and ->make_group() to return detailed errors.
  Revert "configfs: Allow ->make_item() and ->make_group() to return detailed errors."
2008-07-20 17:17:52 -07:00
Alan Cox
44b7d1b37f tty: add more tty_port fields
Move more bits into the tty_port structure

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:12:38 -07:00
Alan Cox
593573bc55 termios: Termios defines for other platforms
Fix up the termios of the people who have not yet got with the program

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:12:38 -07:00
Alan Cox
77451e53e0 cyclades: use tty_port
Switch cyclades to use the new tty_port structure

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:12:38 -07:00
Alan Cox
f8ae476416 stallion: use tty_port
Switch the stallion driver to use the tty_port structure

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:12:37 -07:00
Alan Cox
b02f5ad6a3 istallion: use tty_port
Switch istallion to use the new tty_port structure

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:12:37 -07:00
Alan Cox
b5391e29f4 gs: use tty_port
Switch drivers using the old "generic serial" driver to use the tty_port
structures

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:12:36 -07:00
Alan Cox
4982d6b37a esp: use tty_port
Switch esp to use the new tty_port structures

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:12:36 -07:00
Alan Cox
7a4d29f426 tty.h: clean up
Coding style clean up and white space tidy

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:12:36 -07:00
Alan Cox
df4f4dd429 serial: use tty_port
Switch the serial_core based drivers to use the new tty_port structure.
We can't quite use all of it yet because of the dynamically allocated
extras in the serial_core layer.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:12:35 -07:00
Alan Cox
6f67048cd0 tty: Introduce a tty_port common structure
Every tty driver has its own concept of a port structure and because
they all differ we cannot extract commonality.  Begin fixing this by
creating a structure drivers can elect to use so that over time we can
push fields into this and create commonality and then introduce common
methods.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:12:35 -07:00
Alan Cox
a352def21a tty: Ldisc revamp
Move the line disciplines towards a conventional ->ops arrangement.  For
the moment the actual 'tty_ldisc' struct in the tty is kept as part of
the tty struct but this can then be changed if it turns out that when it
all settles down we want to refcount ldiscs separately to the tty.

Pull the ldisc code out of /proc and put it with our ldisc code.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:12:34 -07:00
Haavard Skinnemoen
6bb0e3a59a Subject: [PATCH 1/2] serial: Add flush_buffer() operation to uart_ops
Serial drivers using DMA (like the atmel_serial driver) tend to get very
confused when the xmit buffer is flushed and nobody told them.  They
also tend to spew a lot of garbage since the DMA engine keeps running
after the buffer is flushed and possibly refilled with unrelated data.

This patch adds a new flush_buffer operation to the uart_ops struct,
along with a call to it from uart_flush_buffer() right after the xmit
buffer has been cleared. The driver can implement this in order to
syncronize its internal DMA state with the xmit buffer when the buffer
is flushed.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-20 17:12:34 -07:00
Dmitry Baryshkov
8fa8b9fbab Cris: convert to using generic dma-coherent mem allocator
Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 21:21:59 +02:00
Philipp Zabel
99cdb0c8c5 mfd: let asic3 use mem resource instead of bus_shift
The bus_shift parameter in platform_data is not needed
as we can tell the driver with the IOMEM_RESOURCE whether
the ASIC is located on a 16bit or 32bit memory bus.

The htc-egpio driver uses a more descriptive bus_width parameter,
but for drivers where the register map size fixed, we don't even
need this.

Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
2008-07-20 19:56:44 +02:00
Philipp Zabel
279cac484e mfd: remove DS1WM register definitions from asic3.h
There is a dedicated ds1wm driver, no need to duplicate this
information here.

Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
2008-07-20 19:56:24 +02:00
Philipp Zabel
4a67b528e0 mfd: add ASIC3_CONFIG_GPIO templates
As ASIC3 GPIO alternate function configuration is expected to be similar
for several devices, it is convenient to define descriptive macros. This
patch is inspired by the PXA MFP configuration, the alternate functions
were observed on hx4700 and blueangel.

Signed-off-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
2008-07-20 19:56:19 +02:00
Samuel Ortiz
3b8139f8b1 mfd: Use uppercase only for asic3 macros and defines
Let's be consistent and use uppercase only, for both macro and defines.

Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-07-20 19:55:14 +02:00
Samuel Ortiz
3b26bf1722 mfd: New asic3 gpio configuration code
The ASIC3 GPIO configuration code is a bit obscure and hardly readable.
This patch changes it so that it is now more readable and understandable,
by being more explicit.

Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-07-20 19:55:00 +02:00
Samuel Ortiz
1effe5bc6c mfd: asic3 children platform data removal
Platform devices should be dynamically allocated, and each supported
device should have its own platform data.
For now we just remove this buggy code.

Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-07-20 19:54:51 +02:00
Samuel Ortiz
6f2384c4bd mfd: asic3 gpiolib support
ASIC3 is, among other things, a GPIO extender. We should thus have it
supporting the current gpiolib API.

Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-07-20 19:52:38 +02:00
Yoichi Yuasa
cb7f39d2bc [MIPS] Remove unused maltasmp.h.
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-20 14:38:22 +01:00
Yoichi Yuasa
3d739f2daa [MIPS] Remove unused saa9730_uart.h.
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-20 14:38:22 +01:00
Atsushi Nemoto
e0eb730757 [MIPS] TXx9: Fix some sparse warnings
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-20 14:38:21 +01:00
Atsushi Nemoto
94a4c32939 [MIPS] TXx9: Add 64-bit support
SYS_SUPPORTS_64BIT_KERNEL is enabled for RBTX4927/RBTX4938, but
actually it was broken for long time (or from the beginning).  Now it
should work.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-20 14:38:21 +01:00
Atsushi Nemoto
255033a9bb [MIPS] TXx9: Cleanups for 64-bit support
* Unify (and fix) mem_tx4938.c and mem_tx4927.c
* Simplify prom_init
* Kill volatiles and unused definitions for tx4927.h and tx4938.h

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-20 14:38:21 +01:00
Adrian Bunk
5f15d37876 [MIPS] don't leak setup_early_printk() in userspace header
Our userspace headers shouldn't contain prototypes of in-kernel
functions.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-20 14:38:19 +01:00
Adrian Bunk
f113c5eda2 [MIPS] Remove include/asm-mips/mips-boards/sead{,int}.h
include/asm-mips/mips-boards/sead{,int}.h are now obsolete.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-20 14:38:19 +01:00
Adrian Bunk
e76812eddc [MIPS] Remove asm-mips/mips-boards/atlas{,int}.h
asm-mips/mips-boards/atlas{,int}.h are now obsolete.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-20 14:38:19 +01:00
Ralf Baechle
73b4390fb2 [MIPS] Routerboard 532: Support for base system
Signed-off-by: Phil Sutter <n0-1@freewrt.org>
Signed-off-by: Florian Fainelli <florian.fainelli@telecomint.eu>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-20 14:38:18 +01:00
Ralf Baechle
d6d8a4635a [MIPS] Tinker with constraints in <asm/atomic.h> to fix build error.
[...]
  CC      init/main.o
include/asm/bitops.h: In function `start_kernel':
include/asm/bitops.h:76: warning: asm operand 2 probably doesn't match
constraints
include/asm/bitops.h:76: warning: asm operand 2 probably doesn't match
constraints
include/asm/bitops.h:76: warning: asm operand 2 probably doesn't match
constraints
include/asm/bitops.h:76: error: impossible constraint in `asm'
include/asm/bitops.h:76: error: impossible constraint in `asm'
include/asm/bitops.h:76: error: impossible constraint in `asm'
make[1]: *** [init/main.o] Error 1
[...]

The build error is caused by the ages old gcc bug where gcc at the time of
analyzing the constraints is unable to figure out that an "i" constraint
actually can be satisfied and thus will abort unless an "r" is added to
the constraint.  For the actual code generation gcc will only ever use the
"i" constraint.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-20 14:38:18 +01:00
Dmitri Vorobiev
c29d150305 [MIPS] Add missing prototypes to asm/page.h
This patch fixes the following sparse warnings:

>>>>>>>>>>>>>>>>>>
arch/mips/mm/page.c:284:16: warning: symbol
'build_clear_page' was not declared. Should it be static?

arch/mips/mm/page.c:426:16: warning: symbol 'build_copy_page'
was not declared. Should it be static?
>>>>>>>>>>>>>>>>>>

The fix is to add appropriate prototypes to the header
include/asm-mips/page.h.

Build-tested against Malta defconfig.

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-20 14:38:17 +01:00
Dmitri Vorobiev
f028b86056 [MIPS] Fix missing prototypes in asm/fpu.h
While building the Malta defconfig, sparse spat the following
warnings:

>>>>>>>>>>>>>>>>>>
arch/mips/math-emu/kernel_linkage.c:31:6: warning: symbol
'fpu_emulator_init_fpu' was not declared. Should it be static?

arch/mips/math-emu/kernel_linkage.c:54:5: warning: symbol
'fpu_emulator_save_context' was not declared. Should it be
static?

arch/mips/math-emu/kernel_linkage.c:68:5: warning: symbol
'fpu_emulator_restore_context' was not declared. Should it be
static?
>>>>>>>>>>>>>>>>>>

This patch fixes these errors by adding the proper prototypes
to the include/asm-mips/fpu.h header, and actually using this
header in the sparse-spotted source file.

Build-tested with Malta defconfig.

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-20 14:38:17 +01:00
Dmitri Vorobiev
3450004a8c [MIPS] PCI: Make the pcibios_max_latency variable static
The pcibios_max_latency variable is needlessly defined global, and this
patch makes it static.

Build-tested using malta_defconfig.

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-20 14:38:17 +01:00
Mauro Carvalho Chehab
1c22dad8ab V4L/DVB (8395): saa7134: Fix Kbuild dependency of ir-kbd-i2c
Currently, saa7134 is dependent of ir-kbd-i2c, since it uses a symbol that is
defined there. However, as this symbol is used only on saa7134, there's no
sense on keeping it defined there (or on ir-commons).

So, let's move it to saa7134 and remove one symbol for being exported.

Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:29:03 -03:00
Hans Verkuil
2bc93aa304 V4L/DVB (8387): Some cosmetic changes
Those changes, together with some proper patches, will allow out-of-tree 
compilation for for kernels < 2.6.19

Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:28:32 -03:00
Magnus Damm
326c986207 V4L/DVB (8343): soc_camera_platform: Add SoC Camera Platform driver
This patch adds a simple platform camera device. Useful for testing
cameras with SoC camera host drivers. Only one single pixel format
and resolution combination is supported.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:25:49 -03:00
Magnus Damm
0d3244d643 V4L/DVB (8342): sh_mobile_ceu_camera: Add SuperH Mobile CEU driver V3
This is V3 of the SuperH Mobile CEU soc_camera driver.

The CEU hardware block is configured in a transparent data fetch
mode, frames are captured from the attached camera and written to
physically contiguous memory buffers provided by the newly added
videobuf-dma-contig queue. Tested on sh7722 and sh7723 processors.

 Changes since V2:
 - remove SUPERH Kconfig dependency
 - move sh_mobile_ceu.h to include/media
 - add board callback support with enable_camera()/disable_camera()
 - add support for declare_coherent_memory
 - rework video memory limit
 - more verbose error messages

 Changes since V1:
 - fixed the CEU driver to work with the newly updated patches

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:25:43 -03:00
Magnus Damm
2cc45cf25a V4L/DVB (8341): videobuf: Add physically contiguous queue code V3
This is V3 of the physically contiguous videobuf queues patch.
Useful for hardware such as the SuperH Mobile CEU which doesn't
support scatter gatter bus mastering.

Since it may be difficult to allocate large chunks of physically
contiguous memory after some uptime due to fragmentation, this code
allocates memory using dma_alloc_coherent(). Architectures supporting
dma_declare_coherent_memory() can easily avoid fragmentation issues
by using dma_declare_coherent_memory() to force dma_alloc_coherent()
to allocate from a certain pre-allocated memory area.

 Changes since V2
  - use dma_handle for physical address
  - use "scatter gather" instead of "scatter gatter"

 Changes since V1:
  - use dev_err() instead of pr_err()
  - remember size in struct videobuf_dma_contig_memory
  - keep struct videobuf_dma_contig_memory in .c file
  - let videobuf_to_dma_contig() return dma_addr_t
  - implement __videobuf_sync()
  - return statements, white space and other minor fixes

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:25:37 -03:00
Magnus Damm
5d6aaf50e2 V4L/DVB (8340): videobuf: Fix gather spelling
Use "scatter gather" instead of "scatter gatter".

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:25:32 -03:00
Magnus Damm
b15cf1fcce V4L/DVB (8339): soc_camera: Add 16-bit bus width support
The SuperH Mobile CEU hardware supports 16-bit width bus,
so extend the soc_camera code with SOCAM_DATAWIDTH_16.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:25:26 -03:00
Magnus Damm
a034d1b76b V4L/DVB (8338): soc_camera: Move spinlocks
This patch moves the spinlock handling from soc_camera.c to the actual
camera host driver. The spinlock_alloc/free callbacks are replaced with
code in init_videobuf(). So far all camera host drivers implement their
own spinlock_alloc/free methods anyway, and videobuf_queue_core_init()
BUGs on a NULL spinlock argument, so, new camera host drivers will not
forget to provide a spinlock when initialising their videobuf queues.

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:25:21 -03:00
Paulius Zaleckas
092d392119 V4L/DVB (8337): soc_camera: make videobuf independent
Makes SoC camera videobuf independent. Includes all necessary changes for
PXA camera driver (currently the only driver using soc_camera in the mainline).
These changes are important for the future soc_camera based drivers.

Signed-off-by: Paulius Zaleckas <paulius.zaleckas@teltonika.lt>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@pengutronix.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:25:17 -03:00
Jean Delvare
5a367dfb73 V4L/DVB (8246): tvaudio: Stop I2C driver ID abuse
The tvaudio driver is using "official" I2C device IDs for internal
purpose. There must be some historical reason behind this but anyway,
it shouldn't do that. As the stored values are never used, the easiest
way to fix the problem is simply to remove them altogether.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:18:38 -03:00
Jean Delvare
be99af6679 V4L/DVB (8245): ovcamchip: Delete stray I2C bus ID
I2C_HW_SMBUS_OVFX2 is referenced in ovcamchip_core.c, but no bus uses
this driver ID, so we can remove the reference. As far as I can see,
the Cypress FX2 webcam is handled by a different driver (dvb-usb).

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:18:34 -03:00
Hans Verkuil
f87086e302 v4l-dvb: remove legacy checks to allow support for kernels < 2.6.10
Also remove some blank lines that were used to split compat code at -devel
tree.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:17:52 -03:00
Hans de Goede
ab8f12cf8e V4L/DVB (8197): gspca: pac207 frames no more decoded in the subdriver.
videodev2: New pixfmt
pac207:   Remove the specific decoding.
main:     get_buff_size operation added for the subdriver.

Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:17:02 -03:00
Hans de Goede
54ab92ca05 V4L/DVB (8194): gspca: Fix the format of the low resolution mode of spca561.
The low (half) res modes of the spca561 are not spca561 compressed, but are
raw bayer, this patches fixes this and adds a PIX_FMT define for the GBRG
bayer format used by the spca561 in low res mode.

Signed-off-by: Hans de Goede <j.w.r.degoede@hhs.nl>
Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:16:47 -03:00
Jean-Francois Moine
6a7eba24e4 V4L/DVB (8157): gspca: all subdrivers
- remaning subdrivers added
- remove the decoding helper and some specific frame decodings

Signed-off-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:14:49 -03:00
Al Viro
a36ef6b1e0 V4L/DVB (8128): saa7146: ->cpu_addr and friends are little-endian
Annotations + stop saa7146_i2c from playing fast and loose with
reuse of ->cpu_addr for host-endian.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:13:14 -03:00
Hans Verkuil
e0e31cdb91 V4L/DVB (8105): cx2341x: add TS capability
The cx18 can support transport streams with newer firmwares. Add a TS
capability to the generic cx2341x module.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:11:55 -03:00
Hans Verkuil
21575c1312 V4L/DVB (8103): videodev: fix/improve ioctl debugging
Various ioctl debugging fixes and improvements:

- use %x rather than %d for control IDs and bitmask fields
- make two arrays const
- show the whole control array for the ext_ctrl ioctls
- print pix_fmt for V4L2_BUF_TYPE_VIDEO_OUTPUT
- show full type name rather than an integer
- fix CROPCAP debugging
- fix G/S_TUNER debugging
- show error code in case of an error
- other small cleanups

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:11:44 -03:00
brandon@ifup.org
539a7555b3 V4L/DVB (8078): Introduce "index" attribute for persistent video4linux device nodes
A number of V4L drivers have a mod param to specify their preferred minors.
This is because it is often desirable for applications to have a static /dev
name for a particular device.  However, using minors has several disadvantages:

  1) the requested minor may already be taken
  2) using a mod param is driver specific
  3) it requires every driver to add a param
  4) requires configuration by hand

This patch introduces an "index" attribute that when combined with udev rules
can create static device paths like this:

/dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video0
/dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video1
/dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video2

$ ls -la /dev/v4l/by-path/pci-0000\:00\:1d.2-usb-0\:1\:1.0-video0
lrwxrwxrwx 1 root root 12 2008-04-28 00:02 /dev/v4l/by-path/pci-0000:00:1d.2-usb-0:1:1.0-video0 -> ../../video1

These paths are steady across reboots and should be resistant to rearranging
across Kernel versions.

video_register_device_index is available to drivers to request a
specific index number.

Signed-off-by: Brandon Philips <bphilips@suse.de>
Signed-off-by: Kees Cook <kees@outflux.net>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:10:27 -03:00
Hans Verkuil
78b526a435 V4L/DVB (7949): videodev: renamed the vidioc_*_fmt_* callbacks
The naming for the callbacks that handle the VIDIOC_ENUM_FMT and
VIDIOC_S/G/TRY_FMT ioctls was very confusing. Renamed it to match
the v4l2_buf_type name.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:07:32 -03:00
Hans Verkuil
0e3bd2b999 V4L/DVB (7948): videodev: add missing vidioc_try_fmt_sliced_vbi_output and VIDIOC_ENUMOUTPUT handling
There was no vidioc_try_fmt_sliced_vbi_output, instead vidioc_try_fmt_vbi_output
was reused.

The VIDIOC_ENUMOUTPUT handling was missing altogether, even though the callback
existed.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:07:27 -03:00
Hans Verkuil
b2de2313f1 V4L/DVB (7947): videodev: add vidioc_g_std callback.
The default videodev behavior for VIDIOC_G_STD is not correct for all devices.
Add a new callback that drivers can use instead.

Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:07:22 -03:00
Tobias Lorenz
1d0ba5f378 V4L/DVB (7942): Hardware frequency seek ioctl interface
Signed-off-by: Tobias Lorenz <tobias.lorenz@gmx.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
2008-07-20 07:07:12 -03:00
Marcelo Tosatti
34d4cb8fca KVM: MMU: nuke shadowed pgtable pages and ptes on memslot destruction
Flush the shadow mmu before removing regions to avoid stale entries.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:40 +03:00
Avi Kivity
d6e88aec07 KVM: Prefix some x86 low level function with kvm_, to avoid namespace issues
Fixes compilation with CONFIG_VMI enabled.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:39 +03:00
Christian Borntraeger
180c12fb22 KVM: s390: rename private structures
While doing some tests with our lcrash implementation I have seen a
naming conflict with prefix_info in kvm_host.h vs. addrconf.h

To avoid future conflicts lets rename private definitions in
asm/kvm_host.h by adding the kvm_s390 prefix.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:37 +03:00
Avi Kivity
7a5b56dfd3 KVM: x86 emulator: lazily evaluate segment registers
Instead of prefetching all segment bases before emulation, read them at the
last moment.  Since most of them are unneeded, we save some cycles on
Intel machines where this is a bit expensive.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:35 +03:00
Avi Kivity
f5b4edcd52 KVM: x86 emulator: simplify rip relative decoding
rip relative decoding is relative to the instruction pointer of the next
instruction; by moving address adjustment until after decoding is complete,
we remove the need to determine the instruction size.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:34 +03:00
Tan, Li
9ef621d3be KVM: Support mixed endian machines
Currently kvmtrace is not portable. This will prevent from copying a
trace file from big-endian target to little-endian workstation for analysis.
In the patch, kernel outputs metadata containing a magic number to trace
log, and changes 64-bit words to be u64 instead of a pair of u32s.

Signed-off-by: Tan Li <li.tan@intel.com>
Acked-by: Jerone Young <jyoung5@us.ibm.com>
Acked-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:32 +03:00
Laurent Vivier
7f39f8ac17 KVM: Add coalesced MMIO support (ia64 part)
This patch enables coalesced MMIO for ia64 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.

[akpm: fix compile error on ia64]

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:31 +03:00
Laurent Vivier
588968b6b7 KVM: Add coalesced MMIO support (powerpc part)
This patch enables coalesced MMIO for powerpc architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:31 +03:00
Laurent Vivier
542472b53e KVM: Add coalesced MMIO support (x86 part)
This patch enables coalesced MMIO for x86 architecture.
It defines KVM_MMIO_PAGE_OFFSET and KVM_CAP_COALESCED_MMIO.
It enables the compilation of coalesced_mmio.c.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:31 +03:00
Laurent Vivier
5f94c1741b KVM: Add coalesced MMIO support (common part)
This patch adds all needed structures to coalesce MMIOs.
Until an architecture uses it, it is not compiled.

Coalesced MMIO introduces two ioctl() to define where are the MMIO zones that
can be coalesced:

- KVM_REGISTER_COALESCED_MMIO registers a coalesced MMIO zone.
  It requests one parameter (struct kvm_coalesced_mmio_zone) which defines
  a memory area where MMIOs can be coalesced until the next switch to
  user space. The maximum number of MMIO zones is KVM_COALESCED_MMIO_ZONE_MAX.

- KVM_UNREGISTER_COALESCED_MMIO cancels all registered zones inside
  the given bounds (bounds are also given by struct kvm_coalesced_mmio_zone).

The userspace client can check kernel coalesced MMIO availability by asking
ioctl(KVM_CHECK_EXTENSION) for the KVM_CAP_COALESCED_MMIO capability.
The ioctl() call to KVM_CAP_COALESCED_MMIO will return 0 if not supported,
or the page offset where will be stored the ring buffer.
The page offset depends on the architecture.

After an ioctl(KVM_RUN), the first page of the KVM memory mapped points to
a kvm_run structure. The offset given by KVM_CAP_COALESCED_MMIO is
an offset to the coalesced MMIO ring expressed in PAGE_SIZE relatively
to the address of the start of th kvm_run structure. The MMIO ring buffer
is defined by the structure kvm_coalesced_mmio_ring.

[akio: fix oops during guest shutdown]

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:31 +03:00
Laurent Vivier
92760499d0 KVM: kvm_io_device: extend in_range() to manage len and write attribute
Modify member in_range() of structure kvm_io_device to pass length and the type
of the I/O (write or read).

This modification allows to use kvm_io_device with coalesced MMIO.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:30 +03:00
Guillaume Thouvenin
3e6e0aab1b KVM: Prefixes segment functions that will be exported with "kvm_"
Prefixes functions that will be exported with kvm_.
We also prefixed set_segment() even if it still static
to be coherent.

signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net>
Signed-off-by: Laurent Vivier <laurent.vivier@bull.net>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:27 +03:00
Avi Kivity
9ba075a664 KVM: MTRR support
Add emulation for the memory type range registers, needed by VMware esx 3.5,
and by pci device assignment.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:26 +03:00
Avi Kivity
81609e3e26 KVM: Order segment register constants in the same way as cpu operand encoding
This can be used to simplify the x86 instruction decoder.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:26 +03:00
Sheng Yang
f08864b42a KVM: VMX: Enable NMI with in-kernel irqchip
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:26 +03:00
Sheng Yang
3419ffc8e4 KVM: IOAPIC/LAPIC: Enable NMI support
[avi: fix ia64 build breakage]

Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:25 +03:00
Avi Kivity
7cc8883074 KVM: Remove decache_vcpus_on_cpu() and related callbacks
Obsoleted by the vmx-specific per-cpu list.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:42:25 +03:00
Avi Kivity
4ecac3fd6d KVM: Handle virtualization instruction #UD faults during reboot
KVM turns off hardware virtualization extensions during reboot, in order
to disassociate the memory used by the virtualization extensions from the
processor, and in order to have the system in a consistent state.
Unfortunately virtual machines may still be running while this goes on,
and once virtualization extensions are turned off, any virtulization
instruction will #UD on execution.

Fix by adding an exception handler to virtualization instructions; if we get
an exception during reboot, we simply spin waiting for the reset to complete.
If it's a true exception, BUG() so we can have our stack trace.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:41:43 +03:00
Avi Kivity
1b7fcd3263 KVM: MMU: Fix false flooding when a pte points to page table
The KVM MMU tries to detect when a speculative pte update is not actually
used by demand fault, by checking the accessed bit of the shadow pte.  If
the shadow pte has not been accessed, we deem that page table flooded and
remove the shadow page table, allowing further pte updates to proceed
without emulation.

However, if the pte itself points at a page table and only used for write
operations, the accessed bit will never be set since all access will happen
through the emulator.

This is exactly what happens with kscand on old (2.4.x) HIGHMEM kernels.
The kernel points a kmap_atomic() pte at a page table, and then
proceeds with read-modify-write operations to look at the dirty and accessed
bits.  We get a false flood trigger on the kmap ptes, which results in the
mmu spending all its time setting up and tearing down shadows.

Fix by setting the shadow accessed bit on emulated accesses.

Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:40:50 +03:00
Joerg Roedel
d2ebb4103f KVM: SVM: add tracing support for TDP page faults
To distinguish between real page faults and nested page faults they should be
traced as different events. This is implemented by this patch.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-20 12:40:48 +03:00
Ingo Molnar
d986434a7d Merge branch 'sched/urgent' into sched/devel 2008-07-20 11:01:29 +02:00
Peter Zijlstra
31656519e1 sched, x86: clean up hrtick implementation
random uvesafb failures were reported against Gentoo:

  http://bugs.gentoo.org/show_bug.cgi?id=222799

and Mihai Moldovan bisected it back to:

> 8f4d37ec07 is first bad commit
> commit 8f4d37ec07
> Author: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date:   Fri Jan 25 21:08:29 2008 +0100
>
>    sched: high-res preemption tick

Linus suspected it to be hrtick + vm86 interaction and observed:

> Btw, Peter, Ingo: I think that commit is doing bad things. They aren't
> _incorrect_ per se, but they are definitely bad.
>
> Why?
>
> Using random _TIF_WORK_MASK flags is really impolite for doing
> "scheduling" work. There's a reason that arch/x86/kernel/entry_32.S
> special-cases the _TIF_NEED_RESCHED flag: we don't want to exit out of
> vm86 mode unnecessarily.
>
> See the "work_notifysig_v86" label, and how it does that
> "save_v86_state()" thing etc etc.

Right, I never liked having to fiddle with those TIF flags. Initially I
needed it because the hrtimer base lock could not nest in the rq lock.
That however is fixed these days.

Currently the only reason left to fiddle with the TIF flags is remote
wakeups. We cannot program a remote cpu's hrtimer. I've been thinking
about using the new and improved IPI function call stuff to implement
hrtimer_start_on().

However that does require that smp_call_function_single(.wait=0) works
from interrupt context - /me looks at the latest series from Jens - Yes
that does seem to be supported, good.

Here's a stab at cleaning this stuff up ...

Mihai reported test success as well.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Mihai Moldovan <ionic@ionic.de>
Cc: Michal Januszewski <spock@gentoo.org>
Cc: Antonino Daplas <adaplas@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 10:37:28 +02:00
Mike Travis
80422d3431 cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP
* Rename CPUMASK_VAR --> CPUMASK_PTR (and simplify)

  * Fix a semantic error in CPUMASK_ALLOC

  * Add a bit of commentry to cpumask.h

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 10:21:12 +02:00
Mike Travis
94a1e869c7 NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var
* Slight optimization when getting one's own cpu_info percpu data.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 10:21:11 +02:00
Ingo Molnar
c4dc59ae7a x86, VisWS: turn into generic arch, eliminate leftover files
remove unused leftovers.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 09:31:24 +02:00
Yinghai Lu
63b5d7af25 x86: add ->pre_time_init to x86_quirks
so NUMAQ can use that to call numaq_pre_time_init()

This allows us to remove a NUMAQ special from arch/x86/kernel/setup.c.

(and paves the way to remove the NUMAQ subarch)

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 09:25:52 +02:00
Yinghai Lu
64898a8bad x86: extend and use x86_quirks to clean up NUMAQ code
add these new x86_quirks methods:

	int *mpc_record;
	int (*mpc_apic_id)(struct mpc_config_processor *m);
	void (*mpc_oem_bus_info)(struct mpc_config_bus *m, char *name);
	void (*mpc_oem_pci_bus)(struct mpc_config_bus *m);
	void (*smp_read_mpc_oem)(struct mp_config_oemtable *oemtable,
                                    unsigned short oemsize);

... and move NUMAQ related mps table handling to numaq_32.c.

also move the call to smp_read_mpc_oem() to smp_read_mpc() directly.

Should not change functionality, albeit it would be nice to get it
tested on real NUMAQ as well ...

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 09:25:52 +02:00
Yinghai Lu
3c9cb6de1e x86: introduce x86_quirks
introduce x86_quirks array of boot-time quirk methods.

No change in functionality intended.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-20 09:18:17 +02:00
Jussi Kivilinna
175f9c1bba net_sched: Add size table for qdiscs
Add size table functions for qdiscs and calculate packet size in
qdisc_enqueue().

Based on patch by Patrick McHardy
 http://marc.info/?l=linux-netdev&m=115201979221729&w=2

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20 00:08:47 -07:00
Jussi Kivilinna
0abf77e55a net_sched: Add accessor function for packet length for qdiscs
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20 00:08:27 -07:00
Jussi Kivilinna
5f86173bdf net_sched: Add qdisc_enqueue wrapper
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-20 00:08:04 -07:00
YOSHIFUJI Hideaki
230b183921 net: Use standard structures for generic socket address structures.
Use sockaddr_storage{} for generic socket address storage
and ensures proper alignment.
Use sockaddr{} for pointers to omit several casts.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 22:35:47 -07:00
Michael Hennerich
0138da6101 Blackfin arch: fix bug - detect 0.1 silicon revision BF527-EZKIT as 0.0 version
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-07-19 16:56:53 +08:00
Graf Yang
aa3348f461 Blackfin arch: Add return value check in bfin_sir_probe(), remove SSYNC().
Signed-off-by: Graf Yang <graf.yang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-07-19 15:54:10 +08:00
Sonic Zhang
262c3825a9 Blackfin arch: Extend sram malloc to handle L2 SRAM.
Extend system call to alloc L2 SRAM in application.
Automatically move following sections to L2 SRAM:
1. kernel built-in l2 attribute section
2. kernel module l2 attribute section
3. elf-fdpic application l2 attribute section

Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-07-19 15:42:41 +08:00
David S. Miller
407d819cf0 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6 2008-07-19 00:30:39 -07:00
Denis V. Lunev
7abbcd6a4c ipv6: remove unused macros from net/ipv6.h
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:29:42 -07:00
Denis V. Lunev
725a8ff04a ipv6: remove unused parameter from ip6_ra_control
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:28:58 -07:00
Adam Langley
4389dded77 tcp: Remove redundant checks when setting eff_sacks
Remove redundant checks when setting eff_sacks and make the number of SACKs a
compile time constant. Now that the options code knows how many SACK blocks can
fit in the header, we don't need to have the SACK code guessing at it.

Signed-off-by: Adam Langley <agl@imperialviolet.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:07:02 -07:00
Adam Langley
33ad798c92 tcp: options clean up
This should fix the following bugs:
  * Connections with MD5 signatures produce invalid packets whenever SACK
    options are included
  * MD5 signatures are counted twice in the MSS calculations

Behaviour changes:
  * A SYN with MD5 + SACK + TS elicits a SYNACK with MD5 + SACK

    This is because we can't fit any SACK blocks in a packet with MD5 + TS
    options. There was discussion about disabling SACK rather than TS in
    order to fit in better with old, buggy kernels, but that was deemed to
    be unnecessary.

  * SYNs with MD5 don't include a TS option

    See above.

Additionally, it removes a bunch of duplicated logic for calculating options,
which should help avoid these sort of issues in the future.

Signed-off-by: Adam Langley <agl@imperialviolet.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:04:31 -07:00
Adam Langley
49a72dfb88 tcp: Fix MD5 signatures for non-linear skbs
Currently, the MD5 code assumes that the SKBs are linear and, in the case
that they aren't, happily goes off and hashes off the end of the SKB and
into random memory.

Reported by Stephen Hemminger in [1]. Advice thanks to Stephen and Evgeniy
Polyakov. Also includes a couple of missed route_caps from Stephen's patch
in [2].

[1] http://marc.info/?l=linux-netdev&m=121445989106145&w=2
[2] http://marc.info/?l=linux-netdev&m=121459157816964&w=2

Signed-off-by: Adam Langley <agl@imperialviolet.org>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-19 00:01:42 -07:00
Harvey Harrison
336d3262df sctp: remove unnecessary byteshifting, calculate directly in big-endian
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 23:07:09 -07:00
Vlad Yasevich
7dab83de50 sctp: Support ipv6only AF_INET6 sockets.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 23:05:40 -07:00
Stephen Hemminger
c1e20f7c8b tcp: RTT metrics scaling
Some of the metrics (RTT, RTTVAR and RTAX_RTO_MIN) are stored in
kernel units (jiffies) and this leaks out through the netlink API to
user space where the units for jiffies are unknown.

This patches changes the kernel to convert to/from milliseconds. This
changes the ABI, but milliseconds seemed like the most natural unit
for these parameters.  Values available via syscall in
/proc/net/rt_cache and netlink will be in milliseconds.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 23:02:15 -07:00
David S. Miller
3072367300 pkt_sched: Manage qdisc list inside of root qdisc.
Idea is from Patrick McHardy.

Instead of managing the list of qdiscs on the device level, manage it
in the root qdisc of a netdev_queue.  This solves all kinds of
visibility issues during qdisc destruction.

The way to iterate over all qdiscs of a netdev_queue is to visit
the netdev_queue->qdisc, and then traverse it's list.

The only special case is to ignore builting qdiscs at the root when
dumping or doing a qdisc_lookup().  That was not needed previously
because builtin qdiscs were not added to the device's qdisc_list.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 22:50:15 -07:00
Matthew Garrett
92c4989092 Input: add switch for dock events
Add a SW_DOCK switch to input.h.  ACPI docks currently send their docking
status as a uevent, but not all docks are ACPI or correspond to a device.
In that case, it makes more sense to simply generate an input event on
docking or undocking.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-07-19 00:52:43 -04:00
Mark Brown
5ec461d083 Input: add microphone insert switch definition
Add a new switch type to the input API for reporting microphone
insertion. This will be used by the ALSA jack reporting API.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-07-19 00:52:36 -04:00
David S. Miller
72b25a913e pkt_sched: Get rid of u32_list.
The u32_list is just an indirect way of maintaining a reference
to a U32 node on a per-qdisc basis.

Just add an explicit node pointer for u32 to struct Qdisc an do
away with this global list.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 20:54:17 -07:00
Patrick McHardy
8913336a7e packet: add PACKET_RESERVE sockopt
Add new sockopt to reserve some headroom in the mmaped ring frames in
front of the packet payload. This can be used f.i. when the VLAN header
needs to be (re)constructed to avoid moving the entire payload.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 18:05:19 -07:00
venkatesh.pallipadi@intel.com
ae79cdaacb x86: Add a arch directory for x86 under debugfs
Add a directory for x86 arch under debugfs. Can be used to accumulate all
x86 specific debugfs files.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-18 17:22:04 -07:00
Jan Beulich
48fe4a76e2 x86: i386: reduce boot fixmap space
As 256 entries are needed, aligning to a 256-entry boundary is
sufficient and still guarantees the single pte table requirement.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-18 16:17:52 -07:00
Jan Beulich
08ad8afaa0 x86: reduce force_mwait visibility
It's not used anywhere outside its single referencing file.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-18 15:55:09 -07:00
Jan Beulich
08e1a13e7d x86: reduce forbid_dac's visibility
It's not used anywhere outside its declaring file.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-18 14:39:37 -07:00
Mike Travis
77586c2bda cpumask: Provide a generic set of CPUMASK_ALLOC macros
* Provide a generic set of CPUMASK_ALLOC macros patterned after the
    SCHED_CPUMASK_ALLOC macros.  This is used where multiple cpumask_t
    variables are declared on the stack to reduce the amount of stack
    space required.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 22:03:00 +02:00
Mike Travis
65c0118453 cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
* This patch replaces the dangerous lvalue version of cpumask_of_cpu
    with new cpumask_of_cpu_ptr macros.  These are patterned after the
    node_to_cpumask_ptr macros.

    In general terms, if there is a cpumask_of_cpu_map[] then a pointer to
    the cpumask_of_cpu_map[cpu] entry is used.  The cpumask_of_cpu_map
    is provided when there is a large NR_CPUS count, reducing
    greatly the amount of code generated and stack space used for
    cpumask_of_cpu().  The pointer to the cpumask_t value is needed for
    calling set_cpus_allowed_ptr() to reduce the amount of stack space
    needed to pass the cpumask_t value.

    If there isn't a cpumask_of_cpu_map[], then a temporary variable is
    declared and filled in with value from cpumask_of_cpu(cpu) as well as
    a pointer variable pointing to this temporary variable.  Afterwards,
    the pointer is used to reference the cpumask value.  The compiler
    will optimize out the extra dereference through the pointer as well
    as the stack space used for the pointer, resulting in identical code.

    A good example of the orthogonal usages is in net/sunrpc/svc.c:

	case SVC_POOL_PERCPU:
	{
		unsigned int cpu = m->pool_to[pidx];
		cpumask_of_cpu_ptr(cpumask, cpu);

		*oldmask = current->cpus_allowed;
		set_cpus_allowed_ptr(current, cpumask);
		return 1;
	}
	case SVC_POOL_PERNODE:
	{
		unsigned int node = m->pool_to[pidx];
		node_to_cpumask_ptr(nodecpumask, node);

		*oldmask = current->cpus_allowed;
		set_cpus_allowed_ptr(current, nodecpumask);
		return 1;
	}

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 22:02:57 +02:00
Ingo Molnar
bb2c018b09 Merge branch 'linus' into cpus4096
Conflicts:

	drivers/acpi/processor_throttling.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 22:00:54 +02:00
Dmitry Baryshkov
9de90ac27d Sh: use generic per-device coherent dma allocator
Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 21:14:02 +02:00
Dmitry Baryshkov
1fe532685a ARM: support generic per-device coherent dma mem
Signed-off-by: Dmitry Baryshkov <dbaryshkov@gmail.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 21:14:01 +02:00
Ingo Molnar
f6dc8ccaab Merge branch 'linus' into core/generic-dma-coherent
Conflicts:

	kernel/Makefile

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 21:13:20 +02:00
Ingo Molnar
9b610fda0d Merge branch 'linus' into timers/nohz 2008-07-18 19:53:16 +02:00
Jaswinder Singh
6ac8d51f01 x86: introducing asm-x86/traps.h
Declaring x86 traps under one hood.
Declaring x86 do_traps before defining them.

Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 18:51:57 +02:00
Ingo Molnar
f1b0c8d3d3 Merge branch 'linus' into x86/amd-iommu 2008-07-18 18:43:08 +02:00
Thomas Petazzoni
9781f39fd2 x86: consolidate the definition of the force_mwait variable
The force_mwait variable iss defined either in
arch/x86/kernel/cpu/amd.c or in arch/x86/kernel/setup_64.c, but it is
only initialized and used in arch/x86/kernel/process.c. This patch
moves the declaration to arch/x86/kernel/process.c.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: michael@free-electrons.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 18:39:19 +02:00
Thomas Gleixner
b8f8c3cf0a nohz: prevent tick stop outside of the idle loop
Jack Ren and Eric Miao tracked down the following long standing
problem in the NOHZ code:

	scheduler switch to idle task
	enable interrupts

Window starts here

	----> interrupt happens (does not set NEED_RESCHED)
	      	irq_exit() stops the tick

	----> interrupt happens (does set NEED_RESCHED)

	return from schedule()
	
	cpu_idle(): preempt_disable();

Window ends here

The interrupts can happen at any point inside the race window. The
first interrupt stops the tick, the second one causes the scheduler to
rerun and switch away from idle again and we end up with the tick
disabled.

The fact that it needs two interrupts where the first one does not set
NEED_RESCHED and the second one does made the bug obscure and extremly
hard to reproduce and analyse. Kudos to Jack and Eric.

Solution: Limit the NOHZ functionality to the idle loop to make sure
that we can not run into such a situation ever again.

cpu_idle()
{
	preempt_disable();

	while(1) {
		 tick_nohz_stop_sched_tick(1); <- tell NOHZ code that we
		 			          are in the idle loop

		 while (!need_resched())
		       halt();

		 tick_nohz_restart_sched_tick(); <- disables NOHZ mode
		 preempt_enable_no_resched();
		 schedule();
		 preempt_disable();
	}
}

In hindsight we should have done this forever, but ... 

/me grabs a large brown paperbag.

Debugged-by: Jack Ren <jack.ren@marvell.com>, 
Debugged-by: eric miao <eric.y.miao@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-07-18 18:10:28 +02:00
Herton Ronaldo Krzesinski
723edb5060 Fix typos from signal_32/64.h merge
Fallout from commit 33185c504f ("x86:
merge signal_32/64.h")

Thanks to Dick Streefland who provided an useful testcase on
http://lkml.org/lkml/2008/3/17/205 (only applicable to 2.6.24.x), that
helped a lot as a deterministic way to bisect an issue that leaded to
this fix.

Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Signed-off-by: Luiz Fernando N. Capitulino <lcapitulino@mandriva.com.br>
Cc: Roland McGrath <roland@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 17:59:13 +02:00
Lai Jiangshan
5127bed588 rcu classic: new algorithm for callbacks-processing(v2)
This is v2, it's a little deference from v1 that I
had send to lkml.
use ACCESS_ONCE
use rcu_batch_after/rcu_batch_before for batch # comparison.

rcutorture test result:
(hotplugs: do cpu-online/offline once per second)

No CONFIG_NO_HZ:           OK, 12hours
No CONFIG_NO_HZ, hotplugs: OK, 12hours
CONFIG_NO_HZ=y:            OK, 24hours
CONFIG_NO_HZ=y, hotplugs:  Failed.
(Failed also without my patch applied, exactly the same bug occurred,
http://lkml.org/lkml/2008/7/3/24)

v1's email thread:
http://lkml.org/lkml/2008/6/2/539

v1's description:

The code/algorithm of the implement of current callbacks-processing
is very efficient and technical. But when I studied it and I found
a disadvantage:

In multi-CPU systems, when a new RCU callback is being
queued(call_rcu[_bh]), this callback will be invoked after the grace
period for the batch with batch number = rcp->cur+2 has completed
very very likely in current implement. Actually, this callback can be
invoked after the grace period for the batch with
batch number = rcp->cur+1 has completed. The delay of invocation means
that latency of synchronize_rcu() is extended. But more important thing
is that the callbacks usually free memory, and these works are delayed
too! it's necessary for reclaimer to free memory as soon as
possible when left memory is few.

A very simple way can solve this problem:
a field(struct rcu_head::batch) is added to record the batch number for
the RCU callback. And when a new RCU callback is being queued, we
determine the batch number for this callback(head->batch = rcp->cur+1)
and we move this callback to rdp->donelist if we find
that head->batch <= rcp->completed when we process callbacks.
This simple way reduces the wait time for invocation a lot. (about
2.5Grace Period -> 1.5Grace Period in average in multi-CPU systems)

This is my algorithm. But I do not add any field for struct rcu_head
in my implement. We just need to memorize the last 2 batches and
their batch number, because these 2 batches include all entries that
for whom the grace period hasn't completed. So we use a special
linked-list rather than add a field.
Please see the comment of struct rcu_data.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Gautham Shenoy <ego@in.ibm.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 16:07:33 +02:00
Lai Jiangshan
3cac97cbb1 rcu classic: simplify the next pending batch
use a batch number(rcp->pending) instead of a flag(rcp->next_pending)

rcu_start_batch() need to change this flag, so mb()s is needed
for memory-access safe.

but(after this patch applied) rcu_start_batch() do not change
this batch number(rcp->pending), rcp->pending is managed by
__rcu_process_callbacks only, and troublesome mb()s are eliminated.

And codes look simpler and clearer.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Gautham Shenoy <ego@in.ibm.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 16:07:32 +02:00
Russ Anderson
7019cc2dd6 x86 BIOS interface for RTC on SGI UV
Real-time code needs to know the number of cycles per second
on SGI UV.  The information is provided via a run time BIOS
call.  This patch provides the linux side of that interface.
This is the first of several run time BIOS calls to be defined
in uv/bios.h and bios_uv.c.

Note that BIOS_CALL() is just a stub for now.  The bios
side is being worked on.

Signed-off-by: Russ Anderson <rja@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:35:14 +02:00
Alexander van Heukelum
8450e85399 x86, cleanup: fix description of __fls(): __fls(0) is undefined
Ricardo M. Correia spotted that the use of __fls() in fls64() did
not seem to make sense. In fact fls64()'s implementation is fine,
but the description of __fls() was wrong. Fix that.

Reported-by: "Ricardo M. Correia" <Ricardo.M.Correia@Sun.COM>
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:32:38 +02:00
Maciej W. Rozycki
35b680557f x86: more apic debugging
[ mingo@elte.hu: picked up this patch from Maciej, lets make apic=debug
                 print out more info - we had a lot of APIC changes ]

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:27:51 +02:00
Maciej W. Rozycki
baa1318841 x86: APIC: Make apic_verbosity unsigned
As a microoptimisation, make apic_verbosity unsigned.  This will make
apic_printk(APIC_QUIET, ...) expand into just printk(...) with the
surrounding condition and a reference to apic_verbosity removed.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:27:43 +02:00
Yinghai Lu
1f067167a8 x86: seperate memtest from init_64.c
it's separate functionality that deserves its own file.

This also prepares 32-bit memtest support.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:10:27 +02:00
Ingo Molnar
1b427c153a sched: fix build error, provide partition_sched_domains() unconditionally
provide an empty partition_sched_domains() definition for the UP case:

 include/linux/cpuset.h: In function ‘rebuild_sched_domains':
 include/linux/cpuset.h:163: error: implicit declaration of function ‘partition_sched_domains'

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 14:02:46 +02:00
Harvey Harrison
3217256188 x86: suppress sparse returning void warnings
include/asm/paravirt.h:1404:2: warning: returning void-valued expression
include/asm/paravirt.h:1414:2: warning: returning void-valued expression

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 13:42:01 +02:00
Ingo Molnar
2fb5e1e101 Merge branch 'linus' into x86/paravirt-spinlocks
Conflicts:

	arch/x86/kernel/Makefile

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 13:41:27 +02:00
Max Krasnyansky
e761b77252 cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)
This is based on Linus' idea of creating cpu_active_map that prevents
scheduler load balancer from migrating tasks to the cpu that is going
down.

It allows us to simplify domain management code and avoid unecessary
domain rebuilds during cpu hotplug event handling.

Please ignore the cpusets part for now. It needs some more work in order
to avoid crazy lock nesting. Although I did simplfy and unify domain
reinitialization logic. We now simply call partition_sched_domains() in
all the cases. This means that we're using exact same code paths as in
cpusets case and hence the test below cover cpusets too.
Cpuset changes to make rebuild_sched_domains() callable from various
contexts are in the separate patch (right next after this one).

This not only boots but also easily handles
	while true; do make clean; make -j 8; done
and
	while true; do on-off-cpu 1; done
at the same time.
(on-off-cpu 1 simple does echo 0/1 > /sys/.../cpu1/online thing).

Suprisingly the box (dual-core Core2) is quite usable. In fact I'm typing
this on right now in gnome-terminal and things are moving just fine.

Also this is running with most of the debug features enabled (lockdep,
mutex, etc) no BUG_ONs or lockdep complaints so far.

I believe I addressed all of the Dmitry's comments for original Linus'
version. I changed both fair and rt balancer to mask out non-active cpus.
And replaced cpu_is_offline() with !cpu_active() in the main scheduler
code where it made sense (to me).

Signed-off-by: Max Krasnyanskiy <maxk@qualcomm.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Gregory Haskins <ghaskins@novell.com>
Cc: dmitry.adamushko@gmail.com
Cc: pj@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 13:22:25 +02:00
Sebastian Siewior
2b7207a6b5 ftrace: copy + paste typo in asm/ftrace.h
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <srostedt@redhat.co>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 13:14:08 +02:00
Pavel Emelyanov
b6fcbdb4f2 proc: consolidate per-net single-release callers
They are symmetrical to single_open ones :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:07:44 -07:00
Pavel Emelyanov
de05c557b2 proc: consolidate per-net single_open callers
There are already 7 of them - time to kill some duplicate code.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:07:21 -07:00
Pavel Emelyanov
923c6586b0 mib: put icmpmsg statistics on struct net
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:04:22 -07:00
Pavel Emelyanov
b60538a0d7 mib: put icmp statistics on struct net
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:04:02 -07:00
Pavel Emelyanov
386019d351 mib: put udplite statistics on struct net
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:03:45 -07:00
Pavel Emelyanov
2f275f91a4 mib: put udp statistics on struct net
Similar to... ouch, I repeat myself.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:03:27 -07:00
Pavel Emelyanov
61a7e26028 mib: put net statistics on struct net
Similar to ip and tcp ones :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:03:08 -07:00
Pavel Emelyanov
a20f5799ca mib: put ip statistics on struct net
Similar to tcp one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:02:42 -07:00
Pavel Emelyanov
57ef42d59d mib: put tcp statistics on struct net
Proc temporary uses stats from init_net.

BTW, TCP_XXX_STATS are beautiful (w/o do { } while (0) facing) again :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:02:08 -07:00
Pavel Emelyanov
852566f53c mib: add netns/mib.h file
The only structure declared within is the netns_mib, which will
carry all our mibs within. I didn't put the mibs in the existing
netns_xxx structures to make it possible to mark this one as
properly aligned and get in a separate "read-mostly" cache-line.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 04:01:24 -07:00
Maciej W. Rozycki
593f4a788e x86: APIC: remove apic_write_around(); use alternatives
Use alternatives to select the workaround for the 11AP Pentium erratum
for the affected steppings on the fly rather than build time.  Remove the
X86_GOOD_APIC configuration option and replace all the calls to
apic_write_around() with plain apic_write(), protecting accesses to the
ESR as appropriate due to the 3AP Pentium erratum.  Remove
apic_read_around() and all its invocations altogether as not needed.
Remove apic_write_atomic() and all its implementing backends.  The use of
ASM_OUTPUT2() is not strictly needed for input constraints, but I have
used it for readability's sake.

I had the feeling no one else was brave enough to do it, so I went ahead
and here it is.  Verified by checking the generated assembly and tested
with both a 32-bit and a 64-bit configuration, also with the 11AP
"feature" forced on and verified with gdb on /proc/kcore to work as
expected (as an 11AP machines are quite hard to get hands on these days).
Some script complained about the use of "volatile", but apic_write() needs
it for the same reason and is effectively a replacement for writel(), so I
have disregarded it.

I am not sure what the policy wrt defconfig files is, they are generated
and there is risk of a conflict resulting from an unrelated change, so I
have left changes to them out.  The option will get removed from them at
the next run.

Some testing with machines other than mine will be needed to avoid some
stupid mistake, but despite its volume, the change is not really that
intrusive, so I am fairly confident that because it works for me, it will
everywhere.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-18 12:51:21 +02:00
David S. Miller
49997d7515 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
Conflicts:

	Documentation/powerpc/booting-without-of.txt
	drivers/atm/Makefile
	drivers/net/fs_enet/fs_enet-main.c
	drivers/pci/pci-acpi.c
	net/8021q/vlan.c
	net/iucv/iucv.c
2008-07-18 02:39:39 -07:00
Ingo Molnar
48ae744434 Merge branch 'linus' into x86/step 2008-07-18 10:14:56 +02:00
David S. Miller
432e8765f0 sparc64: Add missing hypervisor service group numbers.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-18 00:43:52 -07:00
David S. Miller
f7fe93344f sparc64: Remove 4MB and 512K base page size options.
Adrian Bunk reported that enabling 4MB page size breaks the build.
The problem is that MAX_ORDER combined with the page shift exceeds the
SECTION_SIZE_BITS we use in asm-sparc64/sparsemem.h

There are several ways I suppose we could work around this.  For one
we could define a CONFIG_FORCE_MAX_ZONEORDER to decrease MAX_ORDER in
these higher page size cases.

But I also know that these page size cases are broken wrt. TLB miss
handling especially on pre-hypervisor systems, and there isn't an easy
way to fix that.

These options were meant to be fun experimental hacks anyways, and
only 8K and 64K make any sense to support.

So remove 512K and 4M base page size support.  Of course, we still
support these page sizes for huge pages.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 23:44:53 -07:00
David S. Miller
d172ad18f9 sparc64: Convert to generic helpers for IPI function calls.
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 23:44:50 -07:00
Sam Ravnborg
f5e706ad88 sparc: join the remaining header files
With this commit all sparc64 header files are moved to asm-sparc.
The remaining files (71 files) were too different to be trivially
merged so divide them up in a _32.h and a _64.h file which
are both included from the file with no bit size.

The following script were used:
cd include
FILES=`wc -l asm-sparc64/*h | grep -v '^     1' | cut -b 20-`

for FILE in ${FILES}; do
  echo $FILE:
  BASE=`echo $FILE | cut -d '.' -f 1`
  FN32=${BASE}_32.h
  FN64=${BASE}_64.h
  GUARD=___ASM_SPARC_`echo $BASE | tr '-' '_' | tr [:lower:] [:upper:]`_H
  git mv asm-sparc/$FILE asm-sparc/$FN32
  git mv asm-sparc64/$FILE asm-sparc/$FN64
  echo git mv done
  printf "#ifndef %s\n" $GUARD                             >   asm-sparc/$FILE
  printf "#define %s\n" $GUARD                             >>  asm-sparc/$FILE
  printf "#if defined(__sparc__) && defined(__arch64__)\n" >>  asm-sparc/$FILE
  printf "#include <asm-sparc/%s>\n" $FN64                 >>  asm-sparc/$FILE
  printf "#else\n"                                         >>  asm-sparc/$FILE
  printf "#include <asm-sparc/%s>\n" $FN32                 >>  asm-sparc/$FILE
  printf "#endif\n"                                        >>  asm-sparc/$FILE
  printf "#endif\n"                                        >>  asm-sparc/$FILE
  git add asm-sparc/$FILE
  echo new file done
  printf "#include <asm-sparc/%s>\n" $FILE                 >  asm-sparc64/$FILE
  git add asm-sparc64/$FILE
  echo sparc64 file done
done

The guard contains three '_' to avoid conflict with existing guards.
In additing the two Kbuild files are emptied to avoid breaking
headers_* targets.
We will reintroduce the exported header files when the necessary
kbuild changes are merged.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:55:51 -07:00
Sam Ravnborg
5e3609f60c sparc: merge header files with trivial differences
A manual inspection revealed that the following headerfiles
contained only trivial differences:
hw_irq.h idprom.h kmap_types.h kvm.h spinlock_types.h sunbpp.h unaligned.h

The only noteworthy change are that sparc64 had a volatile
qualifer that sparc missed in spinlock_types.h.

In addition a few comments were updated.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:45:02 -07:00
Sam Ravnborg
075ae52532 sparc: when header files are equal use asm-sparc version
Used the following script to find equal header files:
SPARC64=`ls asm-sparc64`
for FILE in ${SPARC64}; do
	cmp -s asm-sparc/$FILE asm-sparc64/$FILE;
	if [ $? = 0 ]; then
		printf "#include <asm-sparc/%s>\n" $FILE > asm-sparc64/$FILE
	fi
done

A few of the equal files are a simple include from
asm-generic, but by including the file from asm-sparc
we know they are equal for sparc and sparc64.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:44:58 -07:00
Sam Ravnborg
a00736e936 sparc: copy sparc64 specific files to asm-sparc
Used the following script to copy the files:
cd include
set -e
SPARC64=`ls asm-sparc64`
for FILE in ${SPARC64}; do
	if [ -f asm-sparc/$FILE ]; then
		echo $FILE exist in asm-sparc
	else
		git mv asm-sparc64/$FILE asm-sparc/$FILE
		printf "#include <asm-sparc/$FILE>\n" > asm-sparc64/$FILE
		git add asm-sparc64/$FILE
	fi
done

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:44:53 -07:00
Sam Ravnborg
bdc3135ac9 sparc: Merge asm-sparc{,64}/asi.h
Joined the two files as they contain distinct definitions.
Inspired by patch from: Adrian Bunk <bunk@kernel.org>

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Adrian Bunk <bunk@kernel.org>
2008-07-17 21:42:30 -07:00
Sam Ravnborg
b1a8bf92a0 sparc: export openprom.h to userspace
sparc64 exports openprom.h to userspace so let sparc follow
the example.
As openprom.h pulled in another not-for-export vaddrs.h header
file it required a few changes to fix the build.

The definition af VMALLOC_* were moved to pgtable as this is
where sparc64 has them.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:23 -07:00
Sam Ravnborg
b444b9a5a1 sparc: Merge asm-sparc{,64}/types.h
Copy content of sparc64 file to sparc file.
There is only minimal possibilities for further unification.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:19 -07:00
Sam Ravnborg
c6d1b0e3d2 sparc: Merge asm-sparc{,64}/termios.h
Bring the commit e55c57e0b5
("[SPARC64]: Report any user access faults in termios accessors")
over to sparc when unifying the two files.
The diff was manually inspected to contain no
other relevant changes.

This unification therefore changes functionality of sparc.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:15 -07:00
Sam Ravnborg
943d0e8613 sparc: Merge asm-sparc{,64}/termbits.h
The type of tcflag_t differs from 32 and 64 bit.
For 32 bit it is long
For 64 bit it is int

Altough these have same size then I was not sure that
it was OK to change the 64 bit version to long as this
is part of the ABI so it was made conditional.

:$ diff -u include/asm-sparc/termbits.h include/asm-sparc64/termbits.h
:-- include/asm-sparc/termbits.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/termbits.h	2008-06-13 06:42:07.000000000 +0200
:@@ -1,11 +1,11 @@
:-#ifndef _SPARC_TERMBITS_H
:-#define _SPARC_TERMBITS_H
:+#ifndef _SPARC64_TERMBITS_H
:+#define _SPARC64_TERMBITS_H
:
: #include <linux/posix_types.h>
:
: typedef unsigned char   cc_t;
: typedef unsigned int    speed_t;
:-typedef unsigned long   tcflag_t;
:+typedef unsigned int    tcflag_t;
:
: #define NCC 8
: struct termio {
:@@ -102,7 +102,7 @@
: #define IXANY	0x00000800
: #define IXOFF	0x00001000
: #define IMAXBEL	0x00002000
:-#define IUTF8   0x00004000
:+#define IUTF8	0x00004000
:
: /* c_oflag bits */
: #define OPOST	0x00000001
:@@ -171,7 +171,6 @@
: #define HUPCL	  0x00000400
: #define CLOCAL	  0x00000800
: #define CBAUDEX   0x00001000
:-/* We'll never see these speeds with the Zilogs, but for completeness... */
: #define  BOTHER   0x00001000
: #define  B57600   0x00001001
: #define  B115200  0x00001002
:@@ -199,7 +198,7 @@
: #define B3500000  0x00001012
: #define B4000000  0x00001013  */
: #define CIBAUD	  0x100f0000  /* input baud rate (not used) */
:-#define CMSPAR	  0x40000000  /* mark or space (stick) parity */
:+#define CMSPAR    0x40000000  /* mark or space (stick) parity */
: #define CRTSCTS	  0x80000000  /* flow control */
:
: #define IBSHIFT	  16		/* Shift from CBAUD to CIBAUD */
:@@ -258,4 +257,4 @@
: #define	TCSADRAIN	1
: #define	TCSAFLUSH	2
:
:-#endif /* !(_SPARC_TERMBITS_H) */
:+#endif /* !(_SPARC64_TERMBITS_H) */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:12 -07:00
Sam Ravnborg
7c4285d836 sparc: Merge asm-sparc{,64}/setup.h
COMMAND_LINE_SIZE differ for 32 and 64 bit.
256 versus 2048

:$ diff -u include/asm-sparc/setup.h include/asm-sparc64/setup.h
:-- include/asm-sparc/setup.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/setup.h	2008-06-13 06:42:07.000000000 +0200
:@@ -2,9 +2,9 @@
:  *	Just a place holder.
:  */
:
:-#ifndef _SPARC_SETUP_H
:-#define _SPARC_SETUP_H
:+#ifndef _SPARC64_SETUP_H
:+#define _SPARC64_SETUP_H
:
:-#define COMMAND_LINE_SIZE	256
:+#define COMMAND_LINE_SIZE	2048
:
:-#endif /* _SPARC_SETUP_H */
:+#endif /* _SPARC64_SETUP_H */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:09 -07:00
Sam Ravnborg
68a61c8d87 sparc: Merge asm-sparc{,64}/resource.h
RLIM_INFINITY differ from 32 and 64 bit.
The rest is equal.

:$ diff -u include/asm-sparc/resource.h include/asm-sparc64/resource.h
:-- include/asm-sparc/resource.h	2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/resource.h	2008-06-13 06:46:39.000000000 +0200
:@@ -1,11 +1,11 @@
: /*
:  * resource.h: Resource definitions.
:  *
:- * Copyright (C) 1995 David S. Miller (davem@caip.rutgers.edu)
:+ * Copyright (C) 1996 David S. Miller (davem@caip.rutgers.edu)
:  */
:
:-#ifndef _SPARC_RESOURCE_H
:-#define _SPARC_RESOURCE_H
:+#ifndef _SPARC64_RESOURCE_H
:+#define _SPARC64_RESOURCE_H
:
: /*
:  * These two resource limit IDs have a Sparc/Linux-specific ordering,
:@@ -14,13 +14,6 @@
: #define RLIMIT_NOFILE		6	/* max number of open files */
: #define RLIMIT_NPROC		7	/* max number of processes */
:
:-/*
:- * SuS says limits have to be unsigned.
:- * We make this unsigned, but keep the
:- * old value for compatibility:
:- */
:-#define RLIM_INFINITY		0x7fffffff
:-
: #include <asm-generic/resource.h>
:
:-#endif /* !(_SPARC_RESOURCE_H) */
:+#endif /* !(_SPARC64_RESOURCE_H) */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:07 -07:00
Sam Ravnborg
7acc483d21 sparc: Merge asm-sparc{,64}/fbio.h
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:42:04 -07:00
Sam Ravnborg
fc86029910 sparc: copy asm-sparc64/fbio.h to asm-sparc
There were only a few trivial changes and a few additions
in the sparc64 variant of this file.
This patch copies the sparc64 specific bits to the sparc version
of fbio.h so they are equal. A later patch will merge the two.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:55 -07:00
Sam Ravnborg
f92ffa12f4 sparc: Merge asm-sparc{,64}/mman.h
Renaming the function sparc64_mmap_check() to
sparc_mmap_check() was enough to make the two
header files identical.

:$ diff -u include/asm-sparc/mman.h include/asm-sparc64/mman.h
:-- include/asm-sparc/mman.h	2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/mman.h	2008-06-13 06:46:39.000000000 +0200
:@@ -1,5 +1,5 @@
:-#ifndef __SPARC_MMAN_H__
:-#define __SPARC_MMAN_H__
:+#ifndef __SPARC64_MMAN_H__
:+#define __SPARC64_MMAN_H__
:
: #include <asm-generic/mman.h>
:
:@@ -23,9 +23,9 @@
:
: #ifdef __KERNEL__
: #ifndef __ASSEMBLY__
:-#define arch_mmap_check(addr,len,flags)	sparc_mmap_check(addr,len)
:-int sparc_mmap_check(unsigned long addr, unsigned long len);
:+#define arch_mmap_check(addr,len,flags)	sparc64_mmap_check(addr,len)
:+int sparc64_mmap_check(unsigned long addr, unsigned long len);
: #endif
: #endif
:
:-#endif /* __SPARC_MMAN_H__ */
:+#endif /* __SPARC64_MMAN_H__ */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:51 -07:00
Sam Ravnborg
2d1419624c sparc: Merge asm-sparc{,64}/shmbuf.h
Padding in the shmbuf structure made conditional
as only 32 bit sparc did so.

:$ diff -u include/asm-sparc/shmbuf.h include/asm-sparc64/shmbuf.h
:-- include/asm-sparc/shmbuf.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/shmbuf.h	2008-06-13 06:42:07.000000000 +0200
:@@ -1,23 +1,19 @@
:-#ifndef _SPARC_SHMBUF_H
:-#define _SPARC_SHMBUF_H
:+#ifndef _SPARC64_SHMBUF_H
:+#define _SPARC64_SHMBUF_H
:
: /*
:- * The shmid64_ds structure for sparc architecture.
:+ * The shmid64_ds structure for sparc64 architecture.
:  * Note extra padding because this structure is passed back and forth
:  * between kernel and user space.
:  *
:  * Pad space is left for:
:- * - 64-bit time_t to solve y2038 problem
:- * - 2 miscellaneous 32-bit values
:+ * - 2 miscellaneous 64-bit values
:  */
:
: struct shmid64_ds {
: 	struct ipc64_perm	shm_perm;	/* operation perms */
:-	unsigned int		__pad1;
: 	__kernel_time_t		shm_atime;	/* last attach time */
:-	unsigned int		__pad2;
: 	__kernel_time_t		shm_dtime;	/* last detach time */
:-	unsigned int		__pad3;
: 	__kernel_time_t		shm_ctime;	/* last change time */
: 	size_t			shm_segsz;	/* size of segment (bytes) */
: 	__kernel_pid_t		shm_cpid;	/* pid of creator */
:@@ -39,4 +35,4 @@
: 	unsigned long	__unused4;
: };
:
:-#endif /* _SPARC_SHMBUF_H */
:+#endif /* _SPARC64_SHMBUF_H */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:48 -07:00
Sam Ravnborg
fcb07081f2 sparc: Merge asm-sparc{,64}/sembuf.h
Padding in the sembuf structure made conditional
as only 32 bit sparc did so.

:$ diff -u include/asm-sparc/sembuf.h include/asm-sparc64/sembuf.h
:-- include/asm-sparc/sembuf.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/sembuf.h	2008-06-13 06:42:07.000000000 +0200
:@@ -1,21 +1,18 @@
:-#ifndef _SPARC_SEMBUF_H
:-#define _SPARC_SEMBUF_H
:+#ifndef _SPARC64_SEMBUF_H
:+#define _SPARC64_SEMBUF_H
:
: /*
:- * The semid64_ds structure for sparc architecture.
:+ * The semid64_ds structure for sparc64 architecture.
:  * Note extra padding because this structure is passed back and forth
:  * between kernel and user space.
:  *
:  * Pad space is left for:
:- * - 64-bit time_t to solve y2038 problem
:- * - 2 miscellaneous 32-bit values
:+ * - 2 miscellaneous 64-bit values
:  */
:
: struct semid64_ds {
: 	struct ipc64_perm sem_perm;		/* permissions .. see ipc.h */
:-	unsigned int	__pad1;
: 	__kernel_time_t	sem_otime;		/* last semop time */
:-	unsigned int	__pad2;
: 	__kernel_time_t	sem_ctime;		/* last change time */
: 	unsigned long	sem_nsems;		/* no. of semaphores in array */
: 	unsigned long	__unused1;

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:44 -07:00
Sam Ravnborg
100b10d752 sparc: Merge asm-sparc{,64}/msgbuf.h
Padding from 32 bit sparc kept using preprocessor magic

:$ diff -u include/asm-sparc/msgbuf.h include/asm-sparc64/msgbuf.h
:-- include/asm-sparc/msgbuf.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/msgbuf.h	2008-06-13 06:42:07.000000000 +0200
:@@ -7,17 +7,13 @@
:  * between kernel and user space.
:  *
:  * Pad space is left for:
:- * - 64-bit time_t to solve y2038 problem
:- * - 2 miscellaneous 32-bit values
:+ * - 2 miscellaneous 64-bit values
:  */
:
: struct msqid64_ds {
: 	struct ipc64_perm msg_perm;
:-	unsigned int   __pad1;
: 	__kernel_time_t msg_stime;	/* last msgsnd time */
:-	unsigned int   __pad2;
: 	__kernel_time_t msg_rtime;	/* last msgrcv time */
:-	unsigned int   __pad3;
: 	__kernel_time_t msg_ctime;	/* last change time */
: 	unsigned long  msg_cbytes;	/* current number of bytes on queue */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:42 -07:00
Sam Ravnborg
6d1f4b88ee sparc: Merge asm-sparc{,64}/fcntl.h
The definition of O_NDELAY differed - the rest was equal

:$ diff -u include/asm-sparc/fcntl.h include/asm-sparc64/fcntl.h
:-- include/asm-sparc/fcntl.h	2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/fcntl.h	2008-06-13 06:46:39.000000000 +0200
:@@ -1,8 +1,9 @@
:-#ifndef _SPARC_FCNTL_H
:-#define _SPARC_FCNTL_H
:+#ifndef _SPARC64_FCNTL_H
:+#define _SPARC64_FCNTL_H
:
: /* open/fcntl - O_SYNC is only implemented on blocks devices and on files
:    located on an ext2 file system */
:+#define O_NDELAY	0x0004
: #define O_APPEND	0x0008
: #define FASYNC		0x0040	/* fcntl, for BSD compatibility */
: #define O_CREAT		0x0200	/* not fcntl */
:@@ -10,7 +11,6 @@
: #define O_EXCL		0x0800	/* not fcntl */
: #define O_SYNC		0x2000
: #define O_NONBLOCK	0x4000
:-#define O_NDELAY	(0x0004 | O_NONBLOCK)
: #define O_NOCTTY	0x8000	/* not fcntl */
: #define O_LARGEFILE	0x40000
: #define O_DIRECT        0x100000 /* direct disk access hint */
:@@ -29,8 +29,7 @@
: #define F_UNLCK		3
:
: #define __ARCH_FLOCK_PAD	short __unused;
:-#define __ARCH_FLOCK64_PAD	short __unused;
:
: #include <asm-generic/fcntl.h>
:
:-#endif
:+#endif /* !(_SPARC64_FCNTL_H) */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:39 -07:00
Sam Ravnborg
e880e8701c sparc: Merge asm-sparc{,64}/sockios.h
:$ diff -u include/asm-sparc/sockios.h include/asm-sparc64/sockios.h
:-- include/asm-sparc/sockios.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/sockios.h	2008-06-13 06:42:07.000000000 +0200
:@@ -1,5 +1,5 @@
:-#ifndef _ASM_SPARC_SOCKIOS_H
:-#define _ASM_SPARC_SOCKIOS_H
:+#ifndef _ASM_SPARC64_SOCKIOS_H
:+#define _ASM_SPARC64_SOCKIOS_H
:
: /* Socket-level I/O control calls. */
: #define FIOSETOWN 	0x8901
:@@ -10,5 +10,5 @@
: #define SIOCGSTAMP	0x8906		/* Get stamp (timeval) */
: #define SIOCGSTAMPNS	0x8907		/* Get stamp (timespec) */
:
:-#endif /* !(_ASM_SPARC_SOCKIOS_H) */
:+#endif /* !(_ASM_SPARC64_SOCKIOS_H) */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:35 -07:00
Sam Ravnborg
c8b8be5427 sparc: Merge asm-sparc{,64}/socket.h
:$ diff -u include/asm-sparc/socket.h include/asm-sparc64/socket.h
:-- include/asm-sparc/socket.h	2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/socket.h	2008-06-13 06:46:39.000000000 +0200
:@@ -48,11 +48,10 @@
: #define SO_TIMESTAMPNS		0x0021
: #define SCM_TIMESTAMPNS		SO_TIMESTAMPNS
:
:-#define SO_MARK			0x0022
:-
: /* Security levels - as per NRL IPv6 - don't actually do anything */
: #define SO_SECURITY_AUTHENTICATION		0x5001
: #define SO_SECURITY_ENCRYPTION_TRANSPORT	0x5002
: #define SO_SECURITY_ENCRYPTION_NETWORK		0x5004
:
:+#define SO_MARK			0x0022
: #endif /* _ASM_SOCKET_H */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:27 -07:00
Sam Ravnborg
62e612f0ab sparc: Merge asm-sparc{,64}/poll.h
:$ diff -u include/asm-sparc/poll.h include/asm-sparc64/poll.h
:-- include/asm-sparc/poll.h	2008-06-13 06:42:07.000000000 +0200
:++ include/asm-sparc64/poll.h	2008-06-13 06:42:07.000000000 +0200
:@@ -1,5 +1,5 @@
:-#ifndef __SPARC_POLL_H
:-#define __SPARC_POLL_H
:+#ifndef __SPARC64_POLL_H
:+#define __SPARC64_POLL_H
:
: #define POLLWRNORM	POLLOUT
: #define POLLWRBAND	256

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:24 -07:00
Sam Ravnborg
4835bd988e sparc: Merge asm-sparc{,64}/param.h
:$ diff -u include/asm-sparc/param.h include/asm-sparc64/param.h
:-- include/asm-sparc/param.h	2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/param.h	2008-06-13 06:42:07.000000000 +0200
:@@ -1,5 +1,6 @@
:-#ifndef _ASMSPARC_PARAM_H
:-#define _ASMSPARC_PARAM_H
:+#ifndef _ASMSPARC64_PARAM_H
:+#define _ASMSPARC64_PARAM_H
:+
:
: #ifdef __KERNEL__
: # define HZ		CONFIG_HZ	/* Internal kernel timer frequency */
:@@ -19,4 +20,4 @@
:
: #define MAXHOSTNAMELEN	64	/* max length of hostname */
:
:-#endif
:+#endif /* _ASMSPARC64_PARAM_H */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:20 -07:00
Sam Ravnborg
f1ba03cac2 sparc: Merge asm-sparc{,64}/ioctls.h
Trivial differenses in comments - used the version from sparc64

:$ diff -u include/asm-sparc/ioctls.h include/asm-sparc64/ioctls.h
:-- include/asm-sparc/ioctls.h	2008-06-13 08:46:29.000000000 +0200
:++ include/asm-sparc64/ioctls.h	2008-06-13 08:46:29.000000000 +0200
:@@ -1,5 +1,5 @@
:-#ifndef _ASM_SPARC_IOCTLS_H
:-#define _ASM_SPARC_IOCTLS_H
:+#ifndef _ASM_SPARC64_IOCTLS_H
:+#define _ASM_SPARC64_IOCTLS_H
:
: #include <asm/ioctl.h>
:
:@@ -22,7 +22,7 @@
:
: /* Note that all the ioctls that are not available in Linux have a
:  * double underscore on the front to: a) avoid some programs to
:- * thing we support some ioctls under Linux (autoconfiguration stuff)
:+ * think we support some ioctls under Linux (autoconfiguration stuff)
:  */
: /* Little t */
: #define TIOCGETD	_IOR('t', 0, int)
:@@ -110,7 +110,7 @@
: #define TIOCSERGETLSR   0x5459 /* Get line status register */
: #define TIOCSERGETMULTI 0x545A /* Get multiport config  */
: #define TIOCSERSETMULTI 0x545B /* Set multiport config */
:-#define TIOCMIWAIT	0x545C /* Wait input */
:+#define TIOCMIWAIT	0x545C /* Wait for change on serial input line(s) */
: #define TIOCGICOUNT	0x545D /* Read serial port inline interrupt counts */
:
: /* Kernel definitions */
:@@ -133,4 +133,4 @@
: #define TIOCPKT_NOSTOP		16
: #define TIOCPKT_DOSTOP		32
:
:-#endif /* !(_ASM_SPARC_IOCTLS_H) */
:+#endif /* !(_ASM_SPARC64_IOCTLS_H) */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:16 -07:00
Sam Ravnborg
278864fac7 sparc: Merge asm-sparc{,64}/ioctl.h
:$ diff -u include/asm-sparc/ioctl.h include/asm-sparc64/ioctl.h
:-- include/asm-sparc/ioctl.h	2008-06-13 06:46:39.000000000 +0200
:++ include/asm-sparc64/ioctl.h	2008-06-13 08:46:29.000000000 +0200
:@@ -1,5 +1,5 @@
:-#ifndef _SPARC_IOCTL_H
:-#define _SPARC_IOCTL_H
:+#ifndef _SPARC64_IOCTL_H
:+#define _SPARC64_IOCTL_H
:
:/*
:* Our DIR and SIZE overlap in order to simulteneously provide
:@@ -64,4 +64,4 @@
:#define IOCSIZE_MASK    (_IOC_XSIZEMASK << _IOC_SIZESHIFT)
:#define IOCSIZE_SHIFT   (_IOC_SIZESHIFT)
:
:-#endif /* !(_SPARC_IOCTL_H) */
:+#endif /* !(_SPARC64_IOCTL_H) */

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:41:06 -07:00
Sam Ravnborg
09d3e1baa1 sparc: copy exported sparc64 specific header files to asm-sparc
Copy was done using the following simple script:

set -e
SPARC64="h display7seg.h envctrl.h psrcompat.h pstate.h uctx.h utrap.h watchdog.h"
for FILE in ${SPARC64}; do
	if [ -f asm-sparc/$FILE ]; then
		echo $FILE exist in asm-sparc
	fi
	cat asm-sparc64/$FILE > asm-sparc/$FILE
	printf "#include <asm-sparc/$FILE>\n" > asm-sparc64/$FILE
done

The name of the copied files are added to asm-sparc/Kbuild
to keep "make headers_check" functional.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2008-07-17 21:40:49 -07:00
David S. Miller
597631f2ea sparc64 Kbuild: apb.h and bbc.h should not be exported to userspace
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:38:47 -07:00
Sam Ravnborg
3f261e829f sparc: Merge include/asm-sparc{,64}/perfctr.h
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:38:42 -07:00
Sam Ravnborg
fc491d7da8 sparc: Merge include/asm-sparc{,64}/openpromio.h
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:38:36 -07:00
Adrian Bunk
d6eaadfbbd sparc: remove PROM_AP1000
This seems to be left from the long gone AP1000 support.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:38:24 -07:00
Adrian Bunk
908f5162ca sparc64/kernel/: make code static
This patch makes the following needlessly global code static:
- central.c: struct central_bus
- central.c: struct fhc_list
- central.c: apply_fhc_ranges()
- central.c: apply_central_ranges()
- ds.c: struct ds_states_template[]
- pci_msi.c: sparc64_setup_msi_irq()
- pci_msi.c: sparc64_teardown_msi_irq()
- pci_sun4v.c: struct sun4v_dma_ops
- sys_sparc32.c: cp_compat_stat64()

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:38:08 -07:00
Adrian Bunk
50215d6511 sparc/mm/: possible cleanups
This patch contains the following possible cleanups:
- make the following needlessly global code static:
  - fault.c: force_user_fault()
  - init.c: calc_max_low_pfn()
  - init.c: pgt_cache_water[]
  - init.c: map_high_region()
  - srmmu.c: hwbug_bitmask
  - srmmu.c: srmmu_swapper_pg_dir
  - srmmu.c: srmmu_context_table
  - srmmu.c: is_hypersparc
  - srmmu.c: srmmu_cache_pagetables
  - srmmu.c: srmmu_nocache_size
  - srmmu.c: srmmu_nocache_end
  - srmmu.c: srmmu_get_nocache()
  - srmmu.c: srmmu_free_nocache()
  - srmmu.c: srmmu_early_allocate_ptable_skeleton()
  - srmmu.c: srmmu_nocache_calcsize()
  - srmmu.c: srmmu_nocache_init()
  - srmmu.c: srmmu_alloc_thread_info()
  - srmmu.c: early_pgtable_allocfail()
  - srmmu.c: srmmu_early_allocate_ptable_skeleton()
  - srmmu.c: srmmu_allocate_ptable_skeleton()
  - srmmu.c: srmmu_inherit_prom_mappings()
  - sunami.S: tsunami_copy_1page
- remove the following unused code:
  - init.c: struct sparc_aliases

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:38:01 -07:00
Adrian Bunk
c61c65cdcd sparc/kernel/: possible cleanups
This patch contains the following possible cleanups:
- make the following needlessly global code static:
  - apc.c: apc_swift_idle()
  - ebus.c: ebus_blacklist_irq()
  - ebus.c: fill_ebus_child()
  - ebus.c: fill_ebus_device()
  - entry.S: syscall_is_too_hard
  - etra: tsetup_sun4c_stackchk
  - head.S: cputyp
  - head.S: prom_vector_p
  - idprom.c: Sun_Machines[]
  - ioport.c: _sparc_find_resource()
  - ioport.c: create_proc_read_entry()
  - irq.c: struct sparc_irq[]
  - rtrap.S: sun4c_rett_stackchk
  - setup.c: prom_sync_me()
  - setup.c: boot_flags
  - sun4c_irq.c: sun4c_sbint_to_irq()
  - sun4d_irq.c: sbus_tid[]
  - sun4d_irq.c: struct sbus_actions
  - sun4d_irq.c: sun4d_sbint_to_irq()
  - sun4m_irq.c: sun4m_sbint_to_irq()
  - sun4m_irq.c: sun4m_get_irqmask()
  - sun4m_irq.c: sun4m_timers
  - sun4m_smp.c: smp4m_cross_call()
  - sun4m_smp.c: smp4m_blackbox_id()
  - sun4m_smp.c: smp4m_blackbox_current()
  - time.c: sp_clock_typ
  - time.c: sbus_time_init()
  - traps.c: instruction_dump()
  - wof.S: spwin_sun4c_stackchk
  - wuf.S: sun4c_fwin_stackchk
- #if 0 the following unused code:
  - process.c: sparc_backtrace_lock
  - process.c: __show_backtrace()
  - process.c: show_backtrace()
  - process.c: smp_show_backtrace_all_cpus()
- remove the following unused code:
  - entry.S: __handle_exception
  - smp.c: smp_num_cpus
  - smp.c: smp_activated
  - smp.c: __cpu_number_map[]
  - smp.c: __cpu_logical_map[]
  - smp.c: bitops_spinlock
  - traps.c: trap_curbuf
  - traps.c: trapbuf[]
  - traps.c: linux_smp_still_initting
  - traps.c: thiscpus_tbr
  - traps.c: thiscpus_mid

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 21:37:46 -07:00
David S. Miller
93245dd6d3 pkt_sched: Don't used locked skb_queue_purge() in __qdisc_reset_queue()
We have to have exclusive access to the given qdisc anyways, so
doing even more locking is superfluous.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:32 -07:00
David S. Miller
8387400092 pkt_sched: Kill netdev_queue lock.
We can simply use the qdisc->q.lock for all of the
qdisc tree synchronization.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:30 -07:00
David S. Miller
c7e4f3bbb4 pkt_sched: Kill qdisc_lock_tree and qdisc_unlock_tree.
No longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:29 -07:00
David S. Miller
78a5b30b73 pkt_sched: Rework {sch,tbf}_tree_lock().
Make sch_tree_lock() lock the qdisc's root.  All of the
users hold the RTNL semaphore and the root qdisc is not
changing.

Implement tbf_tree_{lock,unlock}() simply in terms of
sch_tree_{lock,unlock}().

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:28 -07:00
David S. Miller
ead81cc5fc netdevice: Move qdisc_list back into net_device proper.
And give it it's own lock.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:26 -07:00
David S. Miller
37437bb2e1 pkt_sched: Schedule qdiscs instead of netdev_queue.
When we have shared qdiscs, packets come out of the qdiscs
for multiple transmit queues.

Therefore it doesn't make any sense to schedule the transmit
queue when logically we cannot know ahead of time the TX
queue of the SKB that the qdisc->dequeue() will give us.

Just for sanity I added a BUG check to make sure we never
get into a state where the noop_qdisc is scheduled.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:20 -07:00
David S. Miller
7698b4fcab pkt_sched: Add and use qdisc_root() and qdisc_root_lock().
When code wants to lock the qdisc tree state, the logic
operation it's doing is locking the top-level qdisc that
sits of the root of the netdev_queue.

Add qdisc_root_lock() to represent this and convert the
easiest cases.

In order for this to work out in all cases, we have to
hook up the noop_qdisc to a dummy netdev_queue.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:19 -07:00
David S. Miller
e2627c8c22 pkt_sched: Make QDISC_RUNNING a qdisc state.
Currently it is associated with a netdev_queue, but when we have
qdisc sharing that no longer makes any sense.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:18 -07:00
David S. Miller
d3b753db7c pkt_sched: Move gso_skb into Qdisc.
We liberate any dangling gso_skb during qdisc destruction.

It really only matters for the root qdisc.  But when qdiscs
can be shared by multiple netdev_queue objects, we can't
have the gso_skb in the netdev_queue any more.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:18 -07:00
David S. Miller
92831bc395 netdev: Kill plain netif_schedule()
No more users.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:16 -07:00
David S. Miller
51cb6db0f5 mac80211: Reimplement WME using ->select_queue().
The only behavior change is that we do not drop packets under any
circumstances.  If that is absolutely needed, we could easily add it
back.

With cleanups and help from Johannes Berg.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:12 -07:00
David S. Miller
eae792b722 netdev: Add netdev->select_queue() method.
Devices or device layers can set this to control the queue selection
performed by dev_pick_tx().

This function runs under RCU protection, which allows overriding
functions to have some way of synchronizing with things like dynamic
->real_num_tx_queues adjustments.

This makes the spinlock prefetch in dev_queue_xmit() a little bit
less effective, but that's the price right now for correctness.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:10 -07:00
David S. Miller
e3c50d5d25 netdev: netdev_priv() can now be sane again.
The private area of a netdev is now at a fixed offset once more.

Unfortunately, some assumptions that netdev_priv() == netdev->priv
crept back into the tree.  In particular this happened in the
loopback driver.  Make it use netdev->ml_priv.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:09 -07:00
David S. Miller
6b0fb1261a netdev: Kill struct net_device_subqueue and netdev->egress_subqueue*
No longer used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:08 -07:00
David S. Miller
fd2ea0a79f net: Use queue aware tests throughout.
This effectively "flips the switch" by making the core networking
and multiqueue-aware drivers use the new TX multiqueue structures.

Non-multiqueue drivers need no changes.  The interfaces they use such
as netif_stop_queue() degenerate into an operation on TX queue zero.
So everything "just works" for them.

Code that really wants to do "X" to all TX queues now invokes a
routine that does so, such as netif_tx_wake_all_queues(),
netif_tx_stop_all_queues(), etc.

pktgen and netpoll required a little bit more surgery than the others.

In particular the pktgen changes, whilst functional, could be largely
improved.  The initial check in pktgen_xmit() will sometimes check the
wrong queue, which is mostly harmless.  The thing to do is probably to
invoke fill_packet() earlier.

The bulk of the netpoll changes is to make the code operate solely on
the TX queue indicated by by the SKB queue mapping.

Setting of the SKB queue mapping is entirely confined inside of
net/core/dev.c:dev_pick_tx().  If we end up needing any kind of
special semantics (drops, for example) it will be implemented here.

Finally, we now have a "real_num_tx_queues" which is where the driver
indicates how many TX queues are actually active.

With IGB changes from Jeff Kirsher.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:07 -07:00
David S. Miller
1d8ae3fdeb pkt_sched: Remove RR scheduler.
This actually fixes a bug added by the RR scheduler changes.  The
->bands and ->prio2band parameters were being set outside of the
sch_tree_lock() and thus could result in strange behavior and
inconsistencies.

It might be possible, in the new design (where there will be one qdisc
per device TX queue) to allow similar functionality via a TX hash
algorithm for RR but I really see no reason to export this aspect of
how these multiqueue cards actually implement the scheduling of the
the individual DMA TX rings and the single physical MAC/PHY port.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:04 -07:00
David S. Miller
09e83b5d7d netdev: Kill NETIF_F_MULTI_QUEUE.
There is no need for a feature bit for something that
can be tested by simply checking the TX queue count.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:03 -07:00
David S. Miller
e8a0464cc9 netdev: Allocate multiple queues for TX.
alloc_netdev_mq() now allocates an array of netdev_queue
structures for TX, based upon the queue_count argument.

Furthermore, all accesses to the TX queues are now vectored
through the netdev_get_tx_queue() and netdev_for_each_tx_queue()
interfaces.  This makes it easy to grep the tree for all
things that want to get to a TX queue of a net device.

Problem spots which are not really multiqueue aware yet, and
only work with one queue, can easily be spotted by grepping
for all netdev_get_tx_queue() calls that pass in a zero index.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-17 19:21:00 -07:00
Dan Williams
2a46fa13d7 iop_adma: cleanup iop_chan_xor_slot_count
- use a table for iop13xx, trade text for data
- shrink the iop3xx to a cache line

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-17 17:59:56 -07:00
Dan Williams
0839875e0c async_tx: make async_tx_test_ack a boolean routine
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-17 17:59:56 -07:00
Dan Williams
3dce017137 async_tx: remove depend_tx from async_tx_sync_epilog
All callers of async_tx_sync_epilog have called async_tx_quiesce on the
depend_tx, so async_tx_sync_epilog need only call the callback to
complete the operation.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-17 17:59:55 -07:00
Dan Williams
d2c52b7983 async_tx: export async_tx_quiesce
Replace open coded "wait and acknowledge" instances with async_tx_quiesce.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2008-07-17 17:59:55 -07:00
Joel Becker
a6795e9ebb configfs: Allow ->make_item() and ->make_group() to return detailed errors.
The configfs operations ->make_item() and ->make_group() currently
return a new item/group.  A return of NULL signifies an error.  Because
of this, -ENOMEM is the only return code bubbled up the stack.

Multiple folks have requested the ability to return specific error codes
when these operations fail.  This patch adds that ability by changing the
->make_item/group() ops to return ERR_PTR() values.  These errors are
bubbled up appropriately.  NULL returns are changed to -ENOMEM for
compatibility.

Also updated are the in-kernel users of configfs.

This is a rework of reverted commit 11c3b79218.

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2008-07-17 15:21:29 -07:00
Ingo Molnar
393d81aa02 Merge branch 'linus' into xen-64bit 2008-07-17 23:57:20 +02:00
Joel Becker
f89ab8619e Revert "configfs: Allow ->make_item() and ->make_group() to return detailed errors."
This reverts commit 11c3b79218.  The code
will move to PTR_ERR().

Signed-off-by: Joel Becker <joel.becker@oracle.com>
2008-07-17 14:53:48 -07:00
H. Peter Anvin
4fdf08b5bf x86: unify and correct the GDT_ENTRY() macro
Merge the GDT_ENTRY() macro between arch/x86/boot/pm.c and
arch/x86/kernel/acpi/sleep.c and put the new one in
<asm-x86/segment.h>.

While we're at it, correct the bitmasks for the limit and flags.  The
new version relies on using ULL constants in order to cause type
promotion rather than explicit casts; this avoids having to include
<linux/types.h> in <asm-x86/segments.h>.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-17 11:29:24 -07:00
Dimitri Sivanich
8cac39b99b [IA64] Update ia64 mmr list for SGI uv
This patch updates the ia64 mmr list for SGI uv.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2008-07-17 11:27:05 -07:00
Linus Torvalds
5b664cb235 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
  [PATCH] ocfs2: fix oops in mmap_truncate testing
  configfs: call drop_link() to cleanup after create_link() failure
  configfs: Allow ->make_item() and ->make_group() to return detailed errors.
  configfs: Fix failing mkdir() making racing rmdir() fail
  configfs: Fix deadlock with racing rmdir() and rename()
  configfs: Make configfs_new_dirent() return error code instead of NULL
  configfs: Protect configfs_dirent s_links list mutations
  configfs: Introduce configfs_dirent_lock
  ocfs2: Don't snprintf() without a format.
  ocfs2: Fix CONFIG_OCFS2_DEBUG_FS #ifdefs
  ocfs2/net: Silence build warnings on sparc64
  ocfs2: Handle error during journal load
  ocfs2: Silence an error message in ocfs2_file_aio_read()
  ocfs2: use simple_read_from_buffer()
  ocfs2: fix printk format warnings with OCFS2_FS_STATS=n
  [PATCH 2/2] ocfs2: Instrument fs cluster locks
  [PATCH 1/2] ocfs2: Add CONFIG_OCFS2_FS_STATS config option
2008-07-17 10:55:51 -07:00
Tony Luck
fca515fbfa Pull pvops into release branch 2008-07-17 10:53:37 -07:00
Linus Torvalds
2b04be7e8a Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix asm/e820.h for userspace inclusion
  x86: fix numaq_tsc_disable
  x86: fix kernel_physical_mapping_init() for large x86 systems
2008-07-17 10:38:59 -07:00
Rusty Russell
2567d71cc7 x86: fix asm/e820.h for userspace inclusion
asm-x86/e820.h is included from userspace.  'x86: make e820.c to have
common functions' (b79cd8f126) broke it:

	make -C Documentation/lguest
	cc -Wall -Wmissing-declarations -Wmissing-prototypes -O3 -I../../include
lguest.c  -lz -o lguest
	In file included from ../../include/asm-x86/bootparam.h:8,
	                 from lguest.c:45:
	../../include/asm/e820.h:66: error: expected ‘)’ before ‘start’
	../../include/asm/e820.h:67: error: expected ‘)’ before ‘start’
	../../include/asm/e820.h:68: error: expected ‘)’ before ‘start’
	../../include/asm/e820.h:72: error: expected ‘=’, ‘,’, ‘;’, ‘asm’
or ‘__attribute__’ before ‘e820_update_range’
	...

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-17 19:28:48 +02:00
Linus Torvalds
42fea1f385 Merge branch 'ptrace-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace
* 'ptrace-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace:
  fix dangling zombie when new parent ignores children
  do_wait: return security_task_wait() error code in place of -ECHILD
  ptrace children revamp
  do_wait reorganization
2008-07-17 09:15:23 -07:00
Jan Glauber
779e6e1c72 [S390] qdio: new qdio driver.
List of major changes:
- split qdio driver into several files
- seperation of thin interrupt code
- improved handling for multiple thin interrupt devices
- inbound and outbound processing now always runs in tasklet context
- significant less tasklet schedules per interrupt needed
- merged qebsm with non-qebsm handling
- cleanup qdio interface and added kerneldoc
- coding style

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Utz Bacher <utz.bacher@de.ibm.com>
Reviewed-by: Ursula Braun <braunu@de.ibm.com>
Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-07-17 17:22:10 +02:00
Adrian Bunk
626f311737 [S390] chsc headers userspace cleanup
Kernel headers shouldn't expose functions to userspace.

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-07-17 17:22:08 +02:00
Neil Horman
9a6d276e85 core: add stat to track unresolved discards in neighbor cache
in __neigh_event_send, if we have a neighbour entry which is in
NUD_INCOMPLETE state, we enqueue any outbound frames to that neighbour
to the neighbours arp_queue, which is default capped to a length of 3
skbs.  If that queue exceeds its set length, it will drop an skb on
the queue to enqueue the newly arrived skb.  This results in a drop
for which we have no statistics incremented.  This patch adds an
unresolved_discards stat to /proc/net/stat/ndisc_cache to track these
lost frames.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:50:49 -07:00
Pavel Emelyanov
ed88098e25 mib: add net to NET_ADD_STATS_USER
Done with NET_XXX_STATS macros :)

To be continued...

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:32:45 -07:00
Pavel Emelyanov
f2bf415cfe mib: add net to NET_ADD_STATS_BH
This one is tricky. 

The thing is that this macro is only used when killing tw buckets, 
but since this killer is promiscuous wrt to which net each particular
tw belongs to, I have to use it only when NET_NS is off. When the net
namespaces are on, I use the INET_INC_STATS_BH for each bucket.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:32:25 -07:00
Pavel Emelyanov
6f67c817fc mib: add net to NET_INC_STATS_USER
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:31:39 -07:00
Pavel Emelyanov
de0744af1f mib: add net to NET_INC_STATS_BH
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:31:16 -07:00
Pavel Emelyanov
4e6734447d mib: add net to NET_INC_STATS
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:30:14 -07:00
Pavel Emelyanov
5c52ba170f sock: add net to prot->enter_memory_pressure callback
The tcp_enter_memory_pressure calls NET_INC_STATS, but doesn't
have where to get the net from.

I decided to add a sk argument, not the net itself, only to factor
all the required sock_net(sk) calls inside the enter_memory_pressure 
callback itself.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:28:10 -07:00
Pavel Emelyanov
cf1100a7a4 mib: add net to TCP_ADD_STATS_USER
Now we're done with the TCP_XXX_STATS macros.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:27:38 -07:00
Pavel Emelyanov
74688e487a mib: add net to TCP_DEC_STATS
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:22:46 -07:00
Pavel Emelyanov
63231bddf6 mib: add net to TCP_INC_STATS_BH
Same as before - the sock is always there to get the net from,
but there are also some places with the net already saved on 
the stack.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:22:25 -07:00
Pavel Emelyanov
81cc8a75d9 mib: add net to TCP_INC_STATS
Fortunately (almost) all the TCP code has a sock to get the net from :)

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:22:04 -07:00
Pavel Emelyanov
a9c19329ec tcp: add net to tcp_mib_init
This one sets TCP MIBs after zeroing them, and thus requires
the net.

The existing single caller can use init_net (temporarily).

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:21:42 -07:00
Pavel Emelyanov
f10f84314d mib: drop unused TCP_XXX_STATS macros
TCP_INC_STATS_USER and TCP_ADD_STATS_BH are currently unused.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:21:20 -07:00
Pavel Emelyanov
c5346fe396 mib: add net to IP_ADD_STATS_BH
Very simple - only ip_evictor (fragments) requires such.
This patch ends up the IP_XXX_STATS patching.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:20:33 -07:00
Pavel Emelyanov
7c73a6faff mib: add net to IP_INC_STATS_BH
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:20:11 -07:00
Pavel Emelyanov
5e38e27044 mib: add net to IP_INC_STATS
All the callers already have either the net itself, or the place
where to get it from.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:19:49 -07:00
Pavel Emelyanov
c6f8f7e3bb mib: drop unused IP_INC_STATS_USER
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-16 20:19:26 -07:00
Roland McGrath
f470021adb ptrace children revamp
ptrace no longer fiddles with the children/sibling links, and the
old ptrace_children list is gone.  Now ptrace, whether of one's own
children or another's via PTRACE_ATTACH, just uses the new ptraced
list instead.

There should be no user-visible difference that matters.  The only
change is the order in which do_wait() sees multiple stopped
children and stopped ptrace attachees.  Since wait_task_stopped()
was changed earlier so it no longer reorders the children list, we
already know this won't cause any new problems.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-16 18:02:33 -07:00
Linus Torvalds
dc7c65db28 Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (72 commits)
  Revert "x86/PCI: ACPI based PCI gap calculation"
  PCI: remove unnecessary volatile in PCIe hotplug struct controller
  x86/PCI: ACPI based PCI gap calculation
  PCI: include linux/pm_wakeup.h for device_set_wakeup_capable
  PCI PM: Fix pci_prepare_to_sleep
  x86/PCI: Fix PCI config space for domains > 0
  Fix acpi_pm_device_sleep_wake() by providing a stub for CONFIG_PM_SLEEP=n
  PCI: Simplify PCI device PM code
  PCI PM: Introduce pci_prepare_to_sleep and pci_back_from_sleep
  PCI ACPI: Rework PCI handling of wake-up
  ACPI: Introduce new device wakeup flag 'prepared'
  ACPI: Introduce acpi_device_sleep_wake function
  PCI: rework pci_set_power_state function to call platform first
  PCI: Introduce platform_pci_power_manageable function
  ACPI: Introduce acpi_bus_power_manageable function
  PCI: make pci_name use dev_name
  PCI: handle pci_name() being const
  PCI: add stub for pci_set_consistent_dma_mask()
  PCI: remove unused arch pcibios_update_resource() functions
  PCI: fix pci_setup_device()'s sprinting into a const buffer
  ...

Fixed up conflicts in various files (arch/x86/kernel/setup_64.c,
arch/x86/pci/irq.c, arch/x86/pci/pci.h, drivers/acpi/sleep/main.c,
drivers/pci/pci.c, drivers/pci/pci.h, include/acpi/acpi_bus.h) from x86
and ACPI updates manually.
2008-07-16 17:25:46 -07:00
Kumar Gala
6cfd8990e2 powerpc: rework FSL Book-E PTE access and TLB miss
This converts the FSL Book-E PTE access and TLB miss handling to match
with the recent changes to 44x that introduce support for non-atomic PTE
operations in pgtable-ppc32.h and removes write back to the PTE from
the TLB miss handlers. In addition, the DSI interrupt code no longer
tries to fixup write permission, this is left to generic code, and
_PAGE_HWWRITE is gone.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-07-16 17:57:51 -05:00
Kumar Gala
b219108cba fs_enet: Remove !CONFIG_PPC_CPM_NEW_BINDING code
Now that arch/ppc is gone we always define CONFIG_PPC_CPM_NEW_BINDING so
we can remove all the code associated with !CONFIG_PPC_CPM_NEW_BINDING.

Also fixed some asm/of_platform.h to linux/of_platform.h (and of_device.h)

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-07-16 17:57:49 -05:00
Scott Wood
d87eb12785 gianfar: Add magic packet and suspend/resume support.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-07-16 17:57:47 -05:00
Andy Fleming
7e1cc9c55a powerpc: Fix a bunch of sparse warnings in the qe_lib
Mostly having to do with not marking things __iomem.  And some failure
to use appropriate accessors to read MMIO regs.

Signed-off-by: Andy Fleming <afleming@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-07-16 17:57:45 -05:00
Scott Wood
d49747bdfb powerpc/mpc83xx: Power Management support
Basic PM support for 83xx.  Standby is implemented as sleep.
Suspend-to-RAM is implemented as "deep sleep" (with the processor
turned off) on 831x.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2008-07-16 17:57:30 -05:00
Linus Torvalds
8a0ca91e1d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc: (68 commits)
  sdio_uart: Fix SDIO break control to now return success or an error
  mmc: host driver for Ricoh Bay1Controllers
  sdio: sdio_io.c Fix sparse warnings
  sdio: fix the use of hard coded timeout value.
  mmc: OLPC: update vdd/powerup quirk comment
  mmc: fix spares errors of sdhci.c
  mmc: remove multiwrite capability
  wbsd: fix bad dma_addr_t conversion
  atmel-mci: Driver for Atmel on-chip MMC controllers
  mmc: fix sdio_io sparse errors
  mmc: wbsd.c fix shadowing of 'dma' variable
  MMC: S3C24XX: Refuse incorrectly aligned transfers
  MMC: S3C24XX: Add maintainer entry
  MMC: S3C24XX: Update error debugging.
  MMC: S3C24XX: Add media presence test to request handling.
  MMC: S3C24XX: Fix use of msecs where jiffies are needed
  MMC: S3C24XX: Add MODULE_ALIAS() entries for the platform devices
  MMC: S3C24XX: Fix s3c2410_dma_request() return code check.
  MMC: S3C24XX: Allow card-detect on non-IRQ capable pin
  MMC: S3C24XX: Ensure host->mrq->data is valid
  ...

Manually fixed up bogus executable bits on drivers/mmc/core/sdio_io.c
and include/linux/mmc/sdio_func.h when merging.
2008-07-16 15:17:52 -07:00
Linus Torvalds
9c1be0c471 Merge branch 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6
* 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6:
  UBIFS: include to compilation
  UBIFS: add new flash file system
  UBIFS: add brief documentation
  MAINTAINERS: add UBIFS section
  do_mounts: allow UBI root device name
  VFS: export sync_sb_inodes
  VFS: move inode_lock into sync_sb_inodes
2008-07-16 15:02:57 -07:00
Linus Torvalds
42fdd144a4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits)
  IDE: Report errors during drive reset back to user space
  Update documentation of HDIO_DRIVE_RESET ioctl
  IDE: Remove unused code
  IDE: Fix HDIO_DRIVE_RESET handling
  hd.c: remove the #include <linux/mc146818rtc.h>
  update the BLK_DEV_HD help text
  move ide/legacy/hd.c to drivers/block/
  ide/legacy/hd.c: use late_initcall()
  remove BLK_DEV_HD_ONLY
  ide: endian annotations in ide-floppy.c
  ide-floppy: zero out the whole struct ide_atapi_pc on init
  ide-floppy: fold idefloppy_create_test_unit_ready_cmd into idefloppy_open
  ide-cd: move request prep chunk from cdrom_do_newpc_cont to rq issue path
  ide-cd: move request prep from cdrom_start_rw_cont to rq issue path
  ide-cd: move request prep from cdrom_start_seek_continuation to rq issue path
  ide-cd: fold cdrom_start_seek into ide_cd_do_request
  ide-cd: simplify request issuing path
  ide-cd: mv ide_do_rw_cdrom ide_cd_do_request
  ide-cd: cdrom_start_seek: remove unused argument block
  ide-cd: ide_do_rw_cdrom: add the catch-all bad request case to the if-else block
  ...
2008-07-16 14:53:54 -07:00
Linus Torvalds
4314652bb4 Merge branch 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-merge-2.6
* 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-merge-2.6: (87 commits)
  Fix FADT parsing
  Add the ability to reset the machine using the RESET_REG in ACPI's FADT table.
  ACPI: use dev_printk when possible
  PNPACPI: add support for HP vendor-specific CCSR descriptors
  PNP: avoid legacy IDE IRQs
  PNP: convert resource options to single linked list
  ISAPNP: handle independent options following dependent ones
  PNP: remove extra 0x100 bit from option priority
  PNP: support optional IRQ resources
  PNP: rename pnp_register_*_resource() local variables
  PNPACPI: ignore _PRS interrupt numbers larger than PNP_IRQ_NR
  PNP: centralize resource option allocations
  PNP: remove redundant pnp_can_configure() check
  PNP: make resource assignment functions return 0 (success) or -EBUSY (failure)
  PNP: in debug resource dump, make empty list obvious
  PNP: improve resource assignment debug
  PNP: increase I/O port & memory option address sizes
  PNP: introduce pnp_irq_mask_t typedef
  PNP: make resource option structures private to PNP subsystem
  PNP: define PNP-specific IORESOURCE_IO_* flags alongside IRQ, DMA, MEM
  ...
2008-07-16 14:52:12 -07:00
Martin K. Petersen
d442cc44c0 block: Trivial fix for blk_integrity_rq()
Fail integrity check gracefully when request does not have a bio
attached (BLOCK_PC).

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-16 14:51:41 -07:00
Linus Torvalds
8df1b049bc Merge git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (82 commits)
  NFSv4: Remove BKL from the nfsv4 state recovery
  SUNRPC: Remove the BKL from the callback functions
  NFS: Remove BKL from the readdir code
  NFS: Remove BKL from the symlink code
  NFS: Remove BKL from the sillydelete operations
  NFS: Remove the BKL from the rename, rmdir and unlink operations
  NFS: Remove BKL from NFS lookup code
  NFS: Remove the BKL from nfs_link()
  NFS: Remove the BKL from the inode creation operations
  NFS: Remove BKL usage from open()
  NFS: Remove BKL usage from the write path
  NFS: Remove the BKL from the permission checking code
  NFS: Remove attribute update related BKL references
  NFS: Remove BKL requirement from attribute updates
  NFS: Protect inode->i_nlink updates using inode->i_lock
  nfs: set correct fl_len in nlmclnt_test()
  SUNRPC: Support registering IPv6 interfaces with local rpcbind daemon
  SUNRPC: Refactor rpcb_register to make rpcbindv4 support easier
  SUNRPC: None of rpcb_create's callers wants a privileged source port
  SUNRPC: Introduce a specific rpcb_create for contacting localhost
  ...
2008-07-16 14:49:49 -07:00
Aaron Durbin
4d3870431d Add the ability to reset the machine using the RESET_REG in ACPI's FADT table.
Signed-off-by: Aaron Durbin <adurbin@google.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2008-07-16 23:27:08 +02:00
Bjorn Helgaas
1f32ca31e7 PNP: convert resource options to single linked list
ISAPNP, PNPBIOS, and ACPI describe the "possible resource settings" of
a device, i.e., the possibilities an OS bus driver has when it assigns
I/O port, MMIO, and other resources to the device.

PNP used to maintain this "possible resource setting" information in
one independent option structure and a list of dependent option
structures for each device.  Each of these option structures had lists
of I/O, memory, IRQ, and DMA resources, for example:

  dev
    independent options
      ind-io0  -> ind-io1  ...
      ind-mem0 -> ind-mem1 ...
      ...
    dependent option set 0
      dep0-io0  -> dep0-io1  ...
      dep0-mem0 -> dep0-mem1 ...
      ...
    dependent option set 1
      dep1-io0  -> dep1-io1  ...
      dep1-mem0 -> dep1-mem1 ...
      ...
    ...

This data structure was designed for ISAPNP, where the OS configures
device resource settings by writing directly to configuration
registers.  The OS can write the registers in arbitrary order much
like it writes PCI BARs.

However, for PNPBIOS and ACPI devices, the OS uses firmware interfaces
that perform device configuration, and it is important to pass the
desired settings to those interfaces in the correct order.  The OS
learns the correct order by using firmware interfaces that return the
"current resource settings" and "possible resource settings," but the
option structures above doesn't store the ordering information.

This patch replaces the independent and dependent lists with a single
list of options.  For example, a device might have possible resource
settings like this:

  dev
    options
      ind-io0 -> dep0-io0 -> dep1->io0 -> ind-io1 ...

All the possible settings are in the same list, in the order they
come from the firmware "possible resource settings" list.  Each entry
is tagged with an independent/dependent flag.  Dependent entries also
have a "set number" and an optional priority value.  All dependent
entries must be assigned from the same set.  For example, the OS can
use all the entries from dependent set 0, or all the entries from
dependent set 1, but it cannot mix entries from set 0 with entries
from set 1.

Prior to this patch PNP didn't keep track of the order of this list,
and it assigned all independent options first, then all dependent
ones.  Using the example above, that resulted in a "desired
configuration" list like this:

  ind->io0 -> ind->io1 -> depN-io0 ...

instead of the list the firmware expects, which looks like this:

  ind->io0 -> depN-io0 -> ind-io1 ...

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:07 +02:00
Bjorn Helgaas
d5ebde6ef5 PNP: support optional IRQ resources
This patch adds an IORESOURCE_IRQ_OPTIONAL flag for use when
assigning resources to a device.  If the flag is set and we are
unable to assign an IRQ to the device, we can leave the IRQ
disabled but allow the overall resource allocation to succeed.

Some devices request an IRQ, but can run without an IRQ
(possibly with degraded performance).  This flag lets us run
the device without the IRQ instead of just leaving the
device disabled.

This is a reimplementation of this previous change by Rene
Herman <rene.herman@gmail.com>:
    http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=3b73a223661ed137c5d3d2635f954382e94f5a43

I reimplemented this for two reasons:
    - to prepare for converting all resource options into a single linked
      list, as opposed to the per-resource-type lists we have now, and
    - to preserve the order and number of resource options.

In PNPBIOS and ACPI, we configure a device by giving firmware a
list of resource assignments.  It is important that this list
has exactly the same number of resources, in the same order,
as the "template" list we got from the firmware in the first
place.

The problem of a sound card MPU401 being left disabled for want of
an IRQ was reported by Uwe Bugla <uwe.bugla@gmx.de>.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:07 +02:00
Bjorn Helgaas
a1802c4295 PNP: make resource option structures private to PNP subsystem
Nothing outside the PNP subsystem should need access to a
device's resource options, so this patch moves the option
structure declarations to a private header file.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:06 +02:00
Bjorn Helgaas
08c9f262f2 PNP: define PNP-specific IORESOURCE_IO_* flags alongside IRQ, DMA, MEM
PNP previously defined PNP_PORT_FLAG_16BITADDR and PNP_PORT_FLAG_FIXED
in a private header file, but put those flags in struct resource.flags
fields.  Better to make them IORESOURCE_IO_* flags like the existing
IRQ, DMA, and MEM flags.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:06 +02:00
Bjorn Helgaas
57fd51a8be PNP: add pnp_possible_config() -- can a device could be configured this way?
As part of a heuristic to identify modem devices, 8250_pnp.c
checks to see whether a device can be configured at any of the
legacy COM port addresses.

This patch moves the code that traverses the PNP "possible resource
options" from 8250_pnp.c to the PNP subsystem.  This encapsulation
is important because a future patch will change the implementation
of those resource options.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:06 +02:00
Bjorn Helgaas
aee3ad815d PNP: replace pnp_resource_table with dynamically allocated resources
PNP used to have a fixed-size pnp_resource_table for tracking the
resources used by a device.  This table often overflowed, so we've
had to increase the table size, which wastes memory because most
devices have very few resources.

This patch replaces the table with a linked list of resources where
the entries are allocated on demand.

This removes messages like these:

    pnpacpi: exceeded the max number of IO resources
    00:01: too many I/O port resources

References:

    http://bugzilla.kernel.org/show_bug.cgi?id=9535
    http://bugzilla.kernel.org/show_bug.cgi?id=9740
    http://lkml.org/lkml/2007/11/30/110

This patch also changes the way PNP uses the IORESOURCE_UNSET,
IORESOURCE_AUTO, and IORESOURCE_DISABLED flags.

Prior to this patch, the pnp_resource_table entries used the flags
like this:

    IORESOURCE_UNSET
	This table entry is unused and available for use.  When this flag
	is set, we shouldn't look at anything else in the resource structure.
	This flag is set when a resource table entry is initialized.

    IORESOURCE_AUTO
	This resource was assigned automatically by pnp_assign_{io,mem,etc}().

	This flag is set when a resource table entry is initialized and
	cleared whenever we discover a resource setting by reading an ISAPNP
	config register, parsing a PNPBIOS resource data stream, parsing an
	ACPI _CRS list, or interpreting a sysfs "set" command.

	Resources marked IORESOURCE_AUTO are reinitialized and marked as
	IORESOURCE_UNSET by pnp_clean_resource_table() in these cases:

	    - before we attempt to assign resources automatically,
	    - if we fail to assign resources automatically,
	    - after disabling a device

    IORESOURCE_DISABLED
	Set by pnp_assign_{io,mem,etc}() when automatic assignment fails.
	Also set by PNPBIOS and PNPACPI for:

	    - invalid IRQs or GSI registration failures
	    - invalid DMA channels
	    - I/O ports above 0x10000
	    - mem ranges with negative length

After this patch, there is no pnp_resource_table, and the resource list
entries use the flags like this:

    IORESOURCE_UNSET
	This flag is no longer used in PNP.  Instead of keeping
	IORESOURCE_UNSET entries in the resource list, we remove
	entries from the list and free them.

    IORESOURCE_AUTO
	No change in meaning: it still means the resource was assigned
	automatically by pnp_assign_{port,mem,etc}(), but these functions
	now set the bit explicitly.

	We still "clean" a device's resource list in the same places,
	but rather than reinitializing IORESOURCE_AUTO entries, we
	just remove them from the list.

	Note that IORESOURCE_AUTO entries are always at the end of the
	list, so removing them doesn't reorder other list entries.
	This is because non-IORESOURCE_AUTO entries are added by the
	ISAPNP, PNPBIOS, or PNPACPI "get resources" methods and by the
	sysfs "set" command.  In each of these cases, we completely free
	the resource list first.

    IORESOURCE_DISABLED
	In addition to the cases where we used to set this flag, ISAPNP now
	adds an IORESOURCE_DISABLED resource when it reads a configuration
	register with a "disabled" value.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:05 +02:00
Bjorn Helgaas
20bfdbba72 PNP: make pnp_{port,mem,etc}_start(), et al work for invalid resources
Some callers use pnp_port_start() and similar functions without
making sure the resource is valid.  This patch makes us fall
back to returning the initial values if the resource is not
valid or not even present.

This mostly preserves the previous behavior, where we would just
return the initial values set by pnp_init_resource_table().  The
original 2.6.25 code didn't range-check the "bar", so it would
return garbage if the bar exceeded the table size.  This code
returns sensible values instead.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:05 +02:00
Zhao Yakui
da5e09a1b3 ACPI : Create "idle=nomwait" bootparam
"idle=nomwait" disables the use of the MWAIT
instruction from both C1 (C1_FFH) and deeper (C2C3_FFH)
C-states.

When MWAIT is unavailable, the BIOS and OS generally
negotiate to use the HALT instruction for C1,
and use IO accesses for deeper C-states.

This option is useful for power and performance
comparisons, and also to work around BIOS bugs
where broken MWAIT support is advertised.

http://bugzilla.kernel.org/show_bug.cgi?id=10807
http://bugzilla.kernel.org/show_bug.cgi?id=10914

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Li Shaohua <shaohua.li@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:05 +02:00
Zhao Yakui
c1e3b377ad ACPI: Create "idle=halt" bootparam
"idle=halt" limits the idle loop to using
the halt instruction.  No MWAIT, no IO accesses,
no C-states deeper than C1.

If something is broken in the idle code,
"idle=halt" is a less severe workaround
than "idle=poll" which disables all power savings.

Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:05 +02:00
Zhang Rui
71b58cbb0c ACPI: Enhance /sys/firmware/interrupts to allow enable/disable/clear from user-space
Allow users to enable/disable/clear a specific & valid GPE/Fixed Event
in user space.

This is useful for debugging, especially for some
interrupt storm issues.

All wakeup GPEs are disabled and they can not be enabled at runtime,
and we mark them as invalid.

All GPEs that don't have a _Lxx/_Exx method are marked as invalid.

All Fixed Events that don't have an event handler are marked as invalid
and they can't be enabled until an event handler is registered.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Ling Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore
9c9f6d052d ACPICA: Update version to 20080609
Update version to 20080609.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore
71d993e115 ACPICA: Cleanup debug operand dump mechanism
Eliminated unnecessary operands; eliminated use of negative index
in loop.  Operands now displayed in correct order, not backwards.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore
75e5b5fb77 ACPICA: Update disassembler for DMAR table changes
Now supports the 2007 intel Virtualization Technology for Directed
I/O specification.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore
19d0cfe9dd ACPICA: Update DMAR and SRAT table definitions
Synchronized tables with current specifications.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore
b25d2a470b ACPICA: Update version to 20080514
Update version to 20080514

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore
4b8ed63167 ACPICA: Add const qualifier for appropriate string constants
Mostly MODULE_NAME and printf format strings.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:04 +02:00
Bob Moore
67a119f990 ACPICA: Eliminate acpi_native_uint type v2
No longer needed; replaced mostly with u32, but also acpi_size
where a type that changes 32/64 bit on 32/64-bit platforms is
required.

v2: Fix a cast of a 32-bit int to a pointer in ACPI to avoid a compiler warning.
from David Howells

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:03 +02:00
Bob Moore
11f2a61ab4 ACPICA: Fix possible negative array index in acpi_ut_validate_exception
Added NULL fields to the exception string arrays to eliminate
the -1 subtraction on the SubStatus field.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:03 +02:00
Jan Beulich
6719561f9b ACPICA: Update tracking macros to reduce code/data size
Changed ACPI_MODULE_NAME and ACPI_FUNCTION_NAME to use arrays of
strings instead of pointers to static strings. Jan Beulich and
Bob Moore.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:03 +02:00
Bob Moore
c91d924e3a ACPICA: Fix for hang on GPE method invocation
Fixes problem where the new method argument count validation mechanism
will enter an infinite loop when a GPE method is dispatched.
Problem fixed be removing the obsolete code that passes GPE block
information to the notify handler via the control method parameter pointer.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:03 +02:00
Bob Moore
f3454ae810 ACPICA: Add argument count checking to control method invocation via acpi_evaluate_object
Error if too few arguments, warning if too many. This applies
only to external programmatic control method execution, not
method-to-method calls within the AML.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:03 +02:00
Rafael J. Wysocki
ebb12db51f Freezer: Introduce PF_FREEZER_NOSIG
The freezer currently attempts to distinguish kernel threads from
user space tasks by checking if their mm pointer is unset and it
does not send fake signals to kernel threads.  However, there are
kernel threads, mostly related to networking, that behave like
user space tasks and may want to be sent a fake signal to be frozen.

Introduce the new process flag PF_FREEZER_NOSIG that will be set
by default for all kernel threads and make the freezer only send
fake signals to the tasks having PF_FREEZER_NOSIG unset.  Provide
the set_freezable_with_signal() function to be called by the kernel
threads that want to be sent a fake signal for freezing.

This patch should not change the freezer's observable behavior.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:03 +02:00
David Brownell
2fe2de5f6c ACPI PM: acpi_pm_device_sleep_state() cleanup
Get rid of a superfluous acpi_pm_device_sleep_state() parameter.  The
only legitimate value of that parameter must be derived from the first
parameter, which is what all the callers already do.  (However, this
does not address the fact that ACPI still doesn't set up those flags.)

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-07-16 23:27:02 +02:00
Vegard Nossum
47c00d2bc2 ACPICA: fix mutex names in debug code.
Reorder the mutex names to match the preceding #defines

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:01 +02:00
Bob Moore
e38e8a0743 Make GPE disable more robust
Implemented another change for the GPE disable. We now perform a
read-change-write of the enable register instead of simply writing out the
cached enable mask. This will prevent inadvertent enabling of GPEs if a rogue
GPE is received during initialization (before GPE handlers are installed.)

http://bugzilla.kernel.org/show_bug.cgi?id=6217

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:01 +02:00
Mike Travis
706546d023 ACPI: change processors from array to per_cpu variable
Change processors from an array sized by NR_CPUS to a per_cpu variable.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2008-07-16 23:27:01 +02:00
Roland McGrath
380fdd7585 x86 ptrace: user-sets-TF nits
This closes some arcane holes in single-step handling that can arise
only when user programs set TF directly (via popf or sigreturn) and
then use vDSO (syscall/sysenter) system call entry.  In those entry
paths, the clear_TF_reenable case hits and we must check TIF_SINGLESTEP
to be sure our bookkeeping stays correct wrt the user's view of TF.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-16 12:15:17 -07:00
Roland McGrath
d4d6715016 x86 ptrace: unify syscall tracing
This unifies and cleans up the syscall tracing code on i386 and x86_64.

Using a single function for entry and exit tracing on 32-bit made the
do_syscall_trace() into some terrible spaghetti.  The logic is clear and
simple using separate syscall_trace_enter() and syscall_trace_leave()
functions as on 64-bit.

The unification adds PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP support
on x86_64, for 32-bit ptrace() callers and for 64-bit ptrace() callers
tracing either 32-bit or 64-bit tasks.  It behaves just like 32-bit.

Changing syscall_trace_enter() to return the syscall number shortens
all the assembly paths, while adding the SYSEMU feature in a simple way.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-16 12:15:17 -07:00
Roland McGrath
64f0973319 x86 ptrace: unify TIF_SINGLESTEP
This unifies the treatment of TIF_SINGLESTEP on i386 and x86_64.
The bit is now excluded from _TIF_WORK_MASK on i386 as it has been
on x86_64.  This means the do_notify_resume() path using it is never
used, so TIF_SINGLESTEP is not cleared on returning to user mode.

Both now leave TIF_SINGLESTEP set when returning to user, so that
it's already set on an int $0x80 system call entry.  This removes
the need for testing TF on the system_call path.  Doing it this way
fixes the regression for PTRACE_SINGLESTEP into a sigreturn syscall,
introduced by commit 1e2e99f0e4.

The clear_TF_reenable case that sets TIF_SINGLESTEP can only happen
on a non-exception kernel entry, i.e. sysenter/syscall instruction.
That will always get to the syscall exit tracing path.

Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-16 12:15:16 -07:00
Elias Oltmanns
3ef5eb424e IDE: Remove unused code
Remove some code which has been made obsolete and hasn't worked properly
before anyway.  Part of the infrastructure may be reintroduced in a
follow up patch to implement a working command aborting facility.

Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Cc: "Alan Cox" <alan@lxorguk.ukuu.org.uk>
Cc: "Randy Dunlap" <randy.dunlap@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:48 +02:00
Elias Oltmanns
79e36a9f54 IDE: Fix HDIO_DRIVE_RESET handling
Currently, the code path executing an HDIO_DRIVE_RESET ioctl is broken
in various ways.  Most importantly, it is treated as an out of band
request in an illegal way which may very likely lead to system lock ups.
Use the drive's request queue to avoid this problem (and fix a locking
issue for free along the way).

Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Cc: "Alan Cox" <alan@lxorguk.ukuu.org.uk>
Cc: "Randy Dunlap" <randy.dunlap@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:48 +02:00
Bartlomiej Zolnierkiewicz
e6d95bd149 ide: ->port_init_devs -> ->init_dev
Change ->port_init_devs method to take 'ide_drive_t *' as an argument
instead of 'ide_hwif_t *' and rename it to ->init_dev.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:42 +02:00
Bartlomiej Zolnierkiewicz
c56c5648a3 ide: set hwif->dev in ide_init_port_hw() (take 2)
* Add 'parent' field to hw_regs_t for optional parent device pointer (needed
  by macio PMAC IDE controllers) and set hwif->dev in ide_init_port_hw().

* Update au1xxx-ide.c, sgiioc4.c, pmac.c and setup-pci.c accordingly.

v2:

* Update scc_pata.c.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:40 +02:00
Bartlomiej Zolnierkiewicz
63b51c6d1d ide: make ide_hwifs[] static
Move ide_hwifs[] from ide.c to ide-probe.c and make it static.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:40 +02:00
Bartlomiej Zolnierkiewicz
9ad5409375 ide: move PIO blacklist to ide-pio-blacklist.c
Move PIO blacklist to ide-pio-blacklist.c.

While at it:

- fix comment

- fix whitespace damage

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:39 +02:00
Bartlomiej Zolnierkiewicz
3e153cfb5e ide: remove no longer used ide_pio_timings[]
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:39 +02:00
Bartlomiej Zolnierkiewicz
c9d6c1a237 ide: move ide_pio_cycle_time() to ide-timings.c
All ide_pio_cycle_time() users already select CONFIG_IDE_TIMINGS
so move the function from ide-lib.c to ide-timings.c.

While at it:

- convert ide_pio_cycle_time() to use ide_timing_find_mode()

- cleanup ide_pio_cycle_time() a bit

There should be no functional changes caused by this patch.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:39 +02:00
Bartlomiej Zolnierkiewicz
f06ab3402a ide: convert ide-timing.h to ide-timings.c library (take 2)
* Don't include ide-timing.h in cs5535 and sis5513 host drivers
  (they don't need it currently).

* Convert ide-timing.h to ide-timings.c library and add CONFIG_IDE_TIMINGS
  config option to be selected by host drivers using the library.

While at it:

- fix ide_timing_find_mode() placement

v2:
* Add missing EXPORT_SYMBOLs. (Stephen Rothwell <sfr@canb.auug.org.au>)

There should be no functional changes caused by this patch.

Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:37 +02:00
Bartlomiej Zolnierkiewicz
3be53f3f21 ide: move some bits from ide-timing.h to <linux/ide.h>
Move struct ide_timing and IDE_TIMING_* defines to <linux/ide.h>
from drivers/ide/ide-timing.h.

While at it:

- use u8/u16 instead of short for struct ide_timing fields

- use enum for IDE_TIMING_*

There should be no functional changes caused by this patch.

Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-16 20:33:36 +02:00
Ingo Molnar
4bb689eee1 x86: paravirt spinlocks, !CONFIG_SMP build fixes
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:15:53 +02:00
Jeremy Fitzhardinge
2d9e1e2f58 xen: implement Xen-specific spinlocks
The standard ticket spinlocks are very expensive in a virtual
environment, because their performance depends on Xen's scheduler
giving vcpus time in the order that they're supposed to take the
spinlock.

This implements a Xen-specific spinlock, which should be much more
efficient.

The fast-path is essentially the old Linux-x86 locks, using a single
lock byte.  The locker decrements the byte; if the result is 0, then
they have the lock.  If the lock is negative, then locker must spin
until the lock is positive again.

When there's contention, the locker spin for 2^16[*] iterations waiting
to get the lock.  If it fails to get the lock in that time, it adds
itself to the contention count in the lock and blocks on a per-cpu
event channel.

When unlocking the spinlock, the locker looks to see if there's anyone
blocked waiting for the lock by checking for a non-zero waiter count.
If there's a waiter, it traverses the per-cpu "lock_spinners"
variable, which contains which lock each CPU is waiting on.  It picks
one CPU waiting on the lock and sends it an event to wake it up.

This allows efficient fast-path spinlock operation, while allowing
spinning vcpus to give up their processor time while waiting for a
contended lock.

[*] 2^16 iterations is threshold at which 98% locks have been taken
according to Thomas Friebel's Xen Summit talk "Preventing Guests from
Spinning Around".  Therefore, we'd expect the lock and unlock slow
paths will only be entered 2% of the time.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:15:53 +02:00
Jeremy Fitzhardinge
8efcbab674 paravirt: introduce a "lock-byte" spinlock implementation
Implement a version of the old spinlock algorithm, in which everyone
spins waiting for a lock byte.  In order to be compatible with the
ticket-lock's use of a zero initializer, this uses the convention of
'0' for unlocked and '1' for locked.

This algorithm is much better than ticket locks in a virtual
envionment, because it doesn't interact badly with the vcpu scheduler.
If there are multiple vcpus spinning on a lock and the lock is
released, the next vcpu to be scheduled will take the lock, rather
than cycling around until the next ticketed vcpu gets it.

To use this, you must call paravirt_use_bytelocks() very early, before
any spinlocks have been taken.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:15:53 +02:00
Jeremy Fitzhardinge
74d4affde8 x86/paravirt: add hooks for spinlock operations
Ticket spinlocks have absolutely ghastly worst-case performance
characteristics in a virtual environment.  If there is any contention
for physical CPUs (ie, there are more runnable vcpus than cpus), then
ticket locks can cause the system to end up spending 90+% of its time
spinning.

The problem is that (v)cpus waiting on a ticket spinlock will be
granted access to the lock in strict order they got their tickets.  If
the hypervisor scheduler doesn't give the vcpus time in that order,
they will burn timeslices waiting for the scheduler to give the right
vcpu some time.  In the worst case it could take O(n^2) vcpu scheduler
timeslices for everyone waiting on the lock to get it, not counting
new cpus trying to take the lock while the log-jam is sorted out.

These hooks allow a paravirt backend to replace the spinlock
implementation.

At the very least, this could revert the implementation back to the
old lock algorithm, which allows the next scheduled vcpu to take the
lock, and has basically fairly good performance.

It also allows the spinlocks to take advantages of the hypervisor
features to make locks more efficient (spin and block, for example).

The cost to native execution is an extra direct call when using a
spinlock function.  There's no overhead if CONFIG_PARAVIRT is turned
off.

The lock structure is fixed at a single "unsigned int", initialized to
zero, but the spinlock implementation can use it as it wishes.

Thanks to Thomas Friebel's Xen Summit talk "Preventing Guests from
Spinning Around" for pointing out this problem.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <clameter@linux-foundation.org>
Cc: Petr Tesarik <ptesarik@suse.cz>
Cc: Virtualization <virtualization@lists.linux-foundation.org>
Cc: Xen devel <xen-devel@lists.xensource.com>
Cc: Thomas Friebel <thomas.friebel@amd.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:15:52 +02:00
Jeremy Fitzhardinge
6a52e4b1cd x86_64: further cleanup of 32-bit compat syscall mechanisms
AMD only supports "syscall" from 32-bit compat usermode.
Intel and Centaur(?) only support "sysenter" from 32-bit compat usermode.

Set the X86 feature bits accordingly, and set up the vdso in
accordance with those bits.  On the offchance we run on in a 64-bit
environment which supports neither syscall nor sysenter from 32-bit
mode, then fall back to the int $0x80 vdso.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-07-16 11:08:27 +02:00
Ingo Molnar
9c8a442044 xen64: fix !HVC_XEN build dependency
fix:

arch/x86/xen/built-in.o: In function `set_page_prot':
enlighten.c:(.text+0x111d): undefined reference to `xen_raw_printk'
arch/x86/xen/built-in.o: In function `xen_start_kernel':
: undefined reference to `xen_raw_console_write'
arch/x86/xen/built-in.o: In function `xen_start_kernel':
: undefined reference to `xen_raw_console_write'

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:06:48 +02:00
Jeremy Fitzhardinge
c24481e9da xen64: save lots of registers
The Xen hypercall interface is allowed to trash any or all of the
argument registers, so we need to be careful that the kernel state
isn't damaged.  On 32-bit kernels, the hypercall parameter registers
same as a regparm function call, so we've got away without explicit
clobbering so far.  The 64-bit ABI defines lots of caller-save
registers, so save them all for safety.  We can trim this set later by
re-distributing the responsibility for saving all these registers.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:05:23 +02:00
Jeremy Fitzhardinge
c05f1cfaba xen64: implement 64-bit update_descriptor
64-bit hypercall interface can pass a maddr in one argument rather
than splitting it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:05:09 +02:00
Eduardo Habkost
45eb0d8898 Xen64: HYPERVISOR_set_segment_base() implementation
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:03:31 +02:00
Jeremy Fitzhardinge
88459d4c7e xen64: register callbacks in arch-independent way
Use callback_op hypercall to register callbacks in a 32/64-bit
independent way (64-bit doesn't need a code segment, but that detail
is hidden in XEN_CALLBACK).

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:03:01 +02:00
Jeremy Fitzhardinge
ce803e705f xen64: use arbitrary_virt_to_machine for xen_set_pmd
When building initial pagetables in 64-bit kernel the pud/pmd pointer may
be in ioremap/fixmap space, so we need to walk the pagetable to look up the
physical address.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:01:17 +02:00
Jeremy Fitzhardinge
084a2a4e76 xen64: early mapping setup
Set up the initial pagetables to map the kernel mapping into the
physical mapping space.  This makes __va() usable, since it requires
physical mappings.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 11:00:07 +02:00
Jeremy Fitzhardinge
5b09b2876e x86_64: add workaround for no %gs-based percpu
As a stopgap until Mike Travis's x86-64 gs-based percpu patches are
ready, provide workaround functions for x86_read/write_percpu for
Xen's use.

Specifically, this means that we can't really make use of vcpu
placement, because we can't use a single gs-based memory access to get
to vcpu fields.  So disable all that for now.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:58:13 +02:00
Jeremy Fitzhardinge
f6e587325b xen64: add extra pv_mmu_ops
We need extra pv_mmu_ops for 64-bit, to deal with the extra level of
pagetable.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:57:16 +02:00
Jeremy Fitzhardinge
e74359028d xen64: fix calls into hypercall page
The 64-bit calling convention for hypercalls uses different registers
from 32-bit.  Annoyingly, gcc's asm syntax doesn't have a way to
specify one of the extra numeric reigisters in a constraint, so we
must use explicitly placed register variables.  Given that we have to
do it for some args, may as well do it for all.

Also fix syntax gcc generates for the call instruction itself.  We
need a plain direct call, but the asm expansion which works on 32-bit
generates a rip-relative addressing mode in 64-bit, which is treated
as an indirect call.  The alternative is to pass the hypercall page
offset into the asm, and have it add it to the hypercall page start
address to generate the call.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:57:00 +02:00
Jeremy Fitzhardinge
ca15f20f11 xen: fix 64-bit hypercall variants
64-bit guests can pass 64-bit quantities in a single argument,
so fix up the hypercalls.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:56:46 +02:00
Jeremy Fitzhardinge
48b5db2062 xen64: define asm/xen/interface for 64-bit
Copy 64-bit definitions of various interface structures into place.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:56:18 +02:00
Isaku Yamahata
ad55db9fed xen: add xen_arch_resume()/xen_timer_resume hook for ia64 support
add xen_timer_resume() hook.

Timer resume should be done after event channel is resumed.
add xen_arch_resume() hook when ipi becomes usable after resume.
After resume, some cpu specific resource must be reinitialized
on ia64 that can't be set by another cpu.

However available hooks is run once on only one cpu so that ipi has
to be used.

During stop_machine_run() ipi can't be used because interrupt is masked.
So add another hook after stop_machine_run().
Another approach might be use resume hook which is run by
device_resume(). However device_resume() may be executed on
suspend error recovery path.

So it is necessary to determine whether it is executed on real resume path
or error recovery path.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:55:50 +02:00
Jeremy Fitzhardinge
7c33b1e6ee x86_64: unstatic get_local_pda
This allows Xen's xen_cpu_up() to allocate a pda for the new CPU.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:55:07 +02:00
Eduardo Habkost
a312b37b2a x86/paravirt: call paravirt_pagetable_setup_{start, done}
Call paravirt_pagetable_setup_{start,done}

These paravirt_ops functions were not being called on x86_64.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 10:53:43 +02:00
Linus Torvalds
45158894d4 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (249 commits)
  powerpc: Fix pte_update for CONFIG_PTE_64BIT and !PTE_ATOMIC_UPDATES
  powerpc: Fix a build problem on ppc32 with new DMA_ATTRs
  ibm_newemac: Add MII mode support to the EMAC RGMII bridge.
  powerpc: Don't spin on sync instruction at boot time
  powerpc: Add VSX load/store alignment exception handler
  powerpc: fix giveup_vsx to save registers correctly
  powerpc: support for latencytop
  powerpc: Remove unnecessary condition when sanity-checking WIMG bits
  powerpc: Add PPC_FEATURE_PSERIES_PERFMON_COMPAT
  powerpc: Add driver for Barrier Synchronization Register
  powerpc: mman.h export fixups
  powerpc/fsl: update crypto node definition and device tree instances
  powerpc/fsl: Refactor device bindings
  powerpc/85xx: Minor fixes for 85xxds and 8536ds board.
  powerpc: Add 82xx/83xx/86xx to 6xx Multiplatform
  powerpc/85xx: publish of device for cds platforms
  powerpc/booke: don't reinitialize time base
  powerpc/86xx: Refactor pic init
  powerpc/CPM: Add i2c pins to dts and board setup
  cpm_uart: Support uart_wait_until_sent()
  ...
2008-07-15 19:04:58 -07:00
Linus Torvalds
89a93f2f48 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (102 commits)
  [SCSI] scsi_dh: fix kconfig related build errors
  [SCSI] sym53c8xx: Fix bogus sym_que_entry re-implementation of container_of
  [SCSI] scsi_cmnd.h: remove double inclusion of linux/blkdev.h
  [SCSI] make struct scsi_{host,target}_type static
  [SCSI] fix locking in host use of blk_plug_device()
  [SCSI] zfcp: Cleanup external header file
  [SCSI] zfcp: Cleanup code in zfcp_erp.c
  [SCSI] zfcp: zfcp_fsf cleanup.
  [SCSI] zfcp: consolidate sysfs things into one file.
  [SCSI] zfcp: Cleanup of code in zfcp_aux.c
  [SCSI] zfcp: Cleanup of code in zfcp_scsi.c
  [SCSI] zfcp: Move status accessors from zfcp to SCSI include file.
  [SCSI] zfcp: Small QDIO cleanups
  [SCSI] zfcp: Adapter reopen for large number of unsolicited status
  [SCSI] zfcp: Fix error checking for ELS ADISC requests
  [SCSI] zfcp: wait until adapter is finished with ERP during auto-port
  [SCSI] ibmvfc: IBM Power Virtual Fibre Channel Adapter Client Driver
  [SCSI] sg: Add target reset support
  [SCSI] lib: Add support for the T10 (SCSI) Data Integrity Field CRC
  [SCSI] sd: Move scsi_disk() accessor function to sd.h
  ...
2008-07-15 18:58:04 -07:00
Benjamin Herrenschmidt
84c3d4aaec Merge commit 'origin/master'
Manual merge of:

	arch/powerpc/Kconfig
	arch/powerpc/kernel/stacktrace.c
	arch/powerpc/mm/slice.c
	arch/ppc/kernel/smp.c
2008-07-16 11:07:59 +10:00
Trond Myklebust
e89e896d31 Merge branch 'devel' into next
Conflicts:

	fs/nfs/file.c

Fix up the conflict with Jon Corbet's bkl-removal tree
2008-07-15 18:34:16 -04:00
Ingo Molnar
82638844d9 Merge branch 'linus' into cpus4096
Conflicts:

	arch/x86/xen/smp.c
	kernel/sched_rt.c
	net/iucv/iucv.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-16 00:29:07 +02:00
Chuck Lever
c2e1b09ff2 SUNRPC: Support registering IPv6 interfaces with local rpcbind daemon
Introduce a new API to register RPC services on IPv6 interfaces to allow
the NFS server and lockd to advertise on IPv6 networks.

Unlike rpcb_register(), the new rpcb_v4_register() function uses rpcbind
protocol version 4 to contact the local rpcbind daemon.  The version 4
SET/UNSET procedures allow services to register address families besides
AF_INET, register at specific network interfaces, and register transport
protocols besides UDP and TCP.  All of this functionality is exposed via
the new rpcb_v4_register() kernel API.

A user-space rpcbind daemon implementation that supports version 4 of the
rpcbind protocol is required in order to make use of this new API.

Note that rpcbind version 3 is sufficient to support the new rpcbind
facilities listed above, but most extant implementations use version 4.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2008-07-15 18:08:55 -04:00
Linus Torvalds
7e2225d860 Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (54 commits)
  [MIPS] Remove mips_machtype for LASAT machines
  [MIPS] Remove mips_machtype from EMMA2RH machines
  [MIPS] Remove mips_machtype from ARC based machines
  [MIPS] MTX-1 flash partition setup move to platform devices registration
  [MIPS] TXx9: cleanup and fix some sparse warnings
  [MIPS] TXx9: rename asm-mips/mach-jmr3927 to asm-mips/mach-tx39xx
  [MIPS] remove machtype for group Toshiba
  [MIPS] separate rbtx4927_time_init() and rbtx4937_time_init()
  [MIPS] separate rbtx4927_arch_init() and rbtx4937_arch_init()
  [MIPS] txx9_cpu_clock setup move to rbtx4927_time_init()
  [MIPS] txx9_board_vec set directly without mips_machtype
  [MIPS] IP22: Add platform device for Indy volume buttons
  [MIPS] cmbvr4133: Remove support
  [MIPS] remove wrppmc_machine_power_off()
  [MIPS] replace inline assembler to cpu_wait()
  [MIPS] IP22/28: Add platform devices for HAL2
  [MIPS] TXx9: Update and merge defconfigs
  [MIPS] TXx9: Make single kernel can support multiple boards
  [MIPS] TXx9: Update defconfigs
  [MIPS] TXx9: Reorganize PCI code
  ...
2008-07-15 15:01:29 -07:00
Ingo Molnar
1e09481365 Merge branch 'linus' into core/softlockup
Conflicts:

	kernel/softlockup.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-15 23:12:58 +02:00
Linus Torvalds
59190f4213 Merge branch 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'generic-ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
  generic-ipi: more merge fallout
  generic-ipi: merge fix
  x86, visws: use mach-default/entry_arch.h
  x86, visws: fix generic-ipi build
  generic-ipi: fixlet
  generic-ipi: fix s390 build bug
  generic-ipi: fix linux-next tree build failure
  fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
  fix: "smp_call_function: get rid of the unused nonatomic/retry argument"
  fix "smp_call_function: get rid of the unused nonatomic/retry argument"
  on_each_cpu(): kill unused 'retry' parameter
  smp_call_function: get rid of the unused nonatomic/retry argument
  sh: convert to generic helpers for IPI function calls
  parisc: convert to generic helpers for IPI function calls
  mips: convert to generic helpers for IPI function calls
  m32r: convert to generic helpers for IPI function calls
  arm: convert to generic helpers for IPI function calls
  alpha: convert to generic helpers for IPI function calls
  ia64: convert to generic helpers for IPI function calls
  powerpc: convert to generic helpers for IPI function calls
  ...

Fix trivial conflicts due to rcu updates in kernel/rcupdate.c manually
2008-07-15 14:12:03 -07:00
Linus Torvalds
64fd52a520 Merge branch 'core/rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core/rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (23 commits)
  rcu classic: update qlen when cpu offline
  rcu: make rcutorture even more vicious: invoke RCU readers from irq handlers (timers)
  rcu: make quiescent rcutorture less power-hungry
  rcu, rcutorture: make quiescent rcutorture less power-hungry
  rcu: make rcutorture more vicious: reinstate boot-time testing
  rcu: make rcutorture more vicious: add stutter feature
  rcutorture: WARN_ON_ONCE(1) when detecting an error
  rcu: remove unused field struct rcu_data::rcu_tasklet
  Revert "prohibit rcutorture from being compiled into the kernel"
  rcu: fix nf_conntrack_helper.c build bug
  rculist.h: fix include in net/netfilter/nf_conntrack_netlink.c
  rcu: remove duplicated include in kernel/rcupreempt.c
  rcu: remove duplicated include in kernel/rcupreempt_trace.c
  RCU, rculist.h: fix list iterators
  rcu: fix rcu_try_flip_waitack_needed() to prevent grace-period stall
  rculist.h: use the rcu API
  rcu: split list.h and move rcu-protected lists into rculist.h
  sched: 1Q08 RCU doc update, add call_rcu_sched()
  rcu: add call_rcu_sched() and friends to rcutorture
  rcu: add rcu_barrier_sched() and rcu_barrier_bh()
  ...
2008-07-15 13:59:31 -07:00
Sebastian Siewior
fe1a6875fc mm: fix build on non-mmu machines
Commit 1ea0704e0d aka "mm: add a ptep_modify_prot transaction abstraction"

caused:

|  CC      init/main.o
|In file included from include2/asm/pgtable.h:68,
|                 from /home/bigeasy/git/linux-2.6-m68k/include/linux/mm.h:39,
|                 from include2/asm/uaccess.h:8,
|                 from /home/bigeasy/git/linux-2.6-m68k/include/linux/poll.h:13,
|                 from /home/bigeasy/git/linux-2.6-m68k/include/linux/rtc.h:113,
|                 from /home/bigeasy/git/linux-2.6-m68k/include/linux/efi.h:19,
|                 from /home/bigeasy/git/linux-2.6-m68k/init/main.c:43:
|/linux-2.6/include/asm-generic/pgtable.h: In function '__ptep_modify_prot_start':
|/linux-2.6/include/asm-generic/pgtable.h:209: error: implicit declaration of function 'ptep_get_and_clear'
|/linux-2.6/include/asm-generic/pgtable.h:209: error: incompatible types in return
|/linux-2.6/include/asm-generic/pgtable.h: In function '__ptep_modify_prot_commit':
|/linux-2.6/include/asm-generic/pgtable.h:220: error: implicit declaration of function 'set_pte_at'
|make[2]: *** [init/main.o] Error 1
|make[1]: *** [init] Error 2
|make: *** [sub-make] Error 2

on my m68knommu box.

Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sebastian Siewior <bigeasy@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-15 13:58:40 -07:00
Chuck Lever
367c8c7bd9 lockd: Pass "struct sockaddr *" to new failover-by-IP function
Pass a more generic socket address type to nlmsvc_unlock_all_by_ip() to
allow for future support of IPv6.  Also provide additional sanity
checking in failover_unlock_ip() when constructing the server's IP
address.

As an added bonus, provide clean kerneldoc comments on related NLM
interfaces which were recently added.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-07-15 16:11:29 -04:00
Ingo Molnar
1a781a777b Merge branch 'generic-ipi' into generic-ipi-for-linus
Conflicts:

	arch/powerpc/Kconfig
	arch/s390/kernel/time.c
	arch/x86/kernel/apic_32.c
	arch/x86/kernel/cpu/perfctr-watchdog.c
	arch/x86/kernel/i8259_64.c
	arch/x86/kernel/ldt.c
	arch/x86/kernel/nmi_64.c
	arch/x86/kernel/smpboot.c
	arch/x86/xen/smp.c
	include/asm-x86/hw_irq_32.h
	include/asm-x86/hw_irq_64.h
	include/asm-x86/mach-default/irq_vectors.h
	include/asm-x86/mach-voyager/irq_vectors.h
	include/asm-x86/smp.h
	kernel/Makefile

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-15 21:55:59 +02:00
Linus Torvalds
22a37bcb78 Merge branch 'sbp2-spindown' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
* 'sbp2-spindown' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
  ieee1394: sbp2: spin disks down on suspend and shutdown
  firewire: fw-sbp2: spin disks down on suspend and shutdown
  ieee1394: sbp2: fix spindown for PL-3507 and TSB42AA9 firmwares
  firewire: fw-sbp2: fix spindown for PL-3507 and TSB42AA9 firmwares
  scsi: sd: optionally set power condition in START STOP UNIT
2008-07-15 12:39:44 -07:00
Ingo Molnar
6c9fcaf2ee Merge branch 'core/rcu' into core/rcu-for-linus 2008-07-15 21:10:12 +02:00
Jeff Layton
6cde4de807 lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_lock
nlmsvc_lock calls nlmsvc_lookup_host to find a nlm_host struct. The
callers of this function, however, call nlmsvc_retrieve_args or
nlm4svc_retrieve_args, which also return a nlm_host struct.

Change nlmsvc_lock to take a host arg instead of calling
nlmsvc_lookup_host itself and change the callers to pass a pointer to
the nlm_host they've already found.

Since nlmsvc_testlock() now just uses the caller's reference, we no
longer need to get or release it.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-07-15 14:53:33 -04:00
Jeff Layton
8f920d5e29 lockd: eliminate duplicate nlmsvc_lookup_host call from nlmsvc_testlock
nlmsvc_testlock calls nlmsvc_lookup_host to find a nlm_host struct. The
callers of this functions, however, call nlmsvc_retrieve_args or
nlm4svc_retrieve_args, which also return a nlm_host struct.

Change nlmsvc_testlock to take a host arg instead of calling
nlmsvc_lookup_host itself and change the callers to pass a pointer to
the nlm_host they've already found.

We take a reference to host in the place where nlmsvc_testlock()
previous did a new lookup, so the reference counting is unchanged from
before.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2008-07-15 14:26:52 -04:00
Linus Torvalds
b312bf359e Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  AHCI: Remove an unnecessary flush from ahci_qc_issue
  AHCI: speed up resume
  [libata] Add support for VPD page b1
  ata: endianness annotations in pata drivers
  libata-eh: update atapi_eh_request_sense() to take @dev instead of @qc
  [libata] sata_svw: update code comments relating to data corruption
  libata/ahci: enclosure management support
  libata: improve EH internal command timeout handling
  libata: use ULONG_MAX to terminate reset timeout table
  libata: improve EH retry delay handling
  libata: consistently use msecs for time durations
2008-07-15 11:18:10 -07:00
Linus Torvalds
dc221eae08 Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: (56 commits)
  i2c: Add detection capability to new-style drivers
  i2c: Call client_unregister for new-style devices too
  i2c: Clean up old chip drivers
  i2c-ibm_iic: Register child nodes
  i2c: New-style EEPROM driver using device IDs
  i2c: Export the i2c_bus_type symbol
  i2c-au1550: Fix PM support
  i2c-dev: Delete empty detach_client callback
  i2c: Drop stray references to lm_sensors
  i2c: Check for ACPI resource conflicts
  i2c-ocores: basic PM support
  i2c-sibyte: SWARM I2C board initialization
  i2c-i801: Fix handling of error conditions
  i2c-i801: Rename local variable temp to status
  i2c-i801: Properly report bus arbitration loss
  i2c-i801: Remove verbose debugging messages
  i2c-algo-pcf: Drop unused struct members
  i2c-algo-pcf: Multi-master lost-arbitration improvement
  i2c: Deprecate the legacy gpio drivers
  i2c-pxa: Initialize early
  ...
2008-07-15 11:16:05 -07:00
Linus Torvalds
98339cbd36 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (80 commits)
  ide-floppy: fix unfortunate function naming
  ide-tape: unify idetape_create_read/write_cmd
  ide: add ide_pc_intr() helper
  ide-{floppy,scsi}: read Status Register before stopping DMA engine
  ide-scsi: add more debugging to idescsi_pc_intr()
  ide-scsi: use pc->callback
  ide-floppy: add more debugging to idefloppy_pc_intr()
  ide-tape: always log debug info in idetape_pc_intr() if debugging is enabled
  ide-tape: add ide_tape_io_buffers() helper
  ide-tape: factor out DSC handling from idetape_pc_intr()
  ide-{floppy,tape}: move checking of ->failed_pc to ->callback
  ide: add ide_issue_pc() helper
  ide: add PC_FLAG_DRQ_INTERRUPT pc flag
  ide-scsi: move idescsi_map_sg() call out from idescsi_issue_pc()
  ide: add ide_transfer_pc() helper
  ide-scsi: set drive->scsi flag for devices handled by the driver
  ide-{cd,floppy,tape}: remove checking for drive->scsi
  ide: add PC_FLAG_ZIP_DRIVE pc flag
  ide-tape: factor out waiting for good ireason from idetape_transfer_pc()
  ide-tape: set PC_FLAG_DMA_IN_PROGRESS flag in idetape_transfer_pc()
  ...
2008-07-15 11:15:36 -07:00
Bartlomiej Zolnierkiewicz
646c0cb6c4 ide: add ide_pc_intr() helper
* ide-tape.c: add 'drive' argument to idetape_update_buffers().

* Add generic ide_pc_intr() helper to ide-atapi.c and then
  convert ide-{floppy,tape,scsi} device drivers to use it.

* ide-tape.c: remove no longer needed DBG_PC_INTR.

There should be no functional changes caused by this patch
(unless the debugging is explicitely compiled in).

Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:22:03 +02:00
Bartlomiej Zolnierkiewicz
6bf1641ca1 ide: add ide_issue_pc() helper
Add generic ide_issue_pc() helper to ide-atapi.c and then
convert ide-{floppy,tape,scsi} device drivers to use it.

There should be no functional changes caused by this patch.

Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:22:00 +02:00
Bartlomiej Zolnierkiewicz
28c7214bd8 ide: add PC_FLAG_DRQ_INTERRUPT pc flag
Add PC_FLAG_DRQ_INTERRUPT pc flag, set it in ide*_do_request()
and check for it (instead of checking for IDE*_FLAG_DRQ_INTERRUPT)
in ide*_issue_pc().  This is a preparation for adding generic
ide_issue_pc() helper.

There should be no functional changes caused by this patch.

Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:59 +02:00
Bartlomiej Zolnierkiewicz
594c16d8dd ide: add ide_transfer_pc() helper
* Add ide-atapi.c file for generic ATAPI support together with
  CONFIG_IDE_ATAPI config option.

* Add generic ide_transfer_pc() helper to ide-atapi.c and then
  convert ide-{floppy,tape,scsi} device drivers to use it.

There should be no functional changes caused by this patch.

Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:58 +02:00
Bartlomiej Zolnierkiewicz
5d41893c0f ide: add PC_FLAG_ZIP_DRIVE pc flag
Add PC_FLAG_ZIP_DRIVE pc flag, set it in idefloppy_do_request()
and check for it (instead of checking for IDEFLOPPY_FLAG_ZIP_DRIVE)
in idefloppy_transfer_pc().  This is a preparation for adding
generic ide_transfer_pc() helper.

There should be no functional changes caused by this patch.

Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:57 +02:00
Bartlomiej Zolnierkiewicz
5e33109582 ide-{floppy,tape}: PC_FLAG_DMA_RECOMMENDED -> PC_FLAG_DMA_OK
* Use PC_FLAG_DMA_OK flag instead of PC_FLAG_DMA_RECOMMENDED one.

* Remove no longer used PC_FLAG_DMA_RECOMMENDED flag.

There should be no functional changes caused by this patch.

Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:56 +02:00
Bartlomiej Zolnierkiewicz
1b06e92aa0 ide-{floppy,tape}: merge pc->idefloppy_callback and pc->idetape_callback
Merge pc->idefloppy_callback and pc->idetape_callback into pc->callback.

There should be no functional changes caused by this patch.

Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:56 +02:00
Bartlomiej Zolnierkiewicz
92f5daff2b ide-tape: make pc->idetape_callback void
There should be no functional changes caused by this patch.

Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:55 +02:00
FUJITA Tomonori
63f5abb095 ide: remove action argument in ide_do_drive_cmd
ide_do_drive_cmd is called only with ide_preempt action argument. So
we can remove the action argument in ide_do_drive_cmd and ide_action_t
typedef.

This patch also includes two minor cleanups: 1) ide_do_drive_cmd
always succeeds so we don't need the return value; 2) the callers use
blk_rq_init before ide_do_drive_cmd so there is no need to initialize
rq->errors.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:51 +02:00
Bartlomiej Zolnierkiewicz
ff07488346 ide: remove drive->ctl
Remove drive->ctl (it is always equal to 0x08 after init time).

While at it:

* Use ATA_DEVCTL_OBS define.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:50 +02:00
Bartlomiej Zolnierkiewicz
0fd04dcc2e ide: use ->OUTBSYNC in ide_set_irq()
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:50 +02:00
Bartlomiej Zolnierkiewicz
f8c4bd0ab2 ide: pass 'hwif *' instead of 'drive *' to ->OUTBSYNC method
There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:49 +02:00
Bartlomiej Zolnierkiewicz
1357214461 ide: remove ->mmio flag from ide_hwif_t
Since scc_pata host driver no longer uses IDE PCI layer / ide_dma_setup()
and all other ->mmio users set also IDE_HFLAG_MMIO host flag we can safely
remove ->mmio flag.

There should be no functional changes caused by this patch.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:49 +02:00
Bartlomiej Zolnierkiewicz
ed4af48fd6 ide: move IRQ unmasking out from ->tf_load method
Move IRQ unmasking out from ->tf_load method to its users.

There should be no functional changes caused by this patch
(SELECT_MASK() is NOP except for hpt366, icside and sgiioc4).

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:48 +02:00
Bartlomiej Zolnierkiewicz
9a410e79b5 ide: remove IDE_TFLAG_NO_SELECT_MASK taskfile flag
Always call SELECT_MASK(..., 0) in ide_tf_load() (needs to be done
to match ide_set_irq(..., 1)) and then remove IDE_TFLAG_NO_SELECT_MASK
taskfile flag.

This change should only affect hpt366 and icside host drivers since
->maskproc(..., 0) for sgiioc4 is equivalent to ide_set_irq(..., 1).

Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:48 +02:00
Bartlomiej Zolnierkiewicz
931ee0dc5c ide: remove obsoleted "ide=" kernel parameters
* Remove obsoleted "ide=" kernel parameters.

* Remove no longer needed:
  - ide_setup()
  - parse_options()
  - __setup("", ...)
  - module_param(options, ...)

* Use module_{init,exit}() for MODULE=y case and remove MODULE ifdef.

* Make ide_*acpi* and ide_doubler variables static.

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:47 +02:00
Bartlomiej Zolnierkiewicz
30e5ee4d1a ide: remove obsoleted "idebus=" kernel parameter
* Remove obsoleted "idebus=" kernel parameter.

* Remove no longer needed ide_system_bus_speed() and system_bus_clock()
  (together with idebus_parameter and system_bus_speed variables).

Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:46 +02:00
FUJITA Tomonori
681a561b7e block: unexport blk_end_sync_rq
All the users of blk_end_sync_rq has gone (they are converted to use
blk_execute_rq). This unexports blk_end_sync_rq.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:45 +02:00
FUJITA Tomonori
124cafc5eb ide: remove ide_init_drive_cmd
ide_init_drive_cmd just calls blk_rq_init. This converts the users of
ide_init_drive_cmd to use blk_rq_init directly and removes
ide_init_drive_cmd.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Borislav Petkov <petkovbb@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
2008-07-15 21:21:44 +02:00
Thomas Bogendoerfer
b27418aa55 [MIPS] Remove mips_machtype for LASAT machines
This is the LASAT part of the mips_machtype removal.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:39 +01:00
Thomas Bogendoerfer
0b56fd8c7a [MIPS] Remove mips_machtype from EMMA2RH machines
This is the EMMA2RH part of the mips_machtype removal.

[Ralf: Fixed to the #error statements]

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:39 +01:00
Thomas Bogendoerfer
c660729501 [MIPS] Remove mips_machtype from ARC based machines
This is the ARC part of the mips_machtype removal.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:38 +01:00
Atsushi Nemoto
7b22609442 [MIPS] TXx9: cleanup and fix some sparse warnings
* Do not return void value
* Make some functions static
* Do not include unnecessary bootinfo.h

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:38 +01:00
Atsushi Nemoto
4c642f3f5e [MIPS] TXx9: rename asm-mips/mach-jmr3927 to asm-mips/mach-tx39xx
Rename mach-jmr3927 directory to more proper name to make adding other
platforms easier.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:38 +01:00
Yoichi Yuasa
6e68665e51 [MIPS] remove machtype for group Toshiba
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Acked-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:38 +01:00
Yoichi Yuasa
efff4ae259 [MIPS] cmbvr4133: Remove support
It cannot be built for a long time and nobody maintains it.

Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:36 +01:00
Atsushi Nemoto
edcaf1a6a7 [MIPS] TXx9: Make single kernel can support multiple boards
Make single kernel can be used on RBTX4927/37/38.  Also make
some SoC-specific code independent from board-specific code.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:35 +01:00
Atsushi Nemoto
89d63fe179 [MIPS] TXx9: Reorganize PCI code
Split out PCIC dependent code and SoC dependent code from board dependent
code.  Now TX4927 PCIC code is independent from TX4927/TX4938 SoC code.
Also fix some build problems on CONFIG_PCI=n.

As a bonus, "FPCIB0 Backplane Support" is available for all TX39/TX49 boards
and PCI66 support is available for all TX49 boards.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:35 +01:00
Atsushi Nemoto
22b1d707ff [MIPS] TXx9: Reorganize code
Move arch/mips/{jmr3927,tx4927,tx4938} into arch/mips/txx9/ tree.
This will help more code sharing and maintainance.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:35 +01:00
Ralf Baechle
315806cb19 [MIPS] Malta: Cleanup organization of code into directories.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:34 +01:00
Ralf Baechle
1398ddb2eb [MIPS] SEAD: Remove support code.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:33 +01:00
Ralf Baechle
2157bc6871 [MIPS] Atlas: Remove support code.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:33 +01:00
Atsushi Nemoto
b29eee4935 [MIPS] rbtx4927: misc cleanups
* Merge tx4927_pci.h into tx4927.h
* Kill (broken) external PCI clock frequency reporting
* Kill unnecessary wbflush()
* Kill unnecessary includes
* Kill debug garbages

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:32 +01:00
Atsushi Nemoto
af3e69cfc9 [MIPS] Declare some pci variables in header file
Declare pci_probe_only, etc. in asm-mips/pci.h file.  This will fix
some sparse warnings.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:32 +01:00
Thomas Bogendoerfer
7a2852e49f [MIPS] IP28: switch to "normal" mode after PROM no longer needed
SGI-IP28 is running in so called slow mode, when kernel is started
from the PROM. PROM calls must be done in slow mode otherwise the
PROM will issue an error. To get better memory performance we now
switch to normal mode, when the PROM is no longer needed.

Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:32 +01:00
David Daney
94daeb9069 [MIPS] Fix asm constraints for 'ins' instructions.
The third operand to 'ins' must be a constant int, not a register.

[Ralf: The bug was actually intensional.  Some versions used to throw an
error under certain circumstances for code like:

static inline void f(unsigned nr, unsigned *p)
{
	unsigned short bit = nr & 5;

	if (__builtin_constant_p(bit)) {
		__asm__ __volatile__ ("  foo %0, %1" : "=m" (*p) : "i" (bit));
  	} else {
		/* Do something else. */
	}
}

because gcc was not able to figure out that the "i" constraint was possibly
at the early stage when the constraint are getting verified.  The solution
was using "ri" instead of "i".  The "ri" would keep gcc happy but in the
end for code generation always the "i" constraint would be satisfied.  The
problem afair originally appeared in the i386 io.h and also hit it's mips
equivalent.  From there the workaround spread to many of the inline
assembler functions.]

Signed-off-by: David Daney <ddaney@avtrex.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:30 +01:00
Maciej W. Rozycki
043ebd6c9d [MIPS] DECstation: Document more MB ASIC register bits
Document a few more register bits provided by the MB ASIC used on R4000SC
(KN04) and R4400SC (KN05) CPU daughtercards with the DECstation.  

 Reverse-engineered and not documented anywhere else to the best of my
knowledge.  Bit names appended to the last underscore the same as reported
by the firmware in register dumps.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:30 +01:00
Ralf Baechle
2957c9e61e [MIPS] IRIX: Goodbye and thanks for all the fish
Never terribly functional or popular, plagued by hard to fix bugs the time
to say goodbye has more than arrived.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:30 +01:00
Manuel Lauss
997288517e [MIPS] Alchemy: remove unused MMC macros from db1x00 header.
Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:30 +01:00
Dmitri Vorobiev
07cdb78436 [MIPS] fix sparse warning about setup_early_printk()
This patch fixes the following sparse warning:

<<<<<<<<

arch/mips/kernel/early_printk.c:35:13: warning: symbol 'setup_early_printk'
was not declared. Should it be static?

<<<<<<<<

The fix is to define a prototype of the setup_early_printk() function and
to include the appropriate header into arch/mips/kernel/early_printk.c.

[Ralf: Sorted includes again]

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:29 +01:00
Maciej W. Rozycki
c88a8b4ab0 [MIPS] Remove obsolete isa_slot_offset
The isa_slot_offset variable and its __ISA_IO_base macro is not used
anywhere anymore.  It does not look like a decent interface per today's
standards either.  Remove both including all places of initialization.

Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:29 +01:00
David Daney
cb11dfa024 [MIPS] Remove board_watchpoint_handler
It is not used anywhere in tree.

Signed-off-by: David Daney <ddaney@avtrex.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:29 +01:00
Chen, Huacai
2954c02a88 [MIPS] modify the MIPS CPU classfication
Signed-off-by: Huacai Chen <huacai.chen@intel.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2008-07-15 18:44:28 +01:00
Linus Torvalds
61d97f4fcf Merge branch 'genirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'genirq' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: remove extraneous checks in manage.c
  genirq: Expose default irq affinity mask (take 3)
2008-07-15 10:39:22 -07:00
Linus Torvalds
38c46578ff Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw:
  [GFS2] Fix GFS2's use of do_div() in its quota calculations
  [GFS2] Remove unused declaration
  [GFS2] Remove support for unused and pointless flag
  [GFS2] Replace rgrp "recent list" with mru list
  [GFS2] Allow local DF locks when holding a cached EX glock
  [GFS2] Fix delayed demote race
  [GFS2] don't call permission()
  [GFS2] Fix module building
  [GFS2] Glock documentation
  [GFS2] Remove all_list from lock_dlm
  [GFS2] Remove obsolete conversion deadlock avoidance code
  [GFS2] Remove remote lock dropping code
  [GFS2] kernel panic mounting volume
  [GFS2] Revise readpage locking
  [GFS2] Fix ordering of args for list_add
  [GFS2] trivial sparse lock annotations
  [GFS2] No lock_nolock
  [GFS2] Fix ordering bug in lock_dlm
  [GFS2] Clean up the glock core
2008-07-15 10:38:46 -07:00
Linus Torvalds
e7849f16c1 Merge branch 'core/topology' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core/topology' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  cputopology: always define CPU topology information, clean up
  cpu topology: always define CPU topology information
2008-07-15 10:32:39 -07:00
Linus Torvalds
1dc60c53d3 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: Fix compile error with CONFIG_AS_CFI=n
  Documentation: document debugpat commandline option
  x86: sanitize Kconfig
  x86, suspend, acpi: correct and add comments about Big Real Mode
  x86, suspend, acpi: enter Big Real Mode

Fixed trivial conflict in include/asm-x86/dwarf2.h due to just using
different names for "cfi_ignore" (vs "__cfi_ignore") macro.
2008-07-15 08:41:43 -07:00
Linus Torvalds
8d2567a620 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (61 commits)
  ext4: Documention update for new ordered mode and delayed allocation
  ext4: do not set extents feature from the kernel
  ext4: Don't allow nonextenst mount option for large filesystem
  ext4: Enable delalloc by default.
  ext4: delayed allocation i_blocks fix for stat
  ext4: fix delalloc i_disksize early update issue
  ext4: Handle page without buffers in ext4_*_writepage()
  ext4: Add ordered mode support for delalloc
  ext4: Invert lock ordering of page_lock and transaction start in delalloc
  mm: Add range_cont mode for writeback
  ext4: delayed allocation ENOSPC handling
  percpu_counter: new function percpu_counter_sum_and_set
  ext4: Add delayed allocation support in data=writeback mode
  vfs: add hooks for ext4's delayed allocation support
  jbd2: Remove data=ordered mode support using jbd buffer heads
  ext4: Use new framework for data=ordered mode in JBD2
  jbd2: Implement data=ordered mode handling via inodes
  vfs: export filemap_fdatawrite_range()
  ext4: Fix lock inversion in ext4_ext_truncate()
  ext4: Invert the locking order of page_lock and transaction start
  ...
2008-07-15 08:36:38 -07:00
Linus Torvalds
bcf559e385 Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon: fixup issue with radeon and PAT support.
2008-07-15 08:35:28 -07:00
Linus Torvalds
97c7d1ea1f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (52 commits)
  IB/mlx4: Use kzalloc() for new QPs so flags are initialized to 0
  mlx4_core: Use MOD_STAT_CFG command to get minimal page size
  RDMA/cma: Simplify locking needed for serialization of callbacks
  RDMA/addr: Keep pointer to netdevice in struct rdma_dev_addr
  RDMA/cxgb3: Fixes for zero STag
  RDMA/core: Add local DMA L_Key support
  IB/mthca: Fix check of max_send_sge for special QPs
  IB/mthca: Use round_jiffies() for catastrophic error polling timer
  IB/mthca: Remove "stop" flag for catastrophic error polling timer
  IPoIB: Double default RX/TX ring sizes
  IPoIB/cm: Reduce connected mode TX object size
  IB/ipath: Use IEEE OUI for vendor_id reported by ibv_query_device()
  IPoIB: Use dev_set_mtu() to change mtu
  IPoIB: Use rtnl lock/unlock when changing device flags
  IPoIB: Get rid of ipoib_mcast_detach() wrapper
  IPoIB: Only set Q_Key once: after joining broadcast group
  IPoIB: Remove priv->mcast_mutex
  IPoIB: Remove unused IPOIB_MCAST_STARTED code
  RDMA/cxgb3: Set rkey field for new memory windows in iwch_alloc_mw()
  RDMA/nes: Get rid of ring_doorbell parameter of nes_post_cqp_request()
  ...
2008-07-15 08:01:15 -07:00
Chandra Seetharaman
fe9233fb69 [SCSI] scsi_dh: fix kconfig related build errors
Do not automatically "select" SCSI_DH for dm-multipath. If SCSI_DH
doesn't exist,just do not allow  hardware handlers to be used.

Handle SCSI_DH being a module also. Make sure it doesn't allow DM_MULTIPATH
to be compiled in when SCSI_DH is a module.

[jejb: added comment for Kconfig syntax]
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-07-15 09:16:43 -05:00
Benzi Zbit
62a7573ee9 sdio: fix the use of hard coded timeout value.
This adds reading and using of enable_timeout from the CIS

Signed-off-by: Benzi Zbit <benzi.zbit@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 15:47:03 +02:00
Kevin Winchester
3f1c38723e x86: Fix compile error with CONFIG_AS_CFI=n
AS      arch/x86/lib/csum-copy_64.o
arch/x86/lib/csum-copy_64.S: Assembler messages:
arch/x86/lib/csum-copy_64.S:48: Error: Macro `ignore' was already defined
make[1]: *** [arch/x86/lib/csum-copy_64.o] Error 1
make: *** [arch/x86/lib] Error 2

It appears that csum-copy_64.S and dwarf2.h both define an ignore macro. 
I would expect one of them can be renamed quite easily, unless they 
are references elsewhere. 

Caused-by-commit: 392a0fc96b
    x86: merge dwarf2 headers

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-07-15 15:30:29 +02:00
Pierre Ossman
23af60398a mmc: remove multiwrite capability
Relax requirements on host controllers and only require that they do not
report a transfer count than is larger than the actual one (i.e. a lower
value is okay). This is how many other parts of the kernel behaves so
upper layers should already be prepared to handle that scenario. This
gives us a performance boost on MMC cards.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:49 +02:00
Haavard Skinnemoen
7d2be0749a atmel-mci: Driver for Atmel on-chip MMC controllers
This is a driver for the MMC controller on the AP7000 chips from
Atmel. It should in theory work on AT91 systems too with some
tweaking, but since the DMA interface is quite different, it's not
entirely clear if it's worth merging this with the at91_mci driver.

This driver has been around for a while in BSPs and kernel sources
provided by Atmel, but this particular version uses the generic DMA
Engine framework (with the slave extensions) instead of an
avr32-only DMA controller framework.

This driver can also use PIO transfers when no DMA channels are
available, and for transfers where using DMA may be difficult or
impractical for some reason (e.g. the DMA setup overhead is usually
not worth it for very short transfers, and badly aligned buffers or
lengths are difficult to handle.)

Currently, the driver only support PIO transfers. DMA support has been
split out to a separate patch to hopefully make it easier to review.

The driver has been tested using mmc-block and ext3fs on several SD,
SDHC and MMC+ cards. Reads and writes work fine, with read transfer
rates up to 3.5 MiB/s on fast cards with debugging disabled.

The driver has also been tested using the mmc_test module on the same
cards. All tests except 7, 9, 15 and 17 succeed. The first two are
unsupported by all the cards I have, so I don't know if the driver
handles this correctly. The last two fail because the hardware flags a
Data CRC Error instead of a Data Timeout error. I'm not sure how to deal
with that.

Documentation for this controller can be found in many data sheets from
Atmel, including the AT32AP7000 data sheet which can be found here:

http://www.atmel.com/dyn/products/datasheets.asp?family_id=682

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:49 +02:00
Tomas Winkler
6d37333163 mmc: fix sdio_io sparse errors
This patch fixes sdio_io sparse errors.
This fix changes signature of API functions,
changing
unsigned char -> u8
unsigned short -> u16
unsigned long -> u32 - this was probably a bug in 64 bit platforms

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:48 +02:00
Ben Dooks
50a845700b MMC: S3C24XX: Add media presence test to request handling.
Ensure that we have physical media present before attempting to
send a request to a card. This ensures that we do not get flooded
by errors from commands that can never be completed timing out.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:48 +02:00
Ben Dooks
cf0984c8ed MMC: S3C24XX: Add support to invert write protect line
Support for inverting the sense of the MMC driver's write
protect detection line.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:47 +02:00
Ben Dooks
edb5a98e43 MMC: S3C24XX: Add platform data for MMC/SD driver
This patch adds platform data support to the s3mci driver.  This allows
flexible board-specific configuration of set_power, card detect and read only
pins.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:47 +02:00
Thomas Kleffel
be518018c6 MMC: S3C24XX MMC/SD driver.
This is the latest S3C MMC/SD driver by Thomas Kleffel
with cleanups as suggested by AKPM done by Ben Dooks.

Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Thomas Kleffel <tk@maintech.de>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:46 +02:00
Pierre Ossman
ad3868b2ec mmc,sdio: helper function for transfer padding
There are a lot of crappy controllers out there that cannot handle
all the request sizes that the MMC/SD/SDIO specifications require.
In case the card driver can pad the data to overcome the problems,
this commit adds a helper that calculates how much that padding
should be.

A corresponding helper is also added for SDIO, but it can also deal
with all the complexities of splitting up a large transfer efficiently.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:44 +02:00
Manuel Lauss
c4223c2c91 au1xmmc: remove db1200 board code, rewrite probe.
Remove the DB1200 board-specific functions (card present, read-only,
activity LED methods) and instead add platform data which is passed
to the driver.  This also allows for platforms to implement other
carddetect schemes (e.g. dedicated irq) without having to pollute the
driver code.  The poll timer (used for pb1200) is kept for compatibility.

With the board-specific stuff gone, the driver's ->probe() code can be
cleaned up considerably.

Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:43 +02:00
Marc Pignat
c5a89c6c08 mmc: at91_mci: avoid timeouts
The at91 mci controller internal state machine seems to often crash. This can
be fixed by resetting the controller after each command for at91rm9200 and by
setting the MCI_BLKR register on at91sam926*.

Signed-off-by: Marc Pignat <marc.pignat@hevs.ch>
Signed-off-by: Hans J Koch <hjk@linutronix.de>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:42 +02:00
Anton Vorontsov
08f80bb519 mmc: change .get_ro() callback semantics
Now get_ro() callback must return 0/1 values for its logical states, and
negative errno values in case of error. If particular host instance doesn't
support RO/WP switch, it should return -ENOSYS.

This patch changes some hosts in two ways:

1. Now functions should be smart to not return negative values in
   "RO asserted" case (particularly gpio_ calls could return negative
   values for the outermost GPIOs).

   Also, board code usually passes get_ro() callbacks that directly return
   gpioreg & bit result, so at91_mci, imxmmc, pxamci and mmc_spi's get_ro()
   handlers need take special care when returning platform's values to the
   mmc core.

2. In case of host instance didn't implement get_ro() callback, it should
   really return -ENOSYS and let the mmc core decide what to do about it
   (mmc core thinks the same way as the hosts, so it isn't functional
   change).

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:41 +02:00
Anton Vorontsov
619ef4b421 mmc_spi: add support for card-detection polling
This patch adds new platform data variable "caps", so platforms
could pass theirs capabilities into MMC core (for example, platforms
without interrupt on the CD line will most probably want to pass
MMC_CAP_NEEDS_POLL).

New platform get_cd() callback provided to optimize polling.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:41 +02:00
Anton Vorontsov
28f52482b4 mmc: add support for card-detection polling
Some hosts (and boards that use mmc_spi) do not use interrupts on the CD
line, so they can't trigger mmc_detect_change. We want to poll the card
and see if there was a change. 1 second poll interval seems resonable.

This patch also implements .get_cd() host operation, that could be used
by the hosts that are able to report card-detect status without need to
talk MMC.

Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:41 +02:00
Adrian Bunk
150a55683b include/linux/mmc/mmc.h: remove CVS tags
This patch removes a CVS tag that wasn't updated for a long time.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:41 +02:00
Pierre Ossman
4489428ab5 sdhci: support JMicron secondary interface
JMicron chips sometimes have two interfaces to work around limitations
in Microsoft's sdhci driver. This patch allows us to use either interface.

Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
2008-07-15 14:14:40 +02:00
Michael Hennerich
1efc80b53e Blackfin arch: Functional power management support
Enable: PM_SUSPEND_MEM -> Blackfin Hibernate to SDRAM
This feature requires a special bootloader (u-boot)
supporting return from hibernate.

Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-07-19 16:57:32 +08:00
David S. Miller
e308a5d806 netdev: Add netdev->addr_list_lock protection.
Add netif_addr_{lock,unlock}{,_bh}() helpers.

Use them to protect operations that operate on or read
the network device unicast and multicast address lists.

Also use them in cases where the code simply wants to
block calls into the driver's ->set_rx_mode() and
->set_multicast_list() methods.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-15 00:13:44 -07:00
David S. Miller
f1f28aa351 netdev: Add addr_list_lock to struct net_device.
This will be used to protect the per-device unicast and multicast
address lists, as well as the callbacks into the drivers which
configure such state such as ->set_rx_mode() and ->set_multicast_list().

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-15 00:08:33 -07:00
Or Gerlitz
64c5e613b9 RDMA/addr: Keep pointer to netdevice in struct rdma_dev_addr
Keep a pointer to the local (src) netdevice in struct rdma_dev_addr,
and copy it in as part of rdma_copy_addr().  Use rdma_translate_ip()
in cma_new_conn_id() to reduce some code duplication and also make
sure the src_dev member gets set.

In a high-availability configuration the netdevice pointer can be used
by the RDMA CM to align RDMA sessions to use the same links as the IP
stack does under fail-over and route change cases.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14 23:48:53 -07:00
Steve Wise
96f15c0353 RDMA/core: Add local DMA L_Key support
- Change the IB_DEVICE_ZERO_STAG flag to the transport-neutral name
  IB_DEVICE_LOCAL_DMA_LKEY, which is used by iWARP RNICs to indicate 0
  STag support and IB HCAs to indicate reserved L_Key support.

- Add a u32 local_dma_lkey member to struct ib_device.  Drivers fill
  this in with the appropriate local DMA L_Key (if they support it).

- Fix up the drivers using this flag.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14 23:48:53 -07:00
Ron Livne
521e575b9a IB/mlx4: Add support for blocking multicast loopback packets
Add support for handling the IB_QP_CREATE_MULTICAST_BLOCK_LOOPBACK
flag by using the per-multicast group loopback blocking feature of
mlx4 hardware.

Signed-off-by: Ron Livne <ronli@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14 23:48:48 -07:00
Ron Livne
47ee1b9f2e IB/core: Add support for multicast loopback blocking
This patch also adds a creation flag for QPs,
IB_QP_CREATE_MULTICAST_BLOCK_LOOPBACK, which when set means that
multicast sends from the QP to a group that the QP is attached to will
not be looped back to the QP's receive queue.  This can be used to
save receive resources when a consumer does not want a local copy of
multicast traffic; for example IPoIB must waste CPU time throwing away
such local copies of multicast traffic.

This patch also adds a device capability flag that shows whether a
device supports this feature or not.

Signed-off-by: Ron Livne <ronli@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14 23:48:48 -07:00
Steve Wise
7f624d023b RDMA/core: Add iWARP protocol statistics attributes in sysfs
This patch adds a sysfs attribute group called "proto_stats" under
/sys/class/infiniband/$device/ and populates this group with protocol
statistics if they exist for a given device.  Currently, only iWARP
stats are defined, but the code is designed to allow InfiniBand
protocol stats if they become available.  These stats are per-device
and more importantly -not- per port.

Details:

- Add union rdma_protocol_stats in ib_verbs.h.  This union allows
  defining transport-specific stats.  Currently only iwarp stats are
  defined.

- Add struct iw_protocol_stats to define the current set of iwarp
  protocol stats.

- Add new ib_device method called get_proto_stats() to return protocol
  statistics.

- Add logic in core/sysfs.c to create iwarp protocol stats attributes
  if the device is an RNIC and has a get_proto_stats() method.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14 23:48:48 -07:00
Steve Wise
00f7ec36c9 RDMA/core: Add memory management extensions support
This patch adds support for the IB "base memory management extension"
(BMME) and the equivalent iWARP operations (which the iWARP verbs
mandates all devices must implement).  The new operations are:

 - Allocate an ib_mr for use in fast register work requests.

 - Allocate/free a physical buffer lists for use in fast register work
   requests.  This allows device drivers to allocate this memory as
   needed for use in posting send requests (eg via dma_alloc_coherent).

 - New send queue work requests:
   * send with remote invalidate
   * fast register memory region
   * local invalidate memory region
   * RDMA read with invalidate local memory region (iWARP only)

Consumer interface details:

 - A new device capability flag IB_DEVICE_MEM_MGT_EXTENSIONS is added
   to indicate device support for these features.

 - New send work request opcodes IB_WR_FAST_REG_MR, IB_WR_LOCAL_INV,
   IB_WR_RDMA_READ_WITH_INV are added.

 - A new consumer API function, ib_alloc_mr() is added to allocate
   fast register memory regions.

 - New consumer API functions, ib_alloc_fast_reg_page_list() and
   ib_free_fast_reg_page_list() are added to allocate and free
   device-specific memory for fast registration page lists.

 - A new consumer API function, ib_update_fast_reg_key(), is added to
   allow the key portion of the R_Key and L_Key of a fast registration
   MR to be updated.  Consumers call this if desired before posting
   a IB_WR_FAST_REG_MR work request.

Consumers can use this as follows:

 - MR is allocated with ib_alloc_mr().

 - Page list memory is allocated with ib_alloc_fast_reg_page_list().

 - MR R_Key/L_Key "key" field is updated with ib_update_fast_reg_key().

 - MR made VALID and bound to a specific page list via
   ib_post_send(IB_WR_FAST_REG_MR)

 - MR made INVALID via ib_post_send(IB_WR_LOCAL_INV),
   ib_post_send(IB_WR_RDMA_READ_WITH_INV) or an incoming send with
   invalidate operation.

 - MR is deallocated with ib_dereg_mr()

 - page lists dealloced via ib_free_fast_reg_page_list().

Applications can allocate a fast register MR once, and then can
repeatedly bind the MR to different physical block lists (PBLs) via
posting work requests to a send queue (SQ).  For each outstanding
MR-to-PBL binding in the SQ pipe, a fast_reg_page_list needs to be
allocated (the fast_reg_page_list is owned by the low-level driver
from the consumer posting a work request until the request completes).
Thus pipelining can be achieved while still allowing device-specific
page_list processing.

The 32-bit fast register memory key/STag is composed of a 24-bit index
and an 8-bit key.  The application can change the key each time it
fast registers thus allowing more control over the peer's use of the
key/STag (ie it can effectively be changed each time the rkey is
rebound to a page list).

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14 23:48:45 -07:00
Dotan Barak
4deccd6d95 RDMA: Improve include file coding style
Remove subversion $Id lines and improve readability by fixing other
coding style problems pointed out by checkpatch.pl.

Signed-off-by: Dotan Barak <dotanba@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14 23:48:44 -07:00
Sean Hefty
a947491709 RDMA: Fix license text
The license text for several files references a third software license
that was inadvertently copied in.  Update the license to what was
intended.  This update was based on a request from HP.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-14 23:48:43 -07:00
Pavel Emelyanov
f66ac03d49 mib: add struct net to ICMPMSGIN_INC_STATS_BH
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 23:05:31 -07:00
Pavel Emelyanov
903fc1964e mib: add struct net to ICMPMSGOUT_INC_STATS
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 23:05:30 -07:00
Pavel Emelyanov
dcfc23cac1 mib: add struct net to ICMP_INC_STATS_BH
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 23:05:29 -07:00
Pavel Emelyanov
75c939bb4d mib: add struct net to ICMP_INC_STATS
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 23:05:28 -07:00
Pavel Emelyanov
43589aa93c icmp: drop unused MIB accounting wrappers
There are ICMP_XXX_STATS that are not used in the kernel, so I remove
them, not to "just patch" them later. But if there's some sense in
keeping them, kick me - I will remake this set keeping them.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 23:05:26 -07:00
Pavel Emelyanov
0388b00426 icmp: add struct net argument to icmp_out_count
This routine deals with ICMP statistics, but doesn't have a
struct net at hands, so add one.

Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 23:05:13 -07:00
Patrick McHardy
393e52e33c packet: deliver VLAN TCI to userspace
Store the VLAN tag in the auxillary data/tpacket2_hdr so userspace can
properly deal with hardware VLAN tagging/stripping.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 22:50:39 -07:00
Patrick McHardy
bbd6ef87c5 packet: support extensible, 64 bit clean mmaped ring structure
The tpacket_hdr is not 64 bit clean due to use of an unsigned long
and can't be extended because the following struct sockaddr_ll needs
to be at a fixed offset.

Add support for a version 2 tpacket protocol that removes these
limitations.

Userspace can query the header size through a new getsockopt option
and change the protocol version through a setsockopt option. The
changes needed to switch to the new protocol version are:

1. replace struct tpacket_hdr by struct tpacket2_hdr
2. query header len and save
3. set protocol version to 2
 - set up ring as usual
4. for getting the sockaddr_ll, use (void *)hdr + TPACKET_ALIGN(hdrlen)
   instead of (void *)hdr + TPACKET_ALIGN(sizeof(struct tpacket_hdr))

Steps 2 and 4 can be omitted if the struct sockaddr_ll isn't needed.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 22:50:15 -07:00
Patrick McHardy
bc1d0411b8 vlan: deliver packets received with VLAN acceleration to network taps
When VLAN header stripping is used, packets currently bypass packet
sockets (and other network taps) completely. For locally existing
VLANs, they appear directly on the VLAN device, for unknown VLANs
they are silently dropped.

Add a new function netif_nit_deliver() to deliver incoming packets
to all network interface taps and use it in __vlan_hwaccel_rx() to
make VLAN packets visible on the underlying device.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 22:49:30 -07:00
Patrick McHardy
6aa895b047 vlan: Don't store VLAN tag in cb
Use a real skb member to store the skb to avoid clashes with qdiscs,
which are allowed to use the cb area themselves. As currently only real
devices that consume the skb set the NETIF_F_HW_VLAN_TX flag, no explicit
invalidation is neccessary.

The new member fills a hole on 64 bit, the skb layout changes from:

        __u32                      mark;                 /*   172     4 */
        sk_buff_data_t             transport_header;     /*   176     4 */
        sk_buff_data_t             network_header;       /*   180     4 */
        sk_buff_data_t             mac_header;           /*   184     4 */
        sk_buff_data_t             tail;                 /*   188     4 */
        /* --- cacheline 3 boundary (192 bytes) --- */
        sk_buff_data_t             end;                  /*   192     4 */

        /* XXX 4 bytes hole, try to pack */

to

        __u32                      mark;                 /*   172     4 */
        __u16                      vlan_tci;             /*   176     2 */

        /* XXX 2 bytes hole, try to pack */

        sk_buff_data_t             transport_header;     /*   180     4 */
        sk_buff_data_t             network_header;       /*   184     4 */

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 22:49:06 -07:00
Dave Airlie
242e3df80b drm/radeon: fixup issue with radeon and PAT support.
With new userspace libpciaccess we can get a conflicting mapping
on the PCIE GART table in the video RAM. Always try and map it _wc.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2008-07-15 15:48:05 +10:00
Benjamin Herrenschmidt
43d2548bb2 Merge commit '85082fd7cbe3173198aac0eb5e85ab1edcc6352c' into test-build
Manual fixup of:

	arch/powerpc/Kconfig
2008-07-15 15:44:51 +10:00
Kumar Gala
585583d95c powerpc: Fix pte_update for CONFIG_PTE_64BIT and !PTE_ATOMIC_UPDATES
Because the pte is now 64-bits the compiler was optimizing the update
to always clear the upper 32-bits of the pte.  We need to ensure the
clr mask is treated as an unsigned long long to get the proper behavior.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2008-07-15 15:44:02 +10:00
Allan Stephens
0ea522416b tipc: Remove unneeded parameter to tipc_createport_raw()
This patch eliminates an unneeded parameter when creating a low-level
TIPC port object.  Instead of returning both the pointer to the port
structure and the port's reference ID, it now returns only the pointer
since the port structure contains the reference ID as one of its fields.

Signed-off-by: Allan Stephens <allan.stephens@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 22:42:19 -07:00
David S. Miller
925068dcdc Merge branch 'davem-next' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2008-07-14 22:30:17 -07:00
Max Krasnyansky
f271b2cc78 tun: Fix/rewrite packet filtering logic
Please see the following thread to get some context on this
	http://marc.info/?l=linux-netdev&m=121564433018903&w=2

Basically the issue is that current multi-cast filtering stuff in
the TUN/TAP driver is seriously broken.
Original patch went in without proper review and ACK. It was broken and
confusing to start with and subsequent patches broke it completely.
To give you an idea of what's broken here are some of the issues:

- Very confusing comments throughout the code that imply that the
character device is a network interface in its own right, and that packets
are passed between the two nics. Which is completely wrong.

- Wrong set of ioctls is used for setting up filters. They look like
shortcuts for manipulating state of the tun/tap network interface but
in reality manipulate the state of the TX filter.

- ioctls that were originally used for setting address of the the TX filter
got "fixed" and now set the address of the network interface itself. Which
made filter totaly useless.

- Filtering is done too late. Instead of filtering early on, to avoid
unnecessary wakeups, filtering is done in the read() call.

The list goes on and on :)

So the patch cleans all that up. It introduces simple and clean interface for
setting up TX filters (TUNSETTXFILTER + tun_filter spec) and does filtering
before enqueuing the packets.

TX filtering is useful in the scenarios where TAP is part of a bridge, in
which case it gets all broadcast, multicast and potentially other packets when
the bridge is learning. So for example Ethernet tunnelling app may want to
setup TX filters to avoid tunnelling multicast traffic. QEMU and other
hypervisors can push RX filtering that is currently done in the guest into the
host context therefore saving wakeups and unnecessary data transfer.

Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 22:18:19 -07:00
Bryan Wu
1c0d20cd29 Blackfin arch: add TXDWA definition to enable new feature
Signed-off-by: Bryan Wu <cooloney@kernel.org>
2008-07-15 12:08:50 +08:00
Roland Dreier
4d3702b62e x86: Rename "ignore" macro in <asm/dwarf2.h> to avoid collision
Commit 70f1bba4 ("x86: use ignore macro instead of hash comment") breaks
the 64-bit x86 build on toolchains that have CONFIG_AS_CFI undefined with:

    arch/x86/lib/csum-copy_64.S:48: Error: Macro `ignore' was already defined

because <asm/dwarf2.h> now uses the ignore macro name itself.  Fix this
by changing to __cfi_ignore in dwarf2.h.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-14 20:40:39 -07:00
David S. Miller
fc943b12e4 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2008-07-14 20:40:34 -07:00
Patrick McHardy
72d9794f44 net-sched: cls_flow: add perturbation support
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-14 20:36:32 -07:00
David S. Miller
0c4c8cae44 Merge branch 'master' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6 2008-07-14 20:32:07 -07:00
David S. Miller
2aec609fb4 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:

	net/netfilter/nf_conntrack_proto_tcp.c
2008-07-14 20:23:54 -07:00