Commit Graph

780 Commits

Author SHA1 Message Date
David S. Miller
afd69ed142 [SPARC64]: Do not flood log with failed DS messages.
When booting up a control node it's quite common to
not be able to register several service types.

And likewise on guests at least one or two are going
to not be there.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-20 17:14:38 -07:00
David S. Miller
5fc986100c [SPARC64]: Handle multiple domain-services-port nodes properly.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-20 17:14:23 -07:00
David S. Miller
58fb666643 [SPARC64]: Improve VIO device naming further.
The best scheme to get uniqueness seems to be:

FOO			-- If node lacks "id" property
FOO-$(ID)		-- If node has "id" but parent lacks "cfg-handle"
FOO-$(ID)-$(CFG_HANDLE) -- If node has both

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-20 17:14:13 -07:00
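The naming scheme in 58fb666643 can be sketched in a few lines of C. This is a minimal illustration, not the kernel's implementation: the helper name `vio_format_name`, the use of NULL pointers to mean "property absent", and the decimal formatting of the values are all assumptions made for the example.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not taken from the kernel): build a VIO device
 * name following the three cases listed in the commit message.
 * id and cfg_handle point at the property values, or are NULL when the
 * property is absent. Decimal formatting is an assumption.
 */
int vio_format_name(char *buf, size_t len, const char *base,
                    const unsigned long *id,
                    const unsigned long *cfg_handle)
{
    if (!id)            /* node lacks "id" */
        return snprintf(buf, len, "%s", base);

    if (!cfg_handle)    /* node has "id", parent lacks "cfg-handle" */
        return snprintf(buf, len, "%s-%lu", base, *id);

    /* node has both */
    return snprintf(buf, len, "%s-%lu-%lu", base, *id, *cfg_handle);
}
```

Passing the properties as optional pointers keeps the three cases explicit, which mirrors the table in the commit message.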
David S. Miller
3d6e470236 [SPARC]: Make sure dev_archdata is filled in for all devices.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-20 17:13:42 -07:00
David S. Miller
c73fcc846c [SPARC]: Fix serial console device detection.
The current scheme works on static interpretation of text names, which
is wrong.

The output-device setting, for example, must be resolved via an alias
or similar to a full path name to the console device.

Paths also contain an optional set of 'options', which starts with a
colon at the end of the path.  The option area is used to specify
which of two serial ports ('a' or 'b') the path refers to when a
device node drives multiple ports.  'a' is assumed if the option
specification is missing.

This was caught by the UltraSPARC-T1 simulator.  The 'output-device'
property was set to 'ttya' and we didn't pick up on the fact that this
is an OBP alias set to '/virtual-devices/console'.  Instead we saw it
as the first serial console device, instead of the hypervisor console.

The infrastructure is now there to take advantage of this to resolve
the console correctly even in multi-head situations in fbcon too.

Thanks to Greg Onufer for the bug report.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-20 16:59:26 -07:00
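For c73fcc846c, the option handling described in the message can be sketched as follows. This is an illustration only, not the kernel's code: the function name `console_port_letter` is made up, and treating every option other than 'b' as port 'a' is an assumption.

```c
#include <string.h>

/*
 * Illustrative sketch: return the serial port letter selected by the
 * option area of a fully resolved device path. The option area starts
 * at a colon in the last path component; 'a' is assumed when it is
 * missing. Anything other than 'b' also maps to 'a' (an assumption).
 */
char console_port_letter(const char *path)
{
    const char *last = strrchr(path, '/');              /* last component */
    const char *colon = strchr(last ? last : path, ':');

    if (colon && colon[1] == 'b')
        return 'b';
    return 'a';
}
```

Searching for the colon only in the last component avoids misreading a colon that belongs to an earlier part of the path.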
Linus Torvalds
2cb7e71422 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/ofcons
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/ofcons:
  Create drivers/of/platform.c
  Create linux/of_platorm.h
  [SPARC/64] Rename some functions like PowerPC
  Begin consolidation of of_device.h
  Begin to consolidate of_device.c
  Consolidate of_find_node_by routines
  Consolidate of_get_next_child
  Consolidate of_get_parent
  Consolidate of_find_property
  Consolidate of_device_is_compatible
  Start split out of common open firmware code
  Split out common parts of prom.h
2007-07-20 09:18:08 -07:00
David S. Miller
78d0012539 [SPARC64]: Fix two year old bug in early bootup asm.
We try to fetch the CIF entry pointer from %o4, but that
can get clobbered by the early OBP calls.  It is saved
in %l7 already, so actually this "mov %o4, %l7" can just
be completely removed with no other changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-19 21:50:09 -07:00
Fabio Massimo Di Nitto
74121b699c [SPARC64]: Fix log message type in vio_create_one().
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-19 21:28:53 -07:00
David S. Miller
5f7426c0e1 [SPARC64]: Tweak assertions in sun4v_build_virq().
They are too strict.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-19 21:28:43 -07:00
David S. Miller
2a26302164 [SPARC64]: Tweak kernel log messages in power_probe().
Use KERN_INFO, add missing newline, etc.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-19 21:27:39 -07:00
David S. Miller
91ba3c2128 [SPARC64]: Fix handling of multiple vdc-port nodes.
The "id" properties in vdc-port nodes are not unique, they
are all zero.  Therefore assign IDs using the parent's
"cfg-handle" property which will be unique.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-19 21:27:18 -07:00
Fabio Massimo Di Nitto
48db7b7c50 [SPARC64]: Fix device type matching in VIO's devspec_show().
With the recent renames, we forgot to update the matches for
devspec. This is required to keep udev working and autoload modules.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-19 21:27:10 -07:00
David S. Miller
bc5a2e64a1 [SPARC]: Add sys_fallocate() entries.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-19 21:26:47 -07:00
David S. Miller
a376178011 [SPARC64]: Use orderly_poweroff().
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-19 21:26:42 -07:00
Stephen Rothwell
3f23de10f2 Create drivers/of/platform.c
and populate it with the common parts from PowerPC and Sparc[64].

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
2007-07-20 14:25:51 +10:00
Stephen Rothwell
37b7754aab [SPARC/64] Rename some functions like PowerPC
This is to make the of merge easier.  Also rename of_bus_type.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: David S. Miller <davem@davemloft.net>
2007-07-20 14:24:53 +10:00
Stephen Rothwell
f85ff3056c Begin to consolidate of_device.c
This moves all the common parts for the Sparc, Sparc64 and PowerPC
of_device.c files into drivers/of/device.c.

Apart from the simple move, Sparc gains of_match_node() and a call to
of_node_put in of_release_dev().  PowerPC gains better recovery if
device_create_file() fails in of_device_register().

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
2007-07-20 13:39:59 +10:00
Stephen Rothwell
1ef4d4242d Consolidate of_find_node_by routines
This consolidates the routines of_find_node_by_path, of_find_node_by_name,
of_find_node_by_type and of_find_compatible_device.  Again, the comparison
of strings is done differently by Sparc and PowerPC and also these add
read_locks around the iterations.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
2007-07-20 13:39:06 +10:00
Stephen Rothwell
d1cd355a5e Consolidate of_get_next_child
This adds a read_lock around the child/next accesses on Sparc.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
2007-07-20 13:34:26 +10:00
Stephen Rothwell
e679c5f445 Consolidate of_get_parent
This requires creating dummy of_node_{get,put} routines for sparc and
sparc64.  It also adds a read_lock around the parent accesses.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
2007-07-20 13:32:58 +10:00
Stephen Rothwell
581b605a83 Consolidate of_find_property
The only change here is that a readlock is taken while the property list
is being traversed on Sparc where it was not taken previously.

Also, Sparc uses strcasecmp to compare property names while PowerPC
uses strcmp.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
2007-07-20 13:32:24 +10:00
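The difference noted in 581b605a83 (strcasecmp on Sparc, strcmp on PowerPC) is easy to see with a toy property list. The sketch below is not the consolidated kernel routine: `struct property` here is a stripped-down stand-in, `find_property` and `name_cmp_fn` are hypothetical names, and the read lock mentioned in the commit is left out.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>    /* strcasecmp (POSIX) */

/* Simplified stand-in for a device tree property list entry. */
struct property {
    const char *name;
    struct property *next;
};

/* Comparison routine; strcmp and strcasecmp both fit this signature. */
typedef int (*name_cmp_fn)(const char *, const char *);

/*
 * Walk the list and return the first property whose name matches
 * according to cmp, or NULL. Locking is omitted in this sketch.
 */
struct property *find_property(struct property *head, const char *name,
                               name_cmp_fn cmp)
{
    struct property *pp;

    for (pp = head; pp; pp = pp->next)
        if (cmp(pp->name, name) == 0)
            return pp;
    return NULL;
}
```

With a property named "ID", a lookup for "id" succeeds when `strcasecmp` is passed and fails when `strcmp` is passed, which is the behavioral gap the shared code has to account for.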
Stephen Rothwell
0081cbc373 Consolidate of_device_is_compatible
The only difference here is that Sparc uses strncmp to match compatibility
names while PowerPC uses strncasecmp.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
2007-07-20 13:29:51 +10:00
Stephen Rothwell
97e873e5c8 Start split out of common open firmware code
This creates drivers/of/base.c (depending on CONFIG_OF) and puts
the first trivially common bits from the prom.c files into it.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Acked-by: David S. Miller <davem@davemloft.net>
2007-07-20 13:28:41 +10:00
Fenghua Yu
5fb7dc37dc define new percpu interface for shared data
per cpu data section contains two types of data.  One set which is
exclusively accessed by the local cpu and the other set which is per cpu,
but also shared by remote cpus.  In the current kernel, these two sets are
not clearly separated out.  This can potentially cause the same data
cacheline shared between the two sets of data, which will result in
unnecessary bouncing of the cacheline between cpus.

One way to fix the problem is to cacheline align the remotely accessed per
cpu data, both at the beginning and at the end.  Because of the padding at
both ends, this will likely cause some memory wastage and also the
interface to achieve this is not clean.

This patch:

Moves the remotely accessed per cpu data (which is currently marked
as ____cacheline_aligned_in_smp) into a different section, where all the data
elements are cacheline aligned. And as such, this differentiates the local
only data and remotely accessed data cleanly.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: <linux-arch@vger.kernel.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 10:04:44 -07:00
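The idea behind 5fb7dc37dc, keeping remotely accessed per-cpu data on its own cachelines, can be shown in user-space C11. This is only an analogy under stated assumptions: the 64-byte `CACHELINE_SIZE` and both struct names are invented for the example, and the real patch works with linker sections rather than struct alignment.

```c
#include <stdalign.h>
#include <stddef.h>

#define CACHELINE_SIZE 64   /* assumed line size, for illustration only */

/* Data touched only by the owning cpu: packed normally. */
struct cpu_local_data {
    long local_counter;
    int  local_flags;
};

/*
 * Data that other cpus also touch: aligning the first member to a
 * cacheline makes the whole struct start on a line boundary and pads
 * its size to a multiple of the line, so neighbors never share a line.
 */
struct cpu_shared_data {
    alignas(CACHELINE_SIZE) long remote_counter;
};
```

Because the alignment applies to every element of the shared type, an array of `struct cpu_shared_data` places each entry on its own line, while the local data pays no padding cost.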
David S. Miller
a5f8967e17 [SPARC64]: Set vio->desc_buf to NULL after freeing.
Otherwise we trigger assertions on the next link-up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-18 01:20:26 -07:00
David S. Miller
a4cd184503 [SPARC64]: Handle reset events in vio_link_state_change().
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-18 01:20:13 -07:00
David S. Miller
8a2950cce6 [SPARC64]: Handle LDC resets properly in domain-services driver.
Reset the handshake and per-capability state so that when the
link comes back up we'll renegotiate the DS version and then
reregister all of the services.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-18 01:20:09 -07:00
David S. Miller
6160f63518 [SPARC64]: Massively simplify VIO device layer and support hot add/remove.
Create and destroy VIO devices in response to MD update events.  These
run synchronously inside of the MD update mutex so the VIO layer
doesn't need to do internal locking of any sort.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-18 01:20:04 -07:00
David S. Miller
920c3ed741 [SPARC64]: Add basic infrastructure for MD add/remove notification.
And add dummy handlers for the VIO device layer.  These will be filled
in with real code after the vdc, vnet, and ds drivers are reworked to
have simpler dependencies on the VIO device tree.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-18 01:19:51 -07:00
Oleg Nesterov
62715ec832 [SPARC64]: Kill bogus set_fs(KERNEL_DS) in do_rt_sigreturn().
From: Oleg Nesterov <oleg@tv-sign.ru>

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-17 14:37:54 -07:00
David S. Miller
41120551fa [SPARC64]: Kill explicit %gl register reference.
Older binutils can't handle it.  Use SET_GL() instead,
which is explicitly for this purpose.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-17 12:18:15 -07:00
Pavel Emelianov
bcdcd8e725 Report that kernel is tainted if there was an OOPS
If the kernel OOPSed or BUGed then it probably should be considered as
tainted.  Thus, all subsequent OOPSes and SysRq dumps will report the
tainted kernel.  This saves a lot of time explaining oddities in the
calltraces.

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[ Added parisc patch from Matthew Wilson  -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-17 10:23:02 -07:00
David S. Miller
778feeb475 [SPARC64]: Fix race between MD update and dr-cpu add.
We need to make sure the MD update occurs before we try to
process dr-cpu configure requests.  MD update and dr-cpu
were being processed by separate threads so that did not
happen occasionally.

Fix this by executing all domain services data packets from
a single thread, in order.

This will help simplify some other things as well.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 17:11:59 -07:00
Fabio Massimo Di Nitto
3ac66e33ea [SPARC64]: SMP build fix.
The UP build fix had some unintended consequences.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 17:11:58 -07:00
David S. Miller
d54bc2793e [SPARC64]: Fix UP build.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:41:51 -07:00
David S. Miller
e0204409df [SPARC64]: dr-cpu unconfigure support.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:32 -07:00
David S. Miller
9918cc2e32 [SPARC64]: Give more accurate errors in dr_cpu_configure().
When cpu_up() fails, we can discern the most likely cause.

If cpu_present() is false, this means the cpu did not appear
in the MD.  If -ENODEV is the error return value, then
the processor did not boot properly into the kernel.

Pass this information back in the dr-cpu response packet.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:24 -07:00
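The error classification described in 9918cc2e32 can be sketched as a small pure function. The enum and function names below are hypothetical, as is the order of the checks; the sketch only encodes the two rules stated in the message and is not the code from ds.c.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical result codes for a dr-cpu configure response. */
enum dr_cpu_result {
    DR_CPU_OK,
    DR_CPU_NOT_IN_MD,     /* cpu did not appear in the MD */
    DR_CPU_BOOT_FAILED,   /* cpu did not boot properly into the kernel */
    DR_CPU_OTHER_ERROR,
};

/*
 * err is the return value of the bring-up call (0 on success, negative
 * errno on failure); present says whether the cpu is marked present.
 */
enum dr_cpu_result classify_cpu_up(int err, bool present)
{
    if (err == 0)
        return DR_CPU_OK;
    if (!present)
        return DR_CPU_NOT_IN_MD;
    if (err == -ENODEV)
        return DR_CPU_BOOT_FAILED;
    return DR_CPU_OTHER_ERROR;
}
```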
David S. Miller
39dd992aee [SPARC64]: Clear cpu_{core,sibling}_map[] in smp_fill_in_sib_core_maps()
When we hot-plug in new cpus, the core_id and proc_id of existing
cpus can change.  So in order to set the cpu groups correctly we
need to clear the maps out completely first.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:19 -07:00
David S. Miller
b37d40d175 [SPARC64]: Fix leak when DR added cpu does not bootup.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:15 -07:00
David S. Miller
b53bcb6799 [SPARC64]: Add ->set_affinity IRQ handlers.
dr-cpu unconfigure requests will walk through the enabled
IRQs and trigger ->set_affinity so that the going-down
cpu no longer has INOs targeted to it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:11 -07:00
David S. Miller
bd0e11ff22 [SPARC64]: Process dr-cpu events in a kthread instead of workqueue.
This will be necessary to handle unconfigure requests
properly.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:07 -07:00
David S. Miller
8b99cfb8cc [SPARC64]: More sensible udelay implementation.
Take a page from the powerpc folks and just calculate the
delay factor directly.

Since frequency scaling chips use a system-tick register,
the value is going to be the same system-wide.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:05:02 -07:00
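As a rough illustration of computing a delay directly from a tick frequency (the idea behind 8b99cfb8cc), the conversion below turns microseconds into tick counts. It is a sketch under assumptions, not the sparc64 udelay code: the function name `usecs_to_ticks`, the rounding-up policy, and the use of plain 64-bit arithmetic are all choices made for the example.

```c
#include <stdint.h>

/*
 * Convert a delay in microseconds into a number of timer ticks for a
 * counter running at tick_hz. Rounds up so the wait is never shorter
 * than requested. The multiplication is assumed not to overflow for
 * the small delays udelay-style helpers are meant for.
 */
uint64_t usecs_to_ticks(uint64_t usecs, uint64_t tick_hz)
{
    return (usecs * tick_hz + 999999) / 1000000;
}
```

A caller would then spin until the tick counter has advanced by the returned amount; if the tick rate is the same on every cpu, as the commit message argues, the conversion does not need per-cpu calibration.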
David S. Miller
27a2ef382c [SPARC64]: SMP build fixes.
With the move of ldom_startcpu_cpuid() into smp.c some other
things need to follow along:

1) smp.c is not a driver so we can't use "PFX" macro in the
   printk calls.

2) smp.c now needs asm/io.h and asm/hvtramp.h, ds.c no longer
   does

3) kimage_addr_to_ra() also needs to move into smp.c

While we're here, update copyright info and my email address
in smp.c

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:58 -07:00
David S. Miller
8f3fff2050 [SPARC64]: mdesc.c needs linux/mm.h
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:53 -07:00
David S. Miller
b14f5c100c [SPARC64]: Fix build regressions added by dr-cpu changes.
Do not select HOTPLUG_CPU from SUN_LDOMS, that causes
HOTPLUG_CPU to be selected even on non-SMP which is
illegal.

Only build hvtramp.o when SMP, just like trampoline.o

Protect dr-cpu code in ds.c with HOTPLUG_CPU.

Likewise move ldom_startcpu_cpuid() to smp.c and protect
it and the call site with SUN_LDOMS && HOTPLUG_CPU.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:49 -07:00
David S. Miller
f8be339c02 [SPARC64]: Unconditionally register vio_bus_type.
The VIO drivers register themselves unconditionally just
like those of any other bus type, so to avoid crashes
on non-VIO systems we need to always register vio_bus_type.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:45 -07:00
David S. Miller
4f0234f4f9 [SPARC64]: Initial LDOM cpu hotplug support.
Only adding cpus is supported at the moment, removal
will come next.

When new cpus are configured, the machine description is
updated.  When we get the configure request we pass in a
cpu mask of to-be-added cpus to the mdesc CPU node parser
so it only fetches information for those cpus.  That code
also proceeds to update the SMT/multi-core scheduling bitmaps.

cpu_up() does all the work and we return the status back
over the DS channel.

CPUs via dr-cpu need to be booted straight out of the
hypervisor, and this requires:

1) A new trampoline mechanism.  CPUs are booted straight
   out of the hypervisor with MMU disabled and running in
   physical addresses with no mappings installed in the TLB.

   The new hvtramp.S code sets up the critical cpu state,
   installs the locked TLB mappings for the kernel, and
   turns the MMU on.  It then proceeds to follow the logic
   of the existing trampoline.S SMP cpu bringup code.

2) All calls into OBP have to be disallowed when domaining
   is enabled.  Since cpus boot straight into the kernel from
   the hypervisor, OBP has no state about that cpu and therefore
   cannot handle being invoked on that cpu.

   Luckily it's only a handful of interfaces which can be called
   after the OBP device tree is obtained.  For example, rebooting,
   halting, powering-off, and setting options node variables.

CPU removal support will require some infrastructure changes
here.  Namely we'll have to process the requests via a true
kernel thread instead of in a workqueue.  workqueues run on
a per-cpu thread, but when unconfiguring we might need to
force the thread to execute on another cpu if the current cpu
is the one being removed.  Removal of a cpu also causes the kernel
to destroy that cpu's workqueue running thread.

Another issue on removal is that we may have interrupts still
pointing to the cpu-to-be-removed.  So new code will be needed
to walk the active INO list and retarget those cpus as-needed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:40 -07:00
David S. Miller
b3e13fbeb9 [SPARC64]: Fix setting of variables in LDOM guest.
There is a special domain services capability for setting
variables in the OBP options node.  Guests don't have permanent
store for the OBP variables like a normal system, so they are
instead maintained in the LDOM control node or in the SC.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:36 -07:00
David S. Miller
83292e0a9c [SPARC64]: Fix MD property lifetime bugs.
Property values cannot be referenced outside of
mdesc_grab()/mdesc_release() pairs.  The only major
offender was the VIO bus layer, easily fixed.

Add some commentary to mdesc.h describing these rules.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:33 -07:00
David S. Miller
43fdf27470 [SPARC64]: Abstract out mdesc accesses for better MD update handling.
Since we have to be able to handle MD updates, having an in-tree
set of data structures representing the MD objects actually makes
things more painful.

The MD itself is easy to parse, and we can implement the existing
interfaces using direct parsing of the MD binary image.

The MD is now reference counted, so accesses have to now take the
form:

	handle = mdesc_grab();

	... operations on MD ...

	mdesc_release(handle);

The only remaining issues are cases where code holds on to references
to MD property values.  mdesc_get_property() returns a direct pointer
to the property value, most cases just pull in the information they
need and discard the pointer, but there are few that use the pointer
directly over a long lifetime.  Those will be fixed up in a subsequent
changeset.

A preliminary handler for MD update events from domain services is
there, it is rudimentary but it works and handles all of the reference
counting.  It does not check the generation number of the MDs,
and it does not generate a "add/delete" list for notification to
interesting parties about MD changes but that will be forthcoming.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:28 -07:00
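The grab/release pattern from 43fdf27470 is ordinary reference counting. The toy version below shows why an MD update cannot free the old image while someone still holds a handle. It is a single-threaded sketch with invented names (`md_handle`, `md_grab`, `md_release`, `md_update`); the real code needs locking and carries the MD binary image, both omitted here.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted machine description. */
struct md_handle {
    int refcnt;
    int generation;     /* illustrative payload */
};

static struct md_handle *cur_md;    /* currently published MD */

/* Take a reference on the current MD (NULL if none is published). */
struct md_handle *md_grab(void)
{
    if (cur_md)
        cur_md->refcnt++;
    return cur_md;
}

/* Drop a reference; the last release frees the MD. */
void md_release(struct md_handle *hp)
{
    if (hp && --hp->refcnt == 0)
        free(hp);
}

/* Publish a new MD. The old one lives on until its users release it. */
int md_update(int generation)
{
    struct md_handle *new_md = malloc(sizeof(*new_md));
    struct md_handle *old = cur_md;

    if (!new_md)
        return -1;
    new_md->refcnt = 1;             /* reference held by cur_md */
    new_md->generation = generation;
    cur_md = new_md;
    md_release(old);                /* drop the publisher's reference */
    return 0;
}
```

A pointer obtained while holding a handle is only safe until the matching release, which is the lifetime rule the follow-up changeset enforces for property values.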
David S. Miller
133f09a169 [SPARC64]: Use more meaningful names for IRQ registry.
All of the interrupts say "LDX RX" and "LDX TX" currently
which is next to useless.  Put a device specific prefix
before "RX" and "TX" instead which makes it much more
useful.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:24 -07:00
David S. Miller
e450992d13 [SPARC64]: Initial domain-services driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:20 -07:00
David S. Miller
13077d8028 [SPARC64]: Export powerd facilities for external entities.
Besides the existing usage for power-button interrupts, we'll
want to make use of this code for domain-services where the
LDOM manager can send reboot requests to the guest node.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:16 -07:00
David S. Miller
2c4f4ecb7a [SPARC64]: Add domain-services nodes to VIO device tree.
They sit under the root of the MD tree unlike the rest of
the LDC channel based virtual devices.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:13 -07:00
David S. Miller
cb48123584 [SPARC64]: Assorted LDC bug cures.
1) LDC_MODE_RELIABLE is deprecated and unused by anything, plus
   it and LDC_MODE_STREAM were mis-numbered.

2) read_stream() should try to read as much as possible into
   the per-LDC stream buffer area, so do not trim the read_nonraw()
   length by the caller's size parameter.

3) Send data ACKs when necessary in read_nonraw().

4) In read_nonraw() when we get a pure ACK, advance the RX head
   unconditionally past it.

5) Provide the ACKID field in the ldcdgb() packet dump in read_nonraw().
   This helps debugging stream mode LDC channel problems.

6) Decrease verbosity of rx_data_wait() so that it is more useful.
   A debugging message each loop iteration is too much.

7) In process_data_ack() stop the loop checking when we hit lp->tx_tail
   not lp->tx_head.

8) Set the seqid field properly in send_data_nack().

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:09 -07:00
David S. Miller
5a606b72a4 [SPARC64]: Do not ACK an INO if it is disabled or inprogress.
This is also a partial workaround for a bug in the LDOM firmware which
double-transmits RX inos during high load.  Without this, such an
event causes the kernel to loop forever in the interrupt call chain
ACK'ing but never actually running the IRQ handler (and thus clearing
the interrupt condition in the device).

There is still a bad potential effect when double INOs occur,
not covered by this changeset.  Namely, if the INO is already on
the per-cpu INO vector list, we still blindly re-insert it and
thus we can end up losing interrupts already linked in after
it.

We could deal with that by traversing the list before insertion,
but that's too expensive for this edge case.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:04:05 -07:00
David S. Miller
e53e97ce3c [SPARC64]: Add LDOM virtual channel driver and VIO device layer.
Virtual devices on Sun Logical Domains are built on top
of a virtual channel framework.  This, with help of hypervisor
interfaces, provides a link layer protocol with basic
handshaking over which virtual device clients and servers
communicate.

Built on top of this is a VIO device protocol which has its
own handshaking and message types.  At this layer attributes
are exchanged (disk size, network device addresses, etc.),
descriptor rings are registered, and data transfers are
triggered and replied to.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-16 04:03:18 -07:00
Auke Kok
b8a3a5214d PCI: read revision ID by default
Currently there are 97 occurrences where drivers need the pci
revision ID. We can do this once for all devices. Even the pci
subsystem needs the revision several times for quirks. The extra
u8 member pads out nicely in the pci_dev struct.

Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:02:09 -07:00
Ingo Molnar
0437e109e1 sched: zap the migration init / cache-hot balancing code
the SMP load-balancer uses the boot-time migration-cost estimation
code to attempt to improve the quality of balancing. The reason for
this code is that the discrete priority queues do not preserve
the order of scheduling accurately, so the load-balancer skips
tasks that were running on a CPU 'recently'.

this code is fundamentally fragile: the boot-time migration cost detector
doesn't really work on systems that had large L3 caches, it caused boot
delays on large systems and the whole cache-hot concept made the
balancing code pretty nondeterministic as well.

(and hey, i wrote most of it, so i can say it out loud that it sucks ;-)

under CFS the same purpose of cache affinity can be achieved without
any special cache-hot special-case: tasks are sorted in the 'timeline'
tree and the SMP balancer picks tasks from the left side of the
tree, thus the most cache-cold task is balanced automatically.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-07-09 18:51:57 +02:00
David S. Miller
a357b8f42e [SPARC64]: Need to set state to IDLE during sun4v IRQ enable.
This fixes hypervisor console interrupts on LDOM guests.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-26 00:13:31 -07:00
David S. Miller
1245088400 [SPARC64]: Fix VIRQ enabling.
We were doing the wrong call to turn them on, and also
when enabling we need to forcefully set the state to IDLE.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-26 00:13:09 -07:00
David S. Miller
fc395f8d58 [SPARC64]: Fix args to sun4v_ldc_revoke().
First argument is LDC channel ID, then mapping cookie,
then the MTE revoke cookie.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-13 00:01:27 -07:00
David S. Miller
56f5c0bd50 [SPARC64]: Fix IO/MEM space sizing for PCI.
In pci_determine_mem_io_space(), do not hard code the region sizes.
Instead, use the values given to us in the ranges property.

Thanks go to Mikael Petterson for the original Xorg failure
bug report, and strace dumps from Mikael and Dmitry Artamonow.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-13 00:01:19 -07:00
David S. Miller
4a907dec98 [SPARC64]: Wire up cookie based sun4v interrupt registry.
This will be used for logical domain channel interrupts.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-13 00:01:04 -07:00
David S. Miller
8c2786cfa6 [SPARC64]: Handle PCI bridges without 'ranges' property.
This fixes the IDE controller not showing up on Netra-T1
systems.

Just like Simba bridges, some PCI bridges can lack the
'ranges' OBP property.  So we handle this similarly to
the existing Simba code:

1) In of_device register address resolving, we push the
   translation to the parent.

2) In PCI device scanning, we interrogate the PCI config
   space registers of the PCI bus device in order to resolve
   the resources, just like the generic Linux PCI probing
   code does.

With much help and testing from Fabio, who also reported
the initial problem.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Fabio Massimo Di Nitto <fabbione@ubuntu.com>
2007-06-07 21:59:44 -07:00
Robert P. J. Day
ea1ff19ce0 [SPARC64]: Include <linux/rwsem.h> instead of <asm/rwsem.h>.
To be consistent with other architectures, include the generic version
of rwsem.h.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 20:24:50 -07:00
David S. Miller
ec4d18f219 [SPARC64]: Fix SBUS IRQ regression caused by PCI-E driver.
We used to access the 64-bit IRQ IMAP and ICLR registers of bus
controllers 4-bytes in and as a 32-bit register word, since only the
low 32-bits were relevant.  This seemed like a good idea at the time.

But the PCI-E controller requires full 8-byte 64-bit access to
these registers, so we switched over to accessing them fully.

SBUS was not adjusted properly, which broke interrupts completely.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 16:59:51 -07:00
David S. Miller
321566c250 [SPARC64]: Fix 2 bugs in PCI Sabre bus scanning.
If we are on hummingbird, bus runs at 66MHz.

pbm->pci_bus should be setup with the result of pci_scan_one_pbm()
or else we deref NULL pointers in the error interrupt handlers.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-07 16:59:46 -07:00
David S. Miller
a2f9f6bbb3 [SPARC64]: Fix {mc,smt}_capable().
It's not just sun4v hypervisor platforms that should return true
for this, sun4u with UltraSPARC-IV should return true too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:50:05 -07:00
David S. Miller
5cd342df96 [SPARC64]: Make core and sibling groups equal on UltraSPARC-IV.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:50:02 -07:00
David S. Miller
f78eae2e6f [SPARC64]: Proper multi-core scheduling support.
The scheduling domain hierarchy is:

   all cpus -->
      cpus that share an instruction cache -->
          cpus that share an integer execution unit

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:50:00 -07:00
David Miller
d887ab3a9b [SPARC64]: Provide mmu statistics via sysfs.
If the system supports hypervisor based statistics, allow them to
be fetched, enabled, and disabled via sysfs.

Enable and disable via the boolean:

/sys/devices/systems/cpu/cpuN/mmustat_enable

Statistic values are provided under:

/sys/devices/systems/cpu/cpuN/mmu_status/

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:57 -07:00
David Miller
48b6735640 [SPARC64]: Fix service channel hypervisor function names.
sed 's/scv/svc/'

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:54 -07:00
David S. Miller
d1f253e60a [SPARC64]: Export basic cpu properties via sysfs.
Cache sizes, udelay_val, and clock_tick.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:51 -07:00
David S. Miller
eff3414b72 [SPARC64]: Move topology init code into new file, sysfs.c
Also, use per-cpu data for struct cpu.  Calling kmalloc for
each cpu in topology_init() is just plain clumsy.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-06-04 21:49:50 -07:00
Linus Torvalds
54ca412336 Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fix
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fix:
  sparc64: fix alignment bug in linker definition script
2007-05-31 12:33:16 -07:00
David S. Miller
dbbe3cb8cf [SPARC64]: Add missing NCS and SVC hypervisor interfaces.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-31 01:52:48 -07:00
Sam Ravnborg
4096b46f01 sparc64: fix alignment bug in linker definition script
The RO_DATA section was hardcoded to a specific
alignment in include/asm-generic/vmlinux.h.
But for sparc64 this did not match the PAGE_SIZE.

Introduce a new section definition named:
RO_DATA that takes actual alignment as parameter.
RODATA is provided for backward compatibility.

On top of this avoid hardcoding alignment for
sparc64 in the rest of the script.
Fix is build-tested on sparc64 + x86_64.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-05-29 21:29:00 +02:00
David S. Miller
7db35f31cb [SPARC64]: Fill holes in hypervisor APIs and fix KTSB registry.
Several interfaces were missing and others misnumbered or
improperly documented.

Also, make sure to check the return value when registering
the kernel TSBs with the hypervisor.  This helped to find
the 4MB kernel TSB alignment bug fixed in a previous changeset.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:52:15 -07:00
David S. Miller
2d9e2763c2 [SPARC64]: Fix two bugs wrt. kernel 4MB TSB.
1) The TSB lookup was not using the correct hash mask.

2) It was not aligned on a boundary equal to its size,
   which is required by the sun4v Hypervisor.
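
The two conditions above can be sketched in C.  This is only an
illustration under stated assumptions, not the kernel's code: the
names tsb_is_size_aligned() and tsb_index() are hypothetical, and
the sketch assumes a power-of-two table whose index is taken from
the virtual page number.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if x is a power of two. */
static int is_power_of_two(uint64_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Nonzero if base sits on a boundary equal to the table size. */
static int tsb_is_size_aligned(uint64_t base, uint64_t size_bytes)
{
    assert(is_power_of_two(size_bytes));
    return (base & (size_bytes - 1)) == 0;
}

/* Table index for vaddr; the hash mask must be derived from the
 * number of entries in this table, not from some other table. */
static uint64_t tsb_index(uint64_t vaddr, unsigned page_shift,
                          uint64_t nentries)
{
    assert(is_power_of_two(nentries));
    return (vaddr >> page_shift) & (nentries - 1);
}
```

With a 4MB page (page_shift 22) and a 4-entry table, page number 5
maps to index 1.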

The hypervisor call that registers the kernel TSB also wasn't having
its return value checked, and that bug will be fixed up as well in a
subsequent changeset.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:51:38 -07:00
David S. Miller
679292993c [SPARC64]: Fix _PAGE_EXEC_4U check in sun4u I-TLB miss handler.
It was using an immediate _PAGE_EXEC_4U value in an 'and'
instruction to perform the test.  This doesn't work because
the immediate field is signed 13-bit, thus the mask being
tested against the PTE was 0x1000 sign-extended to 32-bits
instead of just plain 0x1000.
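
The effect can be reproduced in plain C.  The sketch below is not
the handler code; sign_extend_simm13() and effective_and_mask() are
hypothetical names, and the result is widened to 64 bits, the
register width on sparc64.

```c
#include <stdint.h>

/* Sign-extend the low 13 bits of imm, the way a signed 13-bit
 * immediate field is interpreted. */
static int64_t sign_extend_simm13(uint32_t imm)
{
    uint32_t field = imm & 0x1fffu;      /* keep the 13-bit field */

    if (field & 0x1000u)                 /* bit 12 is the sign bit */
        return (int64_t)field - 0x2000;  /* negative value */
    return (int64_t)field;
}

/* The mask an 'and' with that immediate really applies. */
static uint64_t effective_and_mask(uint32_t imm)
{
    return (uint64_t)sign_extend_simm13(imm);
}
```

For 0x1000 the effective mask has bit 12 and every bit above it set,
so the test matches far more than the single intended bit.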

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:50:15 -07:00
Horst H. von Brand
7189859f28 [SPARC64]: arch/sparc64/time.c doesn't compile on Ultra 1 (no PCI)
This is bug 8540 on bugzilla.kernel.org

arch/sparc64/time.c contains references to assorted bq4802 stuff if
CONFIG_PCI is not set, and compile fails. I #ifdef'ed out everything
that looks PCI-ish in that file.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:50:02 -07:00
David S. Miller
22adb358e8 [SPARC64]: Eliminate NR_CPUS limitations.
Cheetah systems can have cpuids as large as 1023, although physical
systems don't have that many cpus.

Only three limitations existed in the kernel preventing arbitrary
NR_CPUS values:

1) dcache dirty cpu state stored in page->flags on
   D-cache aliasing platforms.  With some build time
   calculations and some build-time BUG checks on
   page->flags layout, this one was easily solved.

2) The cheetah XCALL delivery code could only handle
   a cpumask with up to 32 cpus set.  Some simple looping
   logic clears that up too.

3) thread_info->cpu was a u8, easily changed to a u16.

There are a few spots in the kernel that still put NR_CPUS
sized arrays on the kernel stack, but that's not a sparc64
specific problem.
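
The looping idea in item 2 might look roughly like the sketch below.
It is an assumption-laden illustration, not the cheetah XCALL code:
deliver_in_batches(), deliver_fn, and count_cpus() are hypothetical,
and the cpu mask is modelled as a byte array.

```c
#include <stddef.h>

#define BATCH_MAX 32   /* most cpus one delivery can target */

/* Hypothetical delivery callback: receives one batch of cpu ids. */
typedef void (*deliver_fn)(const int *cpus, size_t n, void *ctx);

/* Walk all cpu ids, gather the set ones into batches of at most
 * BATCH_MAX, and deliver each batch.  Returns the batch count. */
static size_t deliver_in_batches(const unsigned char *is_set,
                                 size_t ncpus,
                                 deliver_fn deliver, void *ctx)
{
    int batch[BATCH_MAX];
    size_t n = 0, batches = 0;

    for (size_t cpu = 0; cpu < ncpus; cpu++) {
        if (!is_set[cpu])
            continue;
        batch[n++] = (int)cpu;
        if (n == BATCH_MAX) {            /* batch full: flush it */
            deliver(batch, n, ctx);
            batches++;
            n = 0;
        }
    }
    if (n > 0) {                         /* flush the partial batch */
        deliver(batch, n, ctx);
        batches++;
    }
    return batches;
}

/* Example callback: adds the batch size to a size_t counter. */
static void count_cpus(const int *cpus, size_t n, void *ctx)
{
    (void)cpus;
    *(size_t *)ctx += n;
}
```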

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:49 -07:00
David S. Miller
5cbc307373 [SPARC64]: Use machine description and OBP properly for cpu probing.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:41 -07:00
David S. Miller
e01c0d6d8c [SPARC64]: Negotiate hypervisor API for PCI services.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:34 -07:00
David S. Miller
22d6a1cba3 [SPARC64]: Report proper system soft state to the hypervisor.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:29 -07:00
David S. Miller
36b48973b8 [SPARC64]: Fix typo in sun4v_hvapi_register error handling.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:21 -07:00
David S. Miller
5840fc66bb [SPARC64]: PCI device scan is way too verbose by default.
These messages were very useful when bringing up the
OBP based PCI device scan code, but it's just a lot
of noise every bootup now especially on big machines.

The messages can be re-enabled via 'ofpci_debug=1' on
the kernel command line.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:18 -07:00
David S. Miller
59db8102bd [SPARC64]: Don't be picky about virtual-dma values on sun4v.
Handle arbitrary base and length values as long as they
are multiples of IO_PAGE_SIZE.
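
A minimal sketch of such a check, assuming an 8K I/O page size; the
constant EXAMPLE_IO_PAGE_SIZE and the function vdma_range_ok() are
hypothetical and stand in for whatever the driver actually uses.

```c
#include <stdint.h>

#define EXAMPLE_IO_PAGE_SIZE 8192u   /* assumed value, for illustration */

/* Accept any base/length pair as long as both are multiples of the
 * I/O page size; no particular values are required. */
static int vdma_range_ok(uint64_t base, uint64_t len)
{
    return (base % EXAMPLE_IO_PAGE_SIZE) == 0 &&
           (len % EXAMPLE_IO_PAGE_SIZE) == 0;
}
```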

Bug found by Arun Kumar Rao.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-29 02:49:15 -07:00
Sam Ravnborg
ca967258b6 all-archs: consolidate .data section definition in asm-generic
With this consolidation we can now modify the .data
section definition in one spot for all archs.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-05-19 09:11:57 +02:00
Sam Ravnborg
7664709b44 all-archs: consolidate .text section definition in asm-generic
Move definition of .text section to asm-generic.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-05-19 09:11:57 +02:00
David S. Miller
03983ab858 [SPARC64]: Fix sched_clock() et al.
SPARC64_NSEC_PER_CYC_SHIFT was set too high.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-17 22:55:26 -07:00
David S. Miller
c7754d465b [SPARC64]: Add hypervisor API negotiation and fix console bugs.
Hypervisor interfaces need to be negotiated in order to use
some API calls reliably.  So add a small set of interfaces
to request API versions and query current settings.

This allows us to fix some bugs in the hypervisor console:

1) If we can negotiate API group CORE of at least major 1
   minor 1 we can use con_read and con_write which can improve
   console performance quite a bit.

2) When we do a console write request, we should hold the
   spinlock around the whole request, not a byte at a time.
   What would happen is that it's easy for output from
   different cpus to get mixed with each other.

3) Use consistent udelay() based polling, udelay(1) each
   loop with a limit of 1000 polls to handle stuck hypervisor
   console.
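
Item 3 describes a bounded polling loop.  A rough sketch follows,
with the hypervisor call and the delay routine passed in as
callbacks so the example stays self-contained; poll_with_limit(),
try_fn, delay_fn, and the two stubs are hypothetical names, not the
driver's.

```c
#define POLL_LIMIT 1000   /* give up after this many polls */

typedef int (*try_fn)(void *ctx);        /* nonzero when the op succeeds */
typedef void (*delay_fn)(unsigned usecs);

/* Try the operation, waiting 1 usec between attempts.  Returns 0 on
 * success and -1 if the limit is reached (stuck console). */
static int poll_with_limit(try_fn try_once, void *ctx, delay_fn delay_us)
{
    for (int polls = 0; polls < POLL_LIMIT; polls++) {
        if (try_once(ctx))
            return 0;
        delay_us(1);
    }
    return -1;
}

/* Stub operation: fails until the int counter in ctx reaches zero. */
static int succeed_after_countdown(void *ctx)
{
    int *remaining = ctx;

    if (*remaining == 0)
        return 1;
    (*remaining)--;
    return 0;
}

/* Stub delay that does nothing; a kernel caller would use udelay(). */
static void no_delay(unsigned usecs)
{
    (void)usecs;
}
```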

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-15 20:23:02 -07:00
David S. Miller
17f34f0ec9 [SPARC64]: Add missing cpus_empty() check in hypervisor xcall handling.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-14 02:01:52 -07:00
David S. Miller
49d23cfcec [SPARC64]: Be more resiliant with PCI I/O space regs.
If we miss on the ranges, just toss the translation up to the parent
instead of failing.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-13 22:01:18 -07:00
David S. Miller
8354c5b726 [SPARC]: Wire up signalfd/timerfd/eventfd syscalls.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 22:06:51 -07:00
David S. Miller
d037e0532e [SPARC64]: Add support for bq4802 TOD chip, as found on ultra45.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:27 -07:00
David S. Miller
95d71e663e [SPARC64]: Correct FIRE_IOMMU_FLUSHINV register offset.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:26 -07:00
David S. Miller
f16537bac7 [SPARC64]: pci_resource_adjust() cannot be __init.
Noticed by Meelis Roos.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:22 -07:00
Simon Arlott
e5dd42e4fb [SPARC64]: Spelling fixes.
Spelling fixes in arch/sparc64/.

Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-11 21:39:21 -07:00
Amy Griffis
e54dc2431d [PATCH] audit signal recipients
When auditing syscalls that send signals, log the pid and security
context for each target process. Optimize the data collection by
adding a counter for signal-related rules, and avoiding allocating an
aux struct unless we have more than one target process. For process
groups, collect pid/context data in blocks of 16. Move the
audit_signal_info() hook up in check_kill_permission() so we audit
attempts where permission is denied.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-05-11 05:38:25 -04:00
Amy Griffis
7f13da40e3 [PATCH] add SIGNAL syscall class (v3)
Add a syscall class for sending signals.

Signed-off-by: Amy Griffis <amy.griffis@hp.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2007-05-11 05:38:25 -04:00
Linus Torvalds
0ab598099c Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC64]: Use alloc_pci_dev() in PCI bus probes.
  [SPARC64]: Bump PROMINTR_MAX to 32.
  [SPARC64]: Fix recursion in PROM tree building.
  [SERIAL] sunzilog: Interrupt enable before ISR handler installed
  [SPARC64] PCI: Consolidate PCI access code into pci_common.c
2007-05-10 13:32:05 -07:00
David S. Miller
26e6385f14 [SPARC64]: Use alloc_pci_dev() in PCI bus probes.
Otherwise MSI explodes because pci_msi_init_pci_dev() does not
get invoked.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-10 02:16:27 -07:00
David S. Miller
aa5242e78f [SPARC64]: Fix recursion in PROM tree building.
Use iteration for scanning of PROM node siblings.

Based upon a patch by Greg Onufer, who found this bug.
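
The general shape of that change, sketched on a hypothetical
child/sibling tree (struct node and count_nodes() are made up for
illustration and are not the PROM code): siblings are walked with a
loop, so the call depth tracks the depth of the tree rather than the
length of a sibling chain.

```c
#include <stddef.h>

/* Hypothetical tree node in first-child/next-sibling form. */
struct node {
    struct node *child;
    struct node *sibling;
};

/* Count every node reachable from first and its siblings.
 * Siblings are handled iteratively; only descending to a child
 * costs a stack frame. */
static size_t count_nodes(const struct node *first)
{
    size_t total = 0;

    for (const struct node *n = first; n != NULL; n = n->sibling) {
        total += 1;
        total += count_nodes(n->child);
    }
    return total;
}
```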

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-10 00:53:29 -07:00
Fernando Luis Vazquez Cao
2f4dfe206a Remove hardcoding of hard_smp_processor_id on UP systems
With the advent of kdump, the assumption that the boot CPU when booting an UP
kernel is always the CPU with a particular hardware ID (often 0) (usually
referred to as BSP on some architectures) is not valid anymore.  The reason
being that the dump capture kernel boots on the crashed CPU (the CPU that
invoked crash_kexec), which may be or may not be that particular CPU.

Move definition of hard_smp_processor_id for the UP case to
architecture-specific code ("asm/smp.h") where it belongs, so that each
architecture can provide its own implementation.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Acked-by: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-09 12:30:48 -07:00
David S. Miller
ca3dd88e41 [SPARC64] PCI: Consolidate PCI access code into pci_common.c
All the sun4u controllers do the same thing to compute the physical
I/O address to poke, and we can move the sun4v code into this common
location too.

This one needs a bit of testing, in particular the Sabre code had some
funny stuff that would break up u16 and/or u32 accesses into pieces
and I didn't think that was needed any more.  If it is we need to find
out why and add back code to do it again.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-09 02:35:27 -07:00
David S. Miller
127cda1e8c [SPARC64]: Optimize fault kprobe handling just like powerpc.
And eliminate DIE_GPF while we're at it.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 18:25:14 -07:00
David S. Miller
6c1142602c [SPARC]: Wire up utimensat syscall.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 17:50:14 -07:00
David S. Miller
af80318eb7 [SPARC64]: Fix request_irq() ignored result warnings in PCI controller code.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 17:23:31 -07:00
David S. Miller
c57c2ffb15 [SPARC64]: Kill asm-sparc64/pbm.h
Everything it contains can be hidden in pci_impl.h

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:43:08 -07:00
David S. Miller
28113a9941 [SPARC64]: Removal of trivial pci_controller_info uses.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:44 -07:00
David S. Miller
6c108f1299 [SPARC64]: Move index info pci_pbm_info.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:40 -07:00
David S. Miller
e9870c4c0a [SPARC64]: Move {setup,teardown}_msi_irq into pci_pbm_info.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:36 -07:00
David S. Miller
f1cd8de2c9 [SPARC64]: Move pci_ops into pci_pbm_info.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:32 -07:00
David S. Miller
96a496fd49 [SPARC64] SBUS: Error interrupt registry cleanups.
Do not use IRQF_SHARED, these interrupt numbers should all
be unique.

Also use name strings without spaces in them just like
PCI controller drivers do, for consistency.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:28 -07:00
David S. Miller
34768bc832 [SPARC64] PCI: Use root list of pbm's instead of pci_controller_info's
The idea is to move more and more things into the pbm,
with the eventual goal of eliminating the pci_controller_info
entirely as there really isn't any need for it.

This stage of the transformations requires some reworking of
the PCI error interrupt handling.

It might be tricky to get rid of the pci_controller_info parenting for
a few reasons:

1) When we get an uncorrectable or correctable error we want
   to interrogate the IOMMU and streaming cache of both
   PBMs for error status.  These errors come from the UPA
   front-end which is shared between the two PBM PCI bus
   segments.

   Historically speaking, this is why I chose the data structure
   hierarchy of pci_controller_info-->pci_pbm_info.

2) The probing does a portid/devhandle match to look for the
   'other' pbm, but this is entirely an artifact and can be
   eliminated trivially.

What we could do to solve #1 is to have a "buddy" pointer from one pbm
to another.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:24 -07:00
David S. Miller
cfa0652c4e [SPARC64] PCI: Use common routine to fetch PBM properties.
Namely bus-range and ino-bitmap.

This allows us also to eliminate pci_controller_info's
pci_{first,last}_busno fields as only the pbm ones are
used now.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-08 16:41:12 -07:00
Ulrich Drepper
1c710c896e utimensat implementation
Implement utimensat(2) which is an extension to futimesat(2) in that it

a) supports nano-second resolution for the timestamps
b) allows to selectively ignore the atime/mtime value
c) allows to selectively use the current time for either atime or mtime
d) supports changing the atime/mtime of a symlink itself along the lines
   of the BSD lutimes(3) functions

For this change the internally used do_utimes() function was changed to
accept a timespec time value and an additional flags parameter.

Additionally the sys_utime function was changed to match compat_sys_utime,
which already uses do_utimes instead of duplicating the work.

Also, the completely missing futimensat() functionality is added.  We have
such a function in glibc but we have to resort to using /proc/self/fd/* which
not everybody likes (chroot etc).

Test application (the syscall number will need per-arch editing):

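#define _GNU_SOURCE     /* for stat64, fstat64, lstat64, AT_FDCWD */
#include <errno.h>
#include <error.h>      /* error() */
#include <fcntl.h>
#include <stdio.h>      /* puts() */
#include <time.h>
#include <sys/stat.h>   /* struct stat64 */
#include <sys/time.h>
#include <stddef.h>
#include <syscall.h>
#include <unistd.h>     /* syscall(), sleep(), symlink(), close(), unlink() */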

#define __NR_utimensat 280

#define UTIME_NOW       ((1l << 30) - 1l)
#define UTIME_OMIT      ((1l << 30) - 2l)

int
main(void)
{
  int status = 0;

  int fd = open("ttt", O_RDWR|O_CREAT|O_EXCL, 0666);
  if (fd == -1)
    error (1, errno, "failed to create test file \"ttt\"");

  struct stat64 st1;
  if (fstat64 (fd, &st1) != 0)
    error (1, errno, "fstat failed");

  struct timespec t[2];
  t[0].tv_sec = 0;
  t[0].tv_nsec = 0;
  t[1].tv_sec = 0;
  t[1].tv_nsec = 0;
  if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

  struct stat64 st2;
  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
    {
      puts ("atim not reset to zero");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
      puts ("mtim not reset to zero");
      status = 1;
    }
  if (status != 0)
    goto out;

  t[0] = st1.st_atim;
  t[1].tv_sec = 0;
  t[1].tv_nsec = UTIME_OMIT;
  if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
      || st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
    {
      puts ("atim not set");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
      puts ("mtim changed from zero");
      status = 1;
    }
  if (status != 0)
    goto out;

  t[0].tv_sec = 0;
  t[0].tv_nsec = UTIME_OMIT;
  t[1] = st1.st_mtim;
  if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
      || st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
    {
      puts ("atim changed from original time");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != st1.st_mtim.tv_sec
      || st2.st_mtim.tv_nsec != st1.st_mtim.tv_nsec)
    {
      puts ("mtim not set");
      status = 1;
    }
  if (status != 0)
    goto out;

  sleep (2);

  t[0].tv_sec = 0;
  t[0].tv_nsec = UTIME_NOW;
  t[1].tv_sec = 0;
  t[1].tv_nsec = UTIME_NOW;
  if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  struct timeval tv;
  gettimeofday(&tv,NULL);

  if (st2.st_atim.tv_sec <= st1.st_atim.tv_sec
      || st2.st_atim.tv_sec > tv.tv_sec)
    {
      puts ("atim not set to NOW");
      status = 1;
    }
  if (st2.st_mtim.tv_sec <= st1.st_mtim.tv_sec
      || st2.st_mtim.tv_sec > tv.tv_sec)
    {
      puts ("mtim not set to NOW");
      status = 1;
    }

  if (symlink ("ttt", "tttsym") != 0)
    error (1, errno, "cannot create symlink");

  t[0].tv_sec = 0;
  t[0].tv_nsec = 0;
  t[1].tv_sec = 0;
  t[1].tv_nsec = 0;
  if (syscall(__NR_utimensat, AT_FDCWD, "tttsym", t, AT_SYMLINK_NOFOLLOW) != 0)
    error (1, errno, "utimensat failed");

  if (lstat64 ("tttsym", &st2) != 0)
    error (1, errno, "lstat failed");

  if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
    {
      puts ("symlink atim not reset to zero");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
      puts ("symlink mtim not reset to zero");
      status = 1;
    }
  if (status != 0)
    goto out;

  t[0].tv_sec = 1;
  t[0].tv_nsec = 0;
  t[1].tv_sec = 1;
  t[1].tv_nsec = 0;
  if (syscall(__NR_utimensat, fd, NULL, t, 0) != 0)
    error (1, errno, "utimensat failed");

  if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

  if (st2.st_atim.tv_sec != 1 || st2.st_atim.tv_nsec != 0)
    {
      puts ("atim not reset to one");
      status = 1;
    }
  if (st2.st_mtim.tv_sec != 1 || st2.st_mtim.tv_nsec != 0)
    {
      puts ("mtim not reset to one");
      status = 1;
    }

  if (status == 0)
     puts ("all OK");

 out:
  close (fd);
  unlink ("ttt");
  unlink ("tttsym");

  return status;
}

[akpm@linux-foundation.org: add missing i386 syscall table entry]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: Alexey Dobriyan <adobriyan@openvz.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:18 -07:00
Randy Dunlap
e63340ae6b header cleaning: don't include smp_lock.h when not used
Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:07 -07:00
Christoph Hellwig
1eeb66a1bb move die notifier handling to common code
This patch moves the die notifier handling to common code.  Previously
various architectures had exactly the same code for it.  Note that the new
code is compiled unconditionally; this should be understood as an appeal to
the other architecture maintainers to implement support for it as well (aka
sprinkling a notify_die or two in the proper place).

arm had a notify_die that did something totally different; I renamed it to
arm_notify_die as part of the patch and made it static to the file it's
declared and used in.  avr32 used to pass slightly less information through
this interface and I brought it into line with the other architectures.

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix vmalloc_sync_all bustage]
[bryan.wu@analog.com: fix vmalloc_sync_all in nommu]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: <linux-arch@vger.kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Bryan Wu <bryan.wu@analog.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:15:04 -07:00
Christoph Hellwig
ab1b6f03a1 simplify the stacktrace code
Simplify the stacktrace code:

 - remove the unused task argument to save_stack_trace, it's always
   current
 - remove the all_contexts flag, it's always 0

Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andi Kleen <ak@suse.de>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 11:14:58 -07:00
David S. Miller
c35a376d60 [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/prom.c
The IRQ translation init routines should all be __init.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-07 00:02:24 -07:00
David S. Miller
a6009dda97 [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/pci.c
apb_calc_first_last(), apb_fake_ranges(), pci_of_scan_bus(),
of_scan_pci_bridge(), pci_of_scan_bus(), and pci_scan_one_pbm()
should all be __devinit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-07 00:01:38 -07:00
David S. Miller
23abc9ec6a [SPARC64]: Fix section mismatch warnings in arch/sparc64/kernel/console.c
probe_other_fhcs() and central_probe() should be __init

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-07 00:00:37 -07:00
David S. Miller
861fe90656 [SPARC64]: SUN4U PCI-E controller support.
Some minor refactoring in the generic code was necessary for
this:

1) This controller requires 8-byte access to the interrupt map
   and clear register.  They are 64-bits on all the other
   SBUS and PCI controllers anyways, so this was easy to cure.

2) The IMAP register has a different layout and some bits that we
   need to preserve, so use a read/modify/write when making
   changes to the IMAP register in generic code.

3) Flushing the entire IOMMU TLB is best done with a single write
   to a register on this PCI controller, so add an iommu->iommu_flushinv
   for this.

Still lacks MSI support, that will come later.
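
The read/modify/write in item 2 follows the usual pattern sketched
below.  The helper names rmw_field() and reg_update() and the masks
used with them are hypothetical; the real IMAP layout is not shown
in this message.

```c
#include <stdint.h>

/* Replace only the bits selected by field_mask, keeping the rest of
 * the old register value intact. */
static uint64_t rmw_field(uint64_t old, uint64_t field_mask,
                          uint64_t new_bits)
{
    return (old & ~field_mask) | (new_bits & field_mask);
}

/* Apply the update to a 64-bit register: one read, one write. */
static void reg_update(volatile uint64_t *reg, uint64_t field_mask,
                       uint64_t new_bits)
{
    *reg = rmw_field(*reg, field_mask, new_bits);
}
```

A plain write of the new value would clear whatever bits the
controller expects to be preserved outside the field.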

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-05-06 22:44:06 -07:00
Linus Torvalds
ea62ccd00f Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (231 commits)
  [PATCH] i386: Don't delete cpu_devs data to identify different x86 types in late_initcall
  [PATCH] i386: type may be unused
  [PATCH] i386: Some additional chipset register values validation.
  [PATCH] i386: Add missing !X86_PAE dependincy to the 2G/2G split.
  [PATCH] x86-64: Don't exclude asm-offsets.c in Documentation/dontdiff
  [PATCH] i386: avoid redundant preempt_disable in __unlazy_fpu
  [PATCH] i386: white space fixes in i387.h
  [PATCH] i386: Drop noisy e820 debugging printks
  [PATCH] x86-64: Fix allnoconfig error in genapic_flat.c
  [PATCH] x86-64: Shut up warnings for vfat compat ioctls on other file systems
  [PATCH] x86-64: Share identical video.S between i386 and x86-64
  [PATCH] x86-64: Remove CONFIG_REORDER
  [PATCH] x86-64: Print type and size correctly for unknown compat ioctls
  [PATCH] i386: Remove copy_*_user BUG_ONs for (size < 0)
  [PATCH] i386: Little cleanups in smpboot.c
  [PATCH] x86-64: Don't enable NUMA for a single node in K8 NUMA scanning
  [PATCH] x86: Use RDTSCP for synchronous get_cycles if possible
  [PATCH] i386: Add X86_FEATURE_RDTSCP
  [PATCH] i386: Implement X86_FEATURE_SYNC_RDTSC on i386
  [PATCH] i386: Implement alternative_io for i386
  ...

Fix up trivial conflict in include/linux/highmem.h manually.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-05 14:55:20 -07:00
Michael Ellerman
7fe3730de7 MSI: arch must connect the irq and the msi_desc
set_irq_msi() currently connects an irq_desc to an msi_desc. The archs call
it at some point in their setup routine, and then the generic code sets up the
reverse mapping from the msi_desc back to the irq.

set_irq_msi() should do both connections, making it the one and only call
required to connect an irq with its MSI desc and vice versa.

The arch code MUST call set_irq_msi(), and it must do so only once it's sure
it's not going to fail the irq allocation.

Given that there's no need for the arch to return the irq anymore, the return
value from the arch setup routine just becomes 0 for success and anything else
for failure.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-05-02 19:02:38 -07:00
Jeremy Fitzhardinge
b6e3590f81 [PATCH] x86: Allow percpu variables to be page-aligned
Let's allow page-alignment in general for per-cpu data (wanted by Xen, and
Ingo suggested KVM as well).

Because larger alignments can use more room, we increase the max per-cpu
memory to 64k rather than 32k: it's getting a little tight.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2007-05-02 19:27:12 +02:00
David S. Miller
16ce82d846 [SPARC64]: Convert PCI over to generic struct iommu/strbuf.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 21:08:21 -07:00
David S. Miller
3e4d26508a [SPARC64]: Convert SBUS over to generic iommu/strbuf structs.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:44 -07:00
David S. Miller
9b3627f389 [SPARC64]: Consolidate {sbus,pci}_iommu_arena.
Move to asm-sparc64/iommu.h and rename to plain "iommu_arena".

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:42 -07:00
Stephen Rothwell
3dfe10ee7c [SPARC64]: constify some paramaters of OF routines
This starts bringing the PowerPC and Sparc64 implementations back closer
together.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:40 -07:00
David S. Miller
a165b4205e [SPARC64]: Fix PCI rework to adhere to of_get_property() const return.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:37 -07:00
David S. Miller
0f3e25049e [SPARC64]: Make sure pbm->prom_node is setup easly enough in psycho.c
It needs to be ready before we invoke pci_determine_mem_io_space().

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:35 -07:00
David S. Miller
28f57e774d [SPARC64]: Force dummy host controller onto bus zero.
This helps deal with the invisible bridge that sits between
the host controller and the top-most visible PCI devices
on hypervisor systems.

For example, on T1000 the bus-range property says 2 --> 4
and so there is a PCI express bridge at bus 2, devfn 0, etc.

So if we don't force the dummy host controller to bus zero,
we'll try to create two devices with the same domain/bus/devfn
triplet.

Also, add some more log diagnostics to make debugging stuff like this
easier.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:20 -07:00
David S. Miller
97b3cf050b [SPARC64]: Add dummy host controller to root of all PCI domains.
We fake up a dummy one in all cases because that is the simplest
thing to do and it happens to be necessary for hypervisor systems.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:19 -07:00
David S. Miller
c6e87566ea [SPARC64]: Const'ify pci_iommu_ops.
Based upon a similar patch for x86_64 written by
Stephen Hemminger.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:18 -07:00
David S. Miller
0bba2dd823 [SPARC64]: Kill pbm->pci_first_slot.
Set but never used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:17 -07:00
David S. Miller
3875c5c02d [SPARC64]: Kill pci_controller->pbms_same_domain
We don't do the "Simba APB is a PBM" bogosity for Sabre
controllers any longer, so this pbms_same_domain thing
is no longer necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:16 -07:00
David S. Miller
8d3aee9375 [SPARC64]: Kill pci_controller->base_address_update().
Implemented but never actually used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:15 -07:00
David S. Miller
0bae5f81b6 [SPARC64]: Kill pci_controller->resource_adjust()
All the implementations can be identical and generic, so
no need for controller specific methods.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:14 -07:00
David S. Miller
3487a1f9e7 [SPARC64]: Kill PBM ranges software state.
It is only used in one spot and we can just fetch the
OF property right there.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:13 -07:00
David S. Miller
229177c7f3 [SPARC64]: Kill PBM intmap software state.
Set but never used.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:12 -07:00
David S. Miller
9fd8b64761 [SPARC64]: Consolidate PCI mem/io resource determination.
It can be done for every PCI configuration using OF properties.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:11 -07:00
David S. Miller
01f94c4a6c [SPARC64]: Fix sabre pci controllers with new probing scheme.
The SIMBA APB bridge is strange, it is a PCI bridge but it lacks
some standard OF properties, in particular it lacks a 'ranges'
property.

What you have to do is read the IO and MEM range registers in
the APB bridge to determine the ranges handled by each bridge.
So fill in the bus resources by doing that.

Since we now handle this quirk in the generic PCI and OF device
probing layers, we can flat out eliminate all of that code from
the sabre pci controller driver.

In fact we can thus eliminate completely another quirk of the sabre
driver.  It tried to make the two APB bridges look like PBMs but that
makes zero sense now (and it's questionable whether it ever made sense).
So now just use pbm_A and probe the whole PCI hierarchy using that as
the root.

This simplification allows many future cleanups to occur.

Also, I've found yet another quirk that needs to be worked around
while testing this.  You can't use the 'class-code' OF firmware
property, especially for IDE controllers.  We have to read the value
out of PCI config space or else we'll see the value the device was
showing before it was programmed into native mode.

I'm starting to think it might be wise to just read all of the values
out of PCI config space instead of using the OF properties. :-/

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:10 -07:00
David S. Miller
a378fd0ee8 [SPARC64]: Fix obppath pci device sysfs creation.
Need to traverse recursively down child busses else we only
get the file created under devices at the top-level.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:09 -07:00
David S. Miller
bc606f3c91 [SPARC64]: Minor cleanups to schizo pci controller driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:08 -07:00
David S. Miller
1e8a8cc52d [SPARC64]: Internalize pci_memspace_mask.
The only user was bus_dvma_to_mem() which is no longer used
by any driver, so kill that, and the export of pci_memspace_mask.

The only user now is the PCI mmap support code.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:07 -07:00
David S. Miller
a2fb23af1c [SPARC64]: Probe PCI bus using OF device tree.
Almost entirely taken from the 64-bit PowerPC PCI code.

This allowed us to eliminate a ton of cruft from the sparc64
PCI layer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-26 01:55:06 -07:00