Commit Graph

490 Commits

Author SHA1 Message Date
Martin Hicks
753ee72896 [PATCH] VM: early zone reclaim
This is the core of the (much simplified) early reclaim.  The goal of this
patch is to reclaim some easily-freed pages from a zone before falling back
onto another zone.

One of the major uses of this is NUMA machines.  With the default allocator
behavior the allocator would look for memory in another zone, which might be
off-node, before trying to reclaim from the current zone.

This adds a zone tuneable to enable early zone reclaim.  It is selected on a
per-zone basis and is turned on/off via syscall.

Adding some extra throttling on the reclaim was also required (patch
4/4).  Without the machine would grind to a crawl when doing a "make -j"
kernel build.  Even with this patch the System Time is higher on
average, but it seems tolerable.  Here are some numbers for kernbench
runs on a 2-node, 4cpu, 8Gig RAM Altix in the "make -j" run:

			wall  user   sys   %cpu  ctx sw.  sleeps
			----  ----   ---   ----   ------  ------
No patch		1009  1384   847   258   298170   504402
w/patch, no reclaim     880   1376   667   288   254064   396745
w/patch & reclaim       1079  1385   926   252   291625   548873

These numbers are the average of 2 runs of 3 "make -j" runs done right
after system boot.  Run-to-run variability for "make -j" is huge, so
these numbers aren't terribly useful except to seee that with reclaim
the benchmark still finishes in a reasonable amount of time.

I also looked at the NUMA hit/miss stats for the "make -j" runs and the
reclaim doesn't make any difference when the machine is thrashing away.

Doing a "make -j8" on a single node that is filled with page cache pages
takes 700 seconds with reclaim turned on and 735 seconds without reclaim
(due to remote memory accesses).

The simple zone_reclaim syscall program is at
http://www.bork.org/~mort/sgi/zone_reclaim.c

Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:14 -07:00
Ingo Molnar
39c715b717 [PATCH] smp_processor_id() cleanup
This patch implements a number of smp_processor_id() cleanup ideas that
Arjan van de Ven and I came up with.

The previous __smp_processor_id/_smp_processor_id/smp_processor_id API
spaghetti was hard to follow both on the implementational and on the
usage side.

Some of the complexity arose from picking wrong names, some of the
complexity comes from the fact that not all architectures defined
__smp_processor_id.

In the new code, there are two externally visible symbols:

 - smp_processor_id(): debug variant.

 - raw_smp_processor_id(): nondebug variant. Replaces all existing
   uses of _smp_processor_id() and __smp_processor_id(). Defined
   by every SMP architecture in include/asm-*/smp.h.

There is one new internal symbol, dependent on DEBUG_PREEMPT:

 - debug_smp_processor_id(): internal debug variant, mapped to
                             smp_processor_id().

Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new
lib/smp_processor_id.c file.  All related comments got updated and/or
clarified.

I have build/boot tested the following 8 .config combinations on x86:

 {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT}

I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT.  (Other
architectures are untested, but should work just fine.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:13 -07:00
Suresh Siddha
84929801e1 [PATCH] x86_64: TASK_SIZE fixes for compatibility mode processes
Appended patch will setup compatibility mode TASK_SIZE properly.  This will
fix atleast three known bugs that can be encountered while running
compatibility mode apps.

a) A malicious 32bit app can have an elf section at 0xffffe000.  During
   exec of this app, we will have a memory leak as insert_vm_struct() is
   not checking for return value in syscall32_setup_pages() and thus not
   freeing the vma allocated for the vsyscall page.  And instead of exec
   failing (as it has addresses > TASK_SIZE), we were allowing it to
   succeed previously.

b) With a 32bit app, hugetlb_get_unmapped_area/arch_get_unmapped_area
   may return addresses beyond 32bits, ultimately causing corruption
   because of wrap-around and resulting in SEGFAULT, instead of returning
   ENOMEM.

c) 32bit app doing this below mmap will now fail.

  mmap((void *)(0xFFFFE000UL), 0x10000UL, PROT_READ|PROT_WRITE,
	MAP_FIXED|MAP_PRIVATE|MAP_ANON, 0, 0);

Signed-off-by: Zou Nan hai <nanhai.zou@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:12 -07:00
Andrew Morton
9a558cb4ec [PATCH] arm: irqs_disabled() type fix
kernel/sched.c: In function `__might_sleep':
kernel/sched.c:5461: warning: int format, long unsigned int arg (arg 3)

We expect irqs_disabled() to return an int (poor man's bool).

Acked-by: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-21 18:46:11 -07:00
Linus Torvalds
9723d95d10 Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6 2005-06-21 18:19:10 -07:00
David S. Miller
7049e6800f [SPARC64]: Add prefetch support.
The implementation is optimal for UltraSPARC-III and later.
It will work, however suboptimally, on UltraSPARC-II and
be treated as a NOP on UltraSPARC-I.

It is not worth code patching this thing as the highest cost
is the code space, and code patching cannot eliminate that.

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-21 16:20:28 -07:00
Patrick McHardy
18b8afc771 [NETFILTER]: Kill nf_debug
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-21 14:01:57 -07:00
Patrick McHardy
e45b1be8bc [NETFILTER]: Kill lockhelp.h
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-21 14:01:30 -07:00
Jamal Hadi Salim
0d51aa80a9 [IPV6]: V6 route events reported with wrong netlink PID and seq number
Essentially netlink at the moment always reports a pid and sequence of 0
always for v6 route activities. 
To understand the repurcassions of this look at:
http://lists.quagga.net/pipermail/quagga-dev/2005-June/003507.html

While fixing this, i took the liberty to resolve the outstanding issue
of IPV6 routes inserted via ioctls to have the correct pids as well.

This patch tries to behave as close as possible to the v4 routes i.e
maintains whatever PID the socket issuing the command owns as opposed to
the process. That made the patch a little bulky.

I have tested against both netlink derived utility to add/del routes as
well as ioctl derived one. The Quagga folks have tested against quagga.
This fixes the problem and so far hasnt been detected to introduce any
new issues.

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-21 13:51:04 -07:00
Alexey Kuznetsov
18b504e25f [NETLINK]: netlink_callback structure needs 5 args not 4
net/ipv4/tcp_diag.c uses up to ->args[4]

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-21 12:38:48 -07:00
Linus Torvalds
1d345dac1f Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6 2005-06-20 16:00:33 -07:00
Linus Torvalds
fb39588457 Merge rsync://rsync.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2005-06-20 15:58:58 -07:00
Maneesh Soni
988d186de5 [PATCH] sysfs-iattr: add sysfs_setattr
o This adds ->i_op->setattr VFS method for sysfs inodes. The changed
  attribues are saved in the persistent sysfs_dirent structure as a pointer
  to struct iattr. The struct iattr is allocated only for those sysfs_dirent's
  for which default attributes are getting changed. Thanks to Jon Smirl for
  this suggestion.

Signed-off-by: Maneesh Soni <maneesh@in.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:37 -07:00
Yani Ioannou
0a3e7eeabc [PATCH] I2C: add i2c sensor_device_attribute and macros
This patch creates a new header with a potential standard i2c sensor
attribute type (which simply includes an int representing the sensor
number/index) and the associated macros, SENSOR_DEVICE_ATTR to define
a static attribute and to_sensor_dev_attr to get a
sensor_device_attribute reference from an embedded device_attribute
reference.

Signed-off-by: Yani Ioannou <yani.ioannou@gmail.com>
2005-06-20 15:15:36 -07:00
Yani Ioannou
f2d03e1b3f [PATCH] Driver Core: include: update device attribute callbacks
Signed-off-by: Yani Ioannou <yani.ioannou@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:35 -07:00
Yani Ioannou
54b6f35c99 [PATCH] Driver core: change device_attribute callbacks
This patch adds the device_attribute paramerter to the
device_attribute store and show sysfs callback functions, and passes a
reference to the attribute when the callbacks are called.

Signed-off-by: Yani Ioannou <yani.ioannou@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:31 -07:00
Arnd Bergmann
acaefc25d2 [PATCH] libfs: add simple attribute files
Based on the discussion about spufs attributes, this is my suggestion
for a more generic attribute file support that can be used by both
debugfs and spufs.

Simple attribute files behave similarly to sequential files from
a kernel programmers perspective in that a standard set of file
operations is provided and only an open operation needs to
be written that registers file specific get() and set() functions.

These operations are defined as

void foo_set(void *data, u64 val); and
u64 foo_get(void *data);

where data is the inode->u.generic_ip pointer of the file and the
operations just need to make send of that pointer. The infrastructure
makes sure this works correctly with concurrent access and partial
read calls.

A macro named DEFINE_SIMPLE_ATTRIBUTE is provided to further simplify
using the attributes.

This patch already contains the changes for debugfs to use attributes
for its internal file operations.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:30 -07:00
Keiichiro Tokunaga
4b45099b75 [PATCH] Driver core: unregister_node() for hotplug use
This adds a generic function 'unregister_node()'.
It is used to remove objects of a node going away
for hotplug.  All the devices on the node must be
unregistered before calling this function.

Signed-off-by: Keiichiro Tokunaga <tokunaga.keiich@jp.fujitsu.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

diff -puN drivers/base/node.c~numa_hp_base drivers/base/node.c
2005-06-20 15:15:29 -07:00
Patrick Mochel
0d3e5a2e39 [PATCH] Driver Core: fix bk-driver-core kills ppc64
There's no check to see if the device is already bound to a driver, which
could do bad things.  The first thing to go wrong is that it will try to match
a driver with a device already bound to one.  In some cases (it appears with
USB with drivers/usb/core/usb.c::usb_match_id()), some drivers will match a
device based on the class type, so it would be common (especially for HID
devices) to match a device that is already bound.

The fun comes when ->probe() is called, it fails, then
driver_probe_device() does this:

	dev->driver = NULL;

Later on, that pointer could be be dereferenced without checking and cause
hell to break loose.

This problem could be nasty. It's very hardware dependent, since some
devices could have a different set of matching qualifiers than others.

Now, I don't quite see exactly where/how you were getting that crash.
You're dereferencing bad memory, but I'm not sure which pointer was bad
and where it came from, but it could have come from a couple of different
places.

The patch below will hopefully fix it all up for you. It's against
2.6.12-rc2-mm1, and does the following:

- Move logic to driver_probe_device() and comments uncommon returns:
  1 - If device is bound
  0 - If device not bound, and no error
  error - If there was an error.

- Move locking to caller of that function, since we want to lock a
  device for the entire time we're trying to bind it to a driver (to
  prevent against a driver being loaded at the same time).

- Update __device_attach() and __driver_attach() to do that locking.

- Check if device is already bound in __driver_attach()

- Update the converse device_release_driver() so it locks the device
  around all of the operations.

- Mark driver_probe_device() as static and remove export. It's an
  internal function, it should stay that way, and there are no other
  callers. If there is ever a need to export it, we can audit it as
  necessary.

Signed-off-by: Andrew Morton <akpm@osdl.org>
2005-06-20 15:15:27 -07:00
mochel@digitalimplant.org
36239577cf [PATCH] Use a klist for device child lists.
- Use klist iterator in device_for_each_child(), making it safe to use for
  removing devices.
- Remove unused list_to_dev() function.
- Kills all usage of devices_subsys.rwsem.

Signed-off-by: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:23 -07:00
mochel@digitalimplant.org
63c4f204ff [PATCH] Remove struct device::driver_list.
Signed-off-by: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:18 -07:00
mochel@digitalimplant.org
7dc35cdf69 [PATCH] Remove struct device::bus_list.
Signed-off-by: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:18 -07:00
mochel@digitalimplant.org
8b0c250be4 [PATCH] add klist_node_attached() to determine if a node is on a list or not.
Signed-off-by: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

diff -Nru a/include/linux/klist.h b/include/linux/klist.h
2005-06-20 15:15:17 -07:00
mochel@digitalimplant.org
cb85b6f1cc [PATCH] Remove the unused device_find().
Signed-off-by: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:16 -07:00
mochel@digitalimplant.org
94e7b1c5ff [PATCH] Add a klist to struct device_driver for the devices bound to it.
- Use it in driver_for_each_device() instead of the regular list_head and stop using
  the bus's rwsem for protection.
- Use driver_for_each_device() in driver_detach() so we don't deadlock on the
  bus's rwsem.
- Remove ->devices.
- Move klist access and sysfs link access out from under device's semaphore, since
  they're synchronized through other means.

Signed-off-by: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:16 -07:00
mochel@digitalimplant.org
38fdac3cdc [PATCH] Add a klist to struct bus_type for its drivers.
- Use it in bus_for_each_drv().
- Use the klist spinlock instead of the bus rwsem.

Signed-off-by: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:14 -07:00
mochel@digitalimplant.org
465c7a3a3a [PATCH] Add a klist to struct bus_type for its devices.
- Use it for bus_for_each_dev().
- Use the klist spinlock instead of the bus rwsem.

Signed-off-by: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:14 -07:00
mochel@digitalimplant.org
9a19fea436 [PATCH] Add initial implementation of klist helpers.
This klist interface provides a couple of structures that wrap around
struct list_head to provide explicit list "head" (struct klist) and
list "node" (struct klist_node) objects. For struct klist, a spinlock
is included that protects access to the actual list itself. struct
klist_node provides a pointer to the klist that owns it and a kref
reference count that indicates the number of current users of that node
in the list.

The entire point is to provide an interface for iterating over a list
that is safe and allows for modification of the list during the
iteration (e.g. insertion and removal), including modification of the
current node on the list.

It works using a 3rd object type - struct klist_iter - that is declared
and initialized before an iteration. klist_next() is used to acquire the
next element in the list. It returns NULL if there are no more items.
This klist interface provides a couple of structures that wrap around
struct list_head to provide explicit list "head" (struct klist) and
list "node" (struct klist_node) objects. For struct klist, a spinlock
is included that protects access to the actual list itself. struct
klist_node provides a pointer to the klist that owns it and a kref
reference count that indicates the number of current users of that node
in the list.

The entire point is to provide an interface for iterating over a list
that is safe and allows for modification of the list during the
iteration (e.g. insertion and removal), including modification of the
current node on the list.

It works using a 3rd object type - struct klist_iter - that is declared
and initialized before an iteration. klist_next() is used to acquire the
next element in the list. It returns NULL if there are no more items.
Internally, that routine takes the klist's lock, decrements the reference
count of the previous klist_node and increments the count of the next
klist_node. It then drops the lock and returns.

There are primitives for adding and removing nodes to/from a klist.
When deleting, klist_del() will simply decrement the reference count.
Only when the count goes to 0 is the node removed from the list.
klist_remove() will try to delete the node from the list and block
until it is actually removed. This is useful for objects (like devices)
that have been removed from the system and must be freed (but must wait
until all accessors have finished).

Internally, that routine takes the klist's lock, decrements the reference
count of the previous klist_node and increments the count of the next
klist_node. It then drops the lock and returns.

There are primitives for adding and removing nodes to/from a klist.
When deleting, klist_del() will simply decrement the reference count.
Only when the count goes to 0 is the node removed from the list.
klist_remove() will try to delete the node from the list and block
until it is actually removed. This is useful for objects (like devices)
that have been removed from the system and must be freed (but must wait
until all accessors have finished).

Signed-off-by: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

diff -Nru a/include/linux/klist.h b/include/linux/klist.h
2005-06-20 15:15:14 -07:00
mochel@digitalimplant.org
fae3cd0025 [PATCH] Add driver_for_each_device().
Now there's an iterator for accessing each device bound to a driver.

Signed-off-by: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

Index: linux-2.6.12-rc2/drivers/base/driver.c
===================================================================
2005-06-20 15:15:13 -07:00
mochel@digitalimplant.org
af70316af1 [PATCH] Add a semaphore to struct device to synchronize calls to its driver.
This adds a per-device semaphore that is taken before every call from the core to a
driver method. This prevents e.g. simultaneous calls to the ->suspend() or ->resume()
and ->probe() or ->release(), potentially saving a whole lot of headaches.

It also moves us a step closer to removing the bus rwsem, since it protects the fields
in struct device that are modified by the core.

Signed-off-by: Patrick Mochel <mochel@digitalimplant.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:12 -07:00
gregkh@suse.de
cd987d38cc [PATCH] class: remove class_simple code, as no one in the tree is using it anymore.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:11 -07:00
gregkh@suse.de
8561b10f6e [PATCH] USB: move the usb hcd code to use the new class code.
This moves a kref into the main hcd structure, which detaches it from
the class device structure.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:07 -07:00
gregkh@suse.de
1235686f6e [PATCH] INPUT: move to use the new class code, instead of class_simple
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:04 -07:00
gregkh@suse.de
e9ba6365fd [PATCH] CLASS: move a "simple" class logic into the class core.
One step on improving the class api so that it can not be used incorrectly.
This also fixes the module owner issue with the dev files that happened when
the devt logic moved to the class core.

Based on a patch originally written by Kay Sievers <kay.sievers@vrfy.org>

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:04 -07:00
Dmitry Torokhov
d48593bf20 [PATCH] Make attributes names const char *
sysfs: make attributes and attribute_group's names const char *

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:01 -07:00
Dmitry Torokhov
8d790d7408 [PATCH] make driver's name be const char *
Driver core:
  change driver's, bus's, class's and platform device's names
  to be const char * so one can use
            const char *drv_name = "asdfg";
  when initializing structures.
  Also kill couple of whitespaces.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:01 -07:00
Dmitry Torokhov
419cab3fc6 [PATCH] kset_hotplug_ops->name shoudl return const char *
kobject: change name() method in kset_hotplug_ops return const char *
	 since users shoudl not try to modify returned data.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:01 -07:00
Dmitry Torokhov
f3b4f3c6de [PATCH] Make kobject's name be const char *
kobject: make kobject's name const char * since users should not
	 attempt to change it (except by calling kobject_rename).

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:00 -07:00
Dmitry Torokhov
e3a15db241 [PATCH] sysfs_{create|remove}_link should take const char *
sysfs: make sysfs_{create|remove}_link to take const char * name.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-06-20 15:15:00 -07:00
Robert Olsson
246955fe4c [NETLINK]: fib_lookup() via netlink
Below is a more generic patch to do fib_lookup via netlink. For others 
we should say that we discussed this as a way to verify route selection.
It's also possible there are others uses for this.

In short the fist half of struct fib_result_nl is filled in by caller 
and netlink call fills in the other half and returns it.

In case anyone is interested there is a corresponding user app to compare 
the full routing table this was used to test implementation of the LC-trie. 

Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-20 13:36:39 -07:00
Alexey Dobriyan
f6e276ee67 [ATALK]: endian annotations
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-20 13:32:05 -07:00
Alexey Dobriyan
f852640e74 [AX25]: endian-annotate ax25_type_trans()
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-20 13:31:11 -07:00
Herbert Xu
dd87147eed [IPSEC]: Add XFRM_STATE_NOPMTUDISC flag
This patch adds the flag XFRM_STATE_NOPMTUDISC for xfrm states.  It is
similar to the nopmtudisc on IPIP/GRE tunnels.  It only has an effect
on IPv4 tunnel mode states.  For these states, it will ensure that the
DF flag is always cleared.

This is primarily useful to work around ICMP blackholes.

In future this flag could also allow a larger MTU to be set within the
tunnel just like IPIP/GRE tunnels.  This could be useful for short haul
tunnels where temporary fragmentation outside the tunnel is desired over
smaller fragments inside the tunnel.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: James Morris <jmorris@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-20 13:21:43 -07:00
Herbert Xu
d094cd83c0 [IPSEC]: Add xfrm_state_afinfo->init_flags
This patch adds the xfrm_state_afinfo->init_flags hook which allows
each address family to perform any common initialisation that does
not require a corresponding destructor call.

It will be used subsequently to set the XFRM_STATE_NOPMTUDISC flag
in IPv4.

It also fixes up the error codes returned by xfrm_init_state.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: James Morris <jmorris@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-20 13:19:41 -07:00
Herbert Xu
72cb6962a9 [IPSEC]: Add xfrm_init_state
This patch adds xfrm_init_state which is simply a wrapper that calls
xfrm_get_type and subsequently x->type->init_state.  It also gets rid
of the unused args argument.

Abstracting it out allows us to add common initialisation code, e.g.,
to set family-specific flags.

The add_time setting in xfrm_user.c was deleted because it's already
set by xfrm_state_alloc.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: James Morris <jmorris@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-20 13:18:08 -07:00
Frank Filz
3f7a87d2fa [SCTP] sctp_connectx() API support
Implements sctp_connectx() as defined in the SCTP sockets API draft by
tunneling the request through a setsockopt().

Signed-off-by: Frank Filz <ffilzlnx@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2005-06-20 13:14:57 -07:00
Lennert Buytenhek
e4fe19819e [PATCH] ARM: 2701/1: free up ixp2000 timer 4 for the watchdog
Patch from Lennert Buytenhek

The IXP2000 has four timers, but if we're on an A-step IXP2800, timer
2 and 3 don't work.  We need two timers for timekeeping (one for the
timer interrupt and one for tracking missed jiffies), so on early
IXP2800s we have no other choice but to use timer 1 and 4 for that,
but on all other IXP2000s we'd rather leave timer 4 free since that's
the only timer we can use for the watchdog.
So, on buggy IXP2000s (i.e. the A-step IXP2800) we use timer 4 for
tracking missed jiffies, and on all all non-buggy IXP2000s (i.e.
everything but the A-step IXP2800) we use timer 2.
On a pre-production IXP2800, this patch should print these messages
on boot:
	Enabling IXP2800 erratum #25 workaround
	Unable to use IXP2000 watchdog due to IXP2800 erratum #25
On any non-buggy IXP2800 (as well as on IXP2400s) you shouldn't see
anything at all, and the watchdog should be usable again.

Signed-off-by: Lennert Buytenhek
Signed-off-by: Deepak Saxena
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-20 18:51:07 +01:00
Catalin Marinas
c0da085ad2 [PATCH] ARM: 2693/1: Add PCI support for Versatile/PB
Patch from Catalin Marinas

This patch adds PCI support for the Versatile PB926 platform.

Signed-off-by: Colin King
Signed-off-by: Catalin Marinas
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-20 18:51:06 +01:00
Bellido Nicolas
038c5b6025 [PATCH] ARM: 2686/2: AAEC-2000 Core support
Patch from Bellido Nicolas

Core support for AAEC-2000 based platforms.
This is an updated version of the previous patch, and takes
into account Russell's comments.
AAED-2000 default configuration will follow as soon
as some problems with the bootloader are sorted out...

Signed-off-by: Nicolas Bellido
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-20 18:51:05 +01:00
Russell King
09f0551d20 [PATCH] ARM: Add iomap support for ARM
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-20 18:44:37 +01:00